DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH v1 0/2]  bbdev: add device info on queue topology
@ 2022-03-09  0:22 Nicolas Chautru
  2022-03-09  0:22 ` [PATCH v1 1/2] " Nicolas Chautru
                   ` (2 more replies)
  0 siblings, 3 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-03-09  0:22 UTC (permalink / raw)
  To: dev, gakhil; +Cc: trix, hemant.agrawal, mingshan.zhang, Nicolas Chautru

Addressing an historical concern that the device info struct only
imperfectly captured what queues are available on the device
(number of operation and priority). This ended up being an iterative
process for application to find each queue could be configured.

ie. the gap was captured as technical debt previously  in comments
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
 * priorities
 */

This is now being exposed explictly based on the what the device actually
supports using the existing info_get api

Note: did not update the release yet notes since it will go in next release

Nicolas Chautru (2):
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation

 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  8 +++++-
 lib/bbdev/rte_bbdev.h                              |  4 +++
 5 files changed, 44 insertions(+), 13 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v1 1/2] bbdev: add device info on queue topology
  2022-03-09  0:22 [PATCH v1 0/2] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-03-09  0:22 ` Nicolas Chautru
  2022-03-09  1:28   ` Stephen Hemminger
  2022-03-09  0:22 ` [PATCH v1 2/2] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
  2022-06-06 16:15 ` [PATCH v1 0/2] bbdev: add device info on queue topology Chautru, Nicolas
  2 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-03-09  0:22 UTC (permalink / raw)
  To: dev, gakhil; +Cc: trix, hemant.agrawal, mingshan.zhang, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..10c06b6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -274,6 +274,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_COUNT];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_COUNT];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v1 2/2] drivers/baseband: update PMDs to expose queue per operation
  2022-03-09  0:22 [PATCH v1 0/2] bbdev: add device info on queue topology Nicolas Chautru
  2022-03-09  0:22 ` [PATCH v1 1/2] " Nicolas Chautru
@ 2022-03-09  0:22 ` Nicolas Chautru
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
  2022-06-06 16:15 ` [PATCH v1 0/2] bbdev: add device info on queue topology Chautru, Nicolas
  2 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-03-09  0:22 UTC (permalink / raw)
  To: dev, gakhil; +Cc: trix, hemant.agrawal, mingshan.zhang, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  8 +++++-
 4 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..49cc9d2 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1061,19 +1062,23 @@
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_NONE; i < RTE_BBDEV_OP_TYPE_COUNT; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 15d23d6..f92b59a 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -382,6 +382,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..56d1baf 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -654,6 +654,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..69f32ee 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,7 +100,13 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
-
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v1 1/2] bbdev: add device info on queue topology
  2022-03-09  0:22 ` [PATCH v1 1/2] " Nicolas Chautru
@ 2022-03-09  1:28   ` Stephen Hemminger
  0 siblings, 0 replies; 174+ messages in thread
From: Stephen Hemminger @ 2022-03-09  1:28 UTC (permalink / raw)
  To: Nicolas Chautru; +Cc: dev, gakhil, trix, hemant.agrawal, mingshan.zhang

On Tue,  8 Mar 2022 16:22:34 -0800
Nicolas Chautru <nicolas.chautru@intel.com> wrote:

> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index b88c881..10c06b6 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -274,6 +274,10 @@ struct rte_bbdev_driver_info {
>  
>  	/** Maximum number of queues supported by the device */
>  	unsigned int max_num_queues;
> +	/** Maximum number of queues supported per operation type */
> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_COUNT];
> +	/** Priority level supported per operation type */
> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_COUNT];
>  	/** Queue size limit (queue size must also be power of 2) */
>  	uint32_t queue_size_lim;
>  	/** Set if device off-loads operation to hardware  */

This breaks ABI of rte_bbdev_info_get.

^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v1 0/2]  bbdev: add device info on queue topology
  2022-03-09  0:22 [PATCH v1 0/2] bbdev: add device info on queue topology Nicolas Chautru
  2022-03-09  0:22 ` [PATCH v1 1/2] " Nicolas Chautru
  2022-03-09  0:22 ` [PATCH v1 2/2] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-06-06 16:15 ` Chautru, Nicolas
  2 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-06-06 16:15 UTC (permalink / raw)
  To: dev, gakhil; +Cc: trix, hemant.agrawal, Zhang, Mingshan

Hi Hemant, 
Could you review this serie, this is targeting 22.11 release. Let me know if unclear. 
Thanks
Nic

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Tuesday, March 8, 2022 4:23 PM
> To: dev@dpdk.org; gakhil@marvell.com
> Cc: trix@redhat.com; hemant.agrawal@nxp.com; Zhang, Mingshan
> <mingshan.zhang@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [PATCH v1 0/2] bbdev: add device info on queue topology
> 
> Addressing an historical concern that the device info struct only imperfectly
> captured what queues are available on the device (number of operation and
> priority). This ended up being an iterative process for application to find each
> queue could be configured.
> 
> ie. the gap was captured as technical debt previously  in comments
> /* This isn't ideal because it reports the maximum number of queues but
>  * does not provide info on how many can be uplink/downlink or different
>  * priorities
>  */
> 
> This is now being exposed explictly based on the what the device actually
> supports using the existing info_get api
> 
> Note: did not update the release yet notes since it will go in next release
> 
> Nicolas Chautru (2):
>   bbdev: add device info on queue topology
>   drivers/baseband: update PMDs to expose queue per operation
> 
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++------
> ---
>  drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
>  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
>  drivers/baseband/la12xx/bbdev_la12xx.c             |  8 +++++-
>  lib/bbdev/rte_bbdev.h                              |  4 +++
>  5 files changed, 44 insertions(+), 13 deletions(-)
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 0/5] bbdev changes for 22.11
  2022-03-09  0:22 ` [PATCH v1 2/2] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-06-17 18:37   ` Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 1/5] bbdev: allow operation type enum for growth Nicolas Chautru
                       ` (4 more replies)
  0 siblings, 5 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Hi,

Agregating together in a single serie a number of bbdev api changes previously submitted over the last few months and all targeted for 22.11 (4 different series detailed below). Related deprecation notice being pushed in 22.07 in parallel. 
* bbdev: add device status info
* bbdev: add new operation for FFT processing
* bbdev: add device info on queue topology
* bbdev: allow operation type enum for growth

v2: Update to the RTE_BBDEV_COUNT removal based on feedback from Thomas/Stephen : rejecting out of range op type and adjusting the new name for the padded maximum value used for fixed size arrays. 

---

Previous cover letters agregated below:

* bbdev: add device status info
https://patches.dpdk.org/project/dpdk/list/?series=23367

The updated structure will allow PMDs to expose through info_get what be may the status of the underlying accelerator, notably in case an HW error event having happened.

* bbdev: add new operation for FFT processing
https://patches.dpdk.org/project/dpdk/list/?series=22111

This contribution adds a new operation type to the existing ones already supported by the bbdev PMDs.
This set of operation is FFT-based processing for 5GNR baseband processing acceleration. This operates in the same lookaside fashion as other existing bbdev operation with a dedicated set of capabilities and parameters (marked as experimental).

I plan to also include a new PMD supporting this operation (and most of the related capabilities) in the next couple of months (either in 22.06 or 22.09) as well as extending the related bbdev-test.

* bbdev: add device info on queue topology
https://patches.dpdk.org/project/dpdk/list/?series=22076

Addressing an historical concern that the device info struct only
imperfectly captured what queues are available on the device
(number of operation and priority). This ended up being an iterative
process for application to find each queue could be configured.

ie. the gap was captured as technical debt previously  in comments
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
 * priorities
 */

This is now being exposed explictly based on the what the device actually
supports using the existing info_get api

* bbdev: allow operation type enum for growth
https://patches.dpdk.org/project/dpdk/list/?series=23509

This is related to the general intent to remove using MAX value for enums. There is consensus that we should avoid this for a while notably for future-proofed ABI concerns https://patches.dpdk.org/project/dpdk/patch/20200130142003.2645765-1-ferruh.yigit@intel.com/.
But still there is arguably not yet an explicit best recommendation to handle this especially when we actualy need to expose array whose index is such an enum.
As a specific example here I am refering to RTE_BBDEV_OP_TYPE_COUNT in enum rte_bbdev_op_type which is being extended for new operation type being support in bbdev (such as https://patches.dpdk.org/project/dpdk/patch/1646956157-245769-2-git-send-email-nicolas.chautru@intel.com/ adding new FFT operation)

There is also the intent to be able to expose information for each operation type through the bbdev api such as dynamically configured queues information per such operation type https://patches.dpdk.org/project/dpdk/patch/1646785355-168133-2-git-send-email-nicolas.chautru@intel.com/

Basically we are considering best way to accomodate for this, notably based on discussions with Ray Kinsella and Bruce Richardson, to handle such a case moving forward: specifically for the example with RTE_BBDEV_OP_TYPE_COUNT and also more generally.

One possible option is captured in that patchset and is basically based on the simple principle to allow for growth and prevent ABI breakage. Ie. the last value of the enum is set with a higher value than required so that to allow insertion of new enum outside of the major ABI versions.
In that case the RTE_BBDEV_OP_TYPE_COUNT is still present and can be exposed and used while still allowing for addition thanks to the implicit padding-like room. As an alternate variant, instead of using that last enum value, that extended size could be exposed as an #define outside of the enum but would be fundamentally the same (public).

Another option would be to avoid array alltogether and use each time this a new dedicated API function (operation type enum being an input argument instead of an index to an array in an existing structure so that to get access to structure related to a given operation type enum) but that is arguably not well scalable within DPDK to use such a scheme for each enums and keep an uncluttered and clean API. In that very example that would be very odd indeed not to get this simply from info_get().

Some pros and cons, arguably the simple option in that patchset is a valid compromise option and a step in the right direction but we would like to know your view wrt best recommendation, or any other thought. 





Nicolas Chautru (5):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   4 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 ++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  37 ++++-
 lib/bbdev/rte_bbdev.h                              | 115 +++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 151 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  10 ++
 14 files changed, 499 insertions(+), 23 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 1/5] bbdev: allow operation type enum for growth
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
@ 2022-06-17 18:37     ` Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 2/5] bbdev: add device status info Nicolas Chautru
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 9 +++++----
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..22bd894 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1122,10 +1124,9 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
-		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 2/5] bbdev: add device status info
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 1/5] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-06-17 18:37     ` Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 3/5] bbdev: add device info on queue topology Nicolas Chautru
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 21 +++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  6 ++++
 9 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..c4a164e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..fa74a14 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..92c0624 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..310a1c4 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..bcdf269 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..00d108a 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 22bd894..14d7456 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1132,3 +1132,24 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < RTE_BBDEV_DEV_COUNT)
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..29c3cd4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing to report */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and unconfigure state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfig queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+	RTE_BBDEV_DEV_COUNT
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..9ac3643 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,9 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 3/5] bbdev: add device info on queue topology
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 1/5] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 2/5] bbdev: add device status info Nicolas Chautru
@ 2022-06-17 18:37     ` Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 4/5] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
  4 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 29c3cd4..9fd7fb0 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 4/5] drivers/baseband: update PMDs to expose queue per operation
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2022-06-17 18:37     ` [PATCH v2 3/5] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-06-17 18:37     ` Nicolas Chautru
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
  4 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index c4a164e..3836ae1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index fa74a14..bf27630 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 92c0624..5560f73 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 310a1c4..8c847ed 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index 00d108a..387307f 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOSTATUS;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v2 5/5] bbdev: add new operation for FFT processing
  2022-06-17 18:37   ` [PATCH v2 0/5] bbdev changes for 22.11 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2022-06-17 18:37     ` [PATCH v2 4/5] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-06-17 18:37     ` Nicolas Chautru
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
                         ` (9 more replies)
  4 siblings, 10 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-17 18:37 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by:  Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  11 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 369 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..4a055b5 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and depadding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (cabability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measument of the DFT output       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++----------------------+--------------------------------------------------------------+
+|Parameter             |Description                                                   |
++======================+==============================================================+
+|base_input            |input data                                                    |
++----------------------+--------------------------------------------------------------+
+|base_output           |output data                                                   |
++----------------------+--------------------------------------------------------------+
+|power_meas_output     |optional output data with power measurement on DFT output     |
++----------------------+--------------------------------------------------------------+
+|op_flags              |bitmask of all active operation capabilities                  |
++----------------------+--------------------------------------------------------------+
+|input_sequence_size   |size of the input sequence in 32-bits points per antenna      |
++----------------------+--------------------------------------------------------------+
+|input_leading_padding |number of points padded at the start of input data            |
++----------------------+--------------------------------------------------------------+
+|output_sequence_size  |size of the output sequence per antenna and cyclic shift      |
++----------------------+--------------------------------------------------------------+
+|output_depadding      |number of points depadded at the start of output data         |
++----------------------+--------------------------------------------------------------+
+|window_index          |optional windowing profile index used for each cyclic shift   |
++----------------------+--------------------------------------------------------------+
+|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for index 0) |
++----------------------+--------------------------------------------------------------+
+|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)             |
++----------------------+--------------------------------------------------------------+
+|idft_log2             |iDFT size as a log2                                           |
++----------------------+--------------------------------------------------------------+
+|dft_log2              |DFT size as a log2                                            |
++----------------------+--------------------------------------------------------------+
+|cs_time_adjustment    |adjustment of time position of all the cyclic shift output    |
++----------------------+--------------------------------------------------------------+
+|idft_shift            |shift down of signal level post iDFT                          |
++----------------------+--------------------------------------------------------------+
+|dft_shift             |shift down of signal level post DFT                           |
++----------------------+--------------------------------------------------------------+
+|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++----------------------+--------------------------------------------------------------+
+|power_shift           |shift down of level of power measurement when enabled         |
++----------------------+--------------------------------------------------------------+
+|fp16_exp_adjust       |value added to FP16 exponent at conversion from INT16         |
++----------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 14d7456..e90aeaa 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -852,6 +852,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -875,6 +878,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1124,6 +1131,8 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
+		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9fd7fb0..20cd2d9 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..3e46f1d 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 9ac3643..efae50b 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,4 +44,8 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 0/7]  bbdev changes for 22.11
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-06-28  1:35       ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
                           ` (6 more replies)
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
                         ` (8 subsequent siblings)
  9 siblings, 7 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

v3: update to device status info to also use padded size for the related array.
Adding also 2 additionals commits to allow the API struc to expose more information related to queues
corner cases/warning as well as an optional rw lock.
Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack early is possible
to get this applied earlier and due to time off this summer.
Thanks
Nic

-- 

Hi,

Agregating together in a single serie a number of bbdev api changes previously submitted over the last few months and all targeted for 22.11 (4 different series detailed below). Related deprecation notice being pushed in 22.07 in parallel. 
* bbdev: add device status info
* bbdev: add new operation for FFT processing
* bbdev: add device info on queue topology
* bbdev: allow operation type enum for growth

v2: Update to the RTE_BBDEV_COUNT removal based on feedback from Thomas/Stephen : rejecting out of range op type and adjusting the new name for the padded maximum value used for fixed size arrays. 

---

Previous cover letters agregated below:

* bbdev: add device status info
https://patches.dpdk.org/project/dpdk/list/?series=23367

The updated structure will allow PMDs to expose through info_get what be may the status of the underlying accelerator, notably in case an HW error event having happened.

* bbdev: add new operation for FFT processing
https://patches.dpdk.org/project/dpdk/list/?series=22111

This contribution adds a new operation type to the existing ones already supported by the bbdev PMDs.
This set of operation is FFT-based processing for 5GNR baseband processing acceleration. This operates in the same lookaside fashion as other existing bbdev operation with a dedicated set of capabilities and parameters (marked as experimental).

I plan to also include a new PMD supporting this operation (and most of the related capabilities) in the next couple of months (either in 22.06 or 22.09) as well as extending the related bbdev-test.

* bbdev: add device info on queue topology
https://patches.dpdk.org/project/dpdk/list/?series=22076

Addressing an historical concern that the device info struct only
imperfectly captured what queues are available on the device
(number of operation and priority). This ended up being an iterative
process for application to find each queue could be configured.

ie. the gap was captured as technical debt previously  in comments
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
 * priorities
 */

This is now being exposed explictly based on the what the device actually
supports using the existing info_get api

* bbdev: allow operation type enum for growth
https://patches.dpdk.org/project/dpdk/list/?series=23509

This is related to the general intent to remove using MAX value for enums. There is consensus that we should avoid this for a while notably for future-proofed ABI concerns https://patches.dpdk.org/project/dpdk/patch/20200130142003.2645765-1-ferruh.yigit@intel.com/.
But still there is arguably not yet an explicit best recommendation to handle this especially when we actualy need to expose array whose index is such an enum.
As a specific example here I am refering to RTE_BBDEV_OP_TYPE_COUNT in enum rte_bbdev_op_type which is being extended for new operation type being support in bbdev (such as https://patches.dpdk.org/project/dpdk/patch/1646956157-245769-2-git-send-email-nicolas.chautru@intel.com/ adding new FFT operation)

There is also the intent to be able to expose information for each operation type through the bbdev api such as dynamically configured queues information per such operation type https://patches.dpdk.org/project/dpdk/patch/1646785355-168133-2-git-send-email-nicolas.chautru@intel.com/

Basically we are considering best way to accomodate for this, notably based on discussions with Ray Kinsella and Bruce Richardson, to handle such a case moving forward: specifically for the example with RTE_BBDEV_OP_TYPE_COUNT and also more generally.

One possible option is captured in that patchset and is basically based on the simple principle to allow for growth and prevent ABI breakage. Ie. the last value of the enum is set with a higher value than required so that to allow insertion of new enum outside of the major ABI versions.
In that case the RTE_BBDEV_OP_TYPE_COUNT is still present and can be exposed and used while still allowing for addition thanks to the implicit padding-like room. As an alternate variant, instead of using that last enum value, that extended size could be exposed as an #define outside of the enum but would be fundamentally the same (public).

Another option would be to avoid array alltogether and use each time this a new dedicated API function (operation type enum being an input argument instead of an index to an array in an existing structure so that to get access to structure related to a given operation type enum) but that is arguably not well scalable within DPDK to use such a scheme for each enums and keep an uncluttered and clean API. In that very example that would be very odd indeed not to get this simply from info_get().

Some pros and cons, arguably the simple option in that patchset is a valid compromise option and a step in the right direction but we would like to know your view wrt best recommendation, or any other thought. 



Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: add a lock option for enqueue/dequeue operation

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   6 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 ++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  42 +++++-
 lib/bbdev/rte_bbdev.h                              | 136 ++++++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 151 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  10 ++
 14 files changed, 527 insertions(+), 23 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 1/7] bbdev: allow operation type enum for growth
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 2/7] bbdev: add device status info Nicolas Chautru
                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 9 +++++----
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..22bd894 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1122,10 +1124,9 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
-		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 2/7] bbdev: add device status info
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 3/7] bbdev: add device info on queue topology Nicolas Chautru
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  6 ++++
 9 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..17ba798 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..57b12af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..2a330c4 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..c1f88c6 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..94a1976 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..dbc5524 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 22bd894..555bda9 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -25,6 +25,8 @@
 
 /* Number of supported operation types */
 #define BBDEV_OP_TYPE_COUNT 5
+/* Number of supported device status */
+#define BBDEV_DEV_STATUS_COUNT 9
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -1132,3 +1134,25 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_NOT_SUPPORTED",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < BBDEV_DEV_STATUS_COUNT)
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..9b1ffa4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
+	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..9ac3643 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,9 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 3/7] bbdev: add device info on queue topology
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9b1ffa4..ac941d6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (2 preceding siblings ...)
  2022-06-28  1:35         ` [PATCH v3 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 17ba798..d568d0d 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 57b12af..b4982af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 2a330c4..dc7f479 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index c1f88c6..e99ea9a 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index dbc5524..647e706 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 5/7] bbdev: add new operation for FFT processing
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (3 preceding siblings ...)
  2022-06-28  1:35         ` [PATCH v3 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 6/7] bbdev: add queue related warning and status information Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 7/7] bbdev: add a lock option for enqueue/dequeue operation Nicolas Chautru
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  11 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 369 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..4a055b5 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and depadding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (cabability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measument of the DFT output       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++----------------------+--------------------------------------------------------------+
+|Parameter             |Description                                                   |
++======================+==============================================================+
+|base_input            |input data                                                    |
++----------------------+--------------------------------------------------------------+
+|base_output           |output data                                                   |
++----------------------+--------------------------------------------------------------+
+|power_meas_output     |optional output data with power measurement on DFT output     |
++----------------------+--------------------------------------------------------------+
+|op_flags              |bitmask of all active operation capabilities                  |
++----------------------+--------------------------------------------------------------+
+|input_sequence_size   |size of the input sequence in 32-bits points per antenna      |
++----------------------+--------------------------------------------------------------+
+|input_leading_padding |number of points padded at the start of input data            |
++----------------------+--------------------------------------------------------------+
+|output_sequence_size  |size of the output sequence per antenna and cyclic shift      |
++----------------------+--------------------------------------------------------------+
+|output_depadding      |number of points depadded at the start of output data         |
++----------------------+--------------------------------------------------------------+
+|window_index          |optional windowing profile index used for each cyclic shift   |
++----------------------+--------------------------------------------------------------+
+|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for index 0) |
++----------------------+--------------------------------------------------------------+
+|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)             |
++----------------------+--------------------------------------------------------------+
+|idft_log2             |iDFT size as a log2                                           |
++----------------------+--------------------------------------------------------------+
+|dft_log2              |DFT size as a log2                                            |
++----------------------+--------------------------------------------------------------+
+|cs_time_adjustment    |adjustment of time position of all the cyclic shift output    |
++----------------------+--------------------------------------------------------------+
+|idft_shift            |shift down of signal level post iDFT                          |
++----------------------+--------------------------------------------------------------+
+|dft_shift             |shift down of signal level post DFT                           |
++----------------------+--------------------------------------------------------------+
+|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++----------------------+--------------------------------------------------------------+
+|power_shift           |shift down of level of power measurement when enabled         |
++----------------------+--------------------------------------------------------------+
+|fp16_exp_adjust       |value added to FP16 exponent at conversion from INT16         |
++----------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 555bda9..28b105d 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 /* Number of supported device status */
 #define BBDEV_DEV_STATUS_COUNT 9
 
@@ -854,6 +854,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -877,6 +880,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1126,6 +1133,8 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
+		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ac941d6..ed528b8 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..3e46f1d 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 9ac3643..efae50b 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,4 +44,8 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 6/7] bbdev: add queue related warning and status information
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (4 preceding siblings ...)
  2022-06-28  1:35         ` [PATCH v3 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  2022-06-28  1:35         ` [PATCH v3 7/7] bbdev: add a lock option for enqueue/dequeue operation Nicolas Chautru
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

This allows to expose more information with regards to any
queue related failure and warning which cannot be supported
in existing API.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c |  2 ++
 lib/bbdev/rte_bbdev.c            |  2 ++
 lib/bbdev/rte_bbdev.h            | 19 +++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 1abda2d..653b21f 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
 	stats->dequeued_count = q_stats->dequeued_count;
 	stats->enqueue_err_count = q_stats->enqueue_err_count;
 	stats->dequeue_err_count = q_stats->dequeue_err_count;
+	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
+	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
 	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
 	return 0;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 28b105d..fb59b51 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -723,6 +723,8 @@ struct rte_bbdev *
 		stats->dequeued_count += q_stats->dequeued_count;
 		stats->enqueue_err_count += q_stats->enqueue_err_count;
 		stats->dequeue_err_count += q_stats->dequeue_err_count;
+		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
 	}
 	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ed528b8..c625a14 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
 /**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consummed all requested operations
+ * In case of multiple reasons the latter superdes a previous one
+ */
+enum rte_bbdev_enqueue_status {
+	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
+	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
+	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
+	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
+	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
+};
+
+/**
  * Flags indicate the status of the device
  */
 enum rte_bbdev_device_status {
@@ -246,6 +259,12 @@ struct rte_bbdev_stats {
 	uint64_t enqueue_err_count;
 	/** Total error count on operations dequeued */
 	uint64_t dequeue_err_count;
+	/** Total warning count on operations enqueued */
+	uint64_t enqueue_warn_count;
+	/** Total warning count on operations dequeued */
+	uint64_t dequeue_warn_count;
+	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
+	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
 	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
 	 *  the enqueue request to its internal queues.
 	 *  - For a HW device this is the cycles consumed in MMIO write
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v3 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (5 preceding siblings ...)
  2022-06-28  1:35         ` [PATCH v3 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-06-28  1:35         ` Nicolas Chautru
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-06-28  1:35 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Locking is not explictly required but can be valuable
in case the application cannot guarantee to be thread-safe,
or specifically is at risk of using the same queue from multiple threads.
This is an option for PMD to use this.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index c625a14..e0aa52e 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -458,6 +458,8 @@ struct rte_bbdev_data {
 	int socket_id;  /**< NUMA socket that device is on */
 	bool started;  /**< Device run-time state */
 	uint16_t process_cnt;  /** Counter of processes using the device */
+	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
+	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */
 };
 
 /* Forward declarations */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 0/7] bbdev changes for 22.11
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-07-06  0:23       ` Nicolas Chautru
  2022-07-06  0:23         ` [PATCH v4 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
                           ` (6 more replies)
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                         ` (7 subsequent siblings)
  9 siblings, 7 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

v4: update to the last 2 commits to include function to print the queue status and a fix to the rte_lock within the wrong structure
v3: update to device status info to also use padded size for the related array.
Adding also 2 additionals commits to allow the API struc to expose more information related to queues corner cases/warning as well as an optional rw lock.
Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack early is possible to get this applied earlier and due to time off this summer.
Thanks
Nic

-- 

Hi,

Agregating together in a single serie a number of bbdev api changes previously submitted over the last few months and all targeted for 22.11 (4 different series detailed below). Related deprecation notice being pushed in 22.07 in parallel. 
* bbdev: add device status info
* bbdev: add new operation for FFT processing
* bbdev: add device info on queue topology
* bbdev: allow operation type enum for growth

v2: Update to the RTE_BBDEV_COUNT removal based on feedback from Thomas/Stephen : rejecting out of range op type and adjusting the new name for the padded maximum value used for fixed size arrays. 

---

Previous cover letters agregated below:

* bbdev: add device status info
https://patches.dpdk.org/project/dpdk/list/?series=23367

The updated structure will allow PMDs to expose through info_get what be may the status of the underlying accelerator, notably in case an HW error event having happened.

* bbdev: add new operation for FFT processing
https://patches.dpdk.org/project/dpdk/list/?series=22111

This contribution adds a new operation type to the existing ones already supported by the bbdev PMDs.
This set of operation is FFT-based processing for 5GNR baseband processing acceleration. This operates in the same lookaside fashion as other existing bbdev operation with a dedicated set of capabilities and parameters (marked as experimental).

I plan to also include a new PMD supporting this operation (and most of the related capabilities) in the next couple of months (either in 22.06 or 22.09) as well as extending the related bbdev-test.

* bbdev: add device info on queue topology
https://patches.dpdk.org/project/dpdk/list/?series=22076

Addressing an historical concern that the device info struct only imperfectly captured what queues are available on the device (number of operation and priority). This ended up being an iterative process for application to find each queue could be configured.

ie. the gap was captured as technical debt previously  in comments
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
 * priorities
 */

This is now being exposed explictly based on the what the device actually supports using the existing info_get api

* bbdev: allow operation type enum for growth
https://patches.dpdk.org/project/dpdk/list/?series=23509

This is related to the general intent to remove using MAX value for enums. There is consensus that we should avoid this for a while notably for future-proofed ABI concerns https://patches.dpdk.org/project/dpdk/patch/20200130142003.2645765-1-ferruh.yigit@intel.com/.
But still there is arguably not yet an explicit best recommendation to handle this especially when we actualy need to expose array whose index is such an enum.
As a specific example here I am refering to RTE_BBDEV_OP_TYPE_COUNT in enum rte_bbdev_op_type which is being extended for new operation type being support in bbdev (such as https://patches.dpdk.org/project/dpdk/patch/1646956157-245769-2-git-send-email-nicolas.chautru@intel.com/ adding new FFT operation)

There is also the intent to be able to expose information for each operation type through the bbdev api such as dynamically configured queues information per such operation type https://patches.dpdk.org/project/dpdk/patch/1646785355-168133-2-git-send-email-nicolas.chautru@intel.com/

Basically we are considering best way to accomodate for this, notably based on discussions with Ray Kinsella and Bruce Richardson, to handle such a case moving forward: specifically for the example with RTE_BBDEV_OP_TYPE_COUNT and also more generally.

One possible option is captured in that patchset and is basically based on the simple principle to allow for growth and prevent ABI breakage. Ie. the last value of the enum is set with a higher value than required so that to allow insertion of new enum outside of the major ABI versions.
In that case the RTE_BBDEV_OP_TYPE_COUNT is still present and can be exposed and used while still allowing for addition thanks to the implicit padding-like room. As an alternate variant, instead of using that last enum value, that extended size could be exposed as an #define outside of the enum but would be fundamentally the same (public).

Another option would be to avoid array alltogether and use each time this a new dedicated API function (operation type enum being an input argument instead of an index to an array in an existing structure so that to get access to structure related to a given operation type enum) but that is arguably not well scalable within DPDK to use such a scheme for each enums and keep an uncluttered and clean API. In that very example that would be very odd indeed not to get this simply from info_get().

Some pros and cons, arguably the simple option in that patchset is a valid compromise option and a step in the right direction but we would like to know your view wrt best recommendation, or any other thought. 


Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: add a lock option for enqueue/dequeue operation

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   6 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 ++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  61 ++++++++-
 lib/bbdev/rte_bbdev.h                              | 151 ++++++++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 151 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  11 ++
 14 files changed, 562 insertions(+), 23 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 1/7] bbdev: allow operation type enum for growth
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 12:50           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 2/7] bbdev: add device status info Nicolas Chautru
                           ` (5 subsequent siblings)
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 9 +++++----
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..22bd894 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1122,10 +1124,9 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
-		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 2/7] bbdev: add device status info
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-07-06  0:23         ` [PATCH v4 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 15:38           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 3/7] bbdev: add device info on queue topology Nicolas Chautru
                           ` (4 subsequent siblings)
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  6 ++++
 9 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..17ba798 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..57b12af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..2a330c4 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..c1f88c6 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..94a1976 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..dbc5524 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 22bd894..555bda9 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -25,6 +25,8 @@
 
 /* Number of supported operation types */
 #define BBDEV_OP_TYPE_COUNT 5
+/* Number of supported device status */
+#define BBDEV_DEV_STATUS_COUNT 9
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -1132,3 +1134,25 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_NOT_SUPPORTED",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < BBDEV_DEV_STATUS_COUNT)
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..9b1ffa4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
+	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..9ac3643 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,9 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-07-06  0:23         ` [PATCH v4 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-07-06  0:23         ` [PATCH v4 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 16:06           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
                           ` (3 subsequent siblings)
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9b1ffa4..ac941d6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (2 preceding siblings ...)
  2022-07-06  0:23         ` [PATCH v4 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 16:15           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
                           ` (2 subsequent siblings)
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 17ba798..d568d0d 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 57b12af..b4982af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 2a330c4..dc7f479 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index c1f88c6..e99ea9a 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index dbc5524..647e706 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (3 preceding siblings ...)
  2022-07-06  0:23         ` [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 18:47           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 6/7] bbdev: add queue related warning and status information Nicolas Chautru
  2022-07-06  0:23         ` [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation Nicolas Chautru
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=a, Size: 22013 bytes --]

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  11 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 369 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..4a055b5 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and depadding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (cabability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measument of the DFT output       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++----------------------+--------------------------------------------------------------+
+|Parameter             |Description                                                   |
++======================+==============================================================+
+|base_input            |input data                                                    |
++----------------------+--------------------------------------------------------------+
+|base_output           |output data                                                   |
++----------------------+--------------------------------------------------------------+
+|power_meas_output     |optional output data with power measurement on DFT output     |
++----------------------+--------------------------------------------------------------+
+|op_flags              |bitmask of all active operation capabilities                  |
++----------------------+--------------------------------------------------------------+
+|input_sequence_size   |size of the input sequence in 32-bits points per antenna      |
++----------------------+--------------------------------------------------------------+
+|input_leading_padding |number of points padded at the start of input data            |
++----------------------+--------------------------------------------------------------+
+|output_sequence_size  |size of the output sequence per antenna and cyclic shift      |
++----------------------+--------------------------------------------------------------+
+|output_depadding      |number of points depadded at the start of output data         |
++----------------------+--------------------------------------------------------------+
+|window_index          |optional windowing profile index used for each cyclic shift   |
++----------------------+--------------------------------------------------------------+
+|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for index 0) |
++----------------------+--------------------------------------------------------------+
+|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)             |
++----------------------+--------------------------------------------------------------+
+|idft_log2             |iDFT size as a log2                                           |
++----------------------+--------------------------------------------------------------+
+|dft_log2              |DFT size as a log2                                            |
++----------------------+--------------------------------------------------------------+
+|cs_time_adjustment    |adjustment of time position of all the cyclic shift output    |
++----------------------+--------------------------------------------------------------+
+|idft_shift            |shift down of signal level post iDFT                          |
++----------------------+--------------------------------------------------------------+
+|dft_shift             |shift down of signal level post DFT                           |
++----------------------+--------------------------------------------------------------+
+|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++----------------------+--------------------------------------------------------------+
+|power_shift           |shift down of level of power measurement when enabled         |
++----------------------+--------------------------------------------------------------+
+|fp16_exp_adjust       |value added to FP16 exponent at conversion from INT16         |
++----------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 555bda9..28b105d 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 /* Number of supported device status */
 #define BBDEV_DEV_STATUS_COUNT 9
 
@@ -854,6 +854,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -877,6 +880,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1126,6 +1133,8 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_DEC",
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
+		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ac941d6..ed528b8 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..3e46f1d 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 9ac3643..efae50b 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,4 +44,8 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 6/7] bbdev: add queue related warning and status information
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (4 preceding siblings ...)
  2022-07-06  0:23         ` [PATCH v4 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 18:57           ` Tom Rix
  2022-07-06  0:23         ` [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation Nicolas Chautru
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

This allows to expose more information with regards to any
queue related failure and warning which cannot be supported
in existing API.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c |  2 ++
 lib/bbdev/rte_bbdev.c            | 21 +++++++++++++++++++++
 lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map            |  1 +
 4 files changed, 58 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 1abda2d..653b21f 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
 	stats->dequeued_count = q_stats->dequeued_count;
 	stats->enqueue_err_count = q_stats->enqueue_err_count;
 	stats->dequeue_err_count = q_stats->dequeue_err_count;
+	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
+	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
 	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
 	return 0;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 28b105d..ddad464 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -27,6 +27,8 @@
 #define BBDEV_OP_TYPE_COUNT 6
 /* Number of supported device status */
 #define BBDEV_DEV_STATUS_COUNT 9
+/* Number of supported enqueue status */
+#define BBDEV_ENQ_STATUS_COUNT 4
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -723,6 +725,8 @@ struct rte_bbdev *
 		stats->dequeued_count += q_stats->dequeued_count;
 		stats->enqueue_err_count += q_stats->enqueue_err_count;
 		stats->dequeue_err_count += q_stats->dequeue_err_count;
+		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
 	}
 	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
@@ -1165,3 +1169,20 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid device status");
 	return NULL;
 }
+
+const char *
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
+{
+	static const char * const enq_sta_string[] = {
+		"RTE_BBDEV_ENQ_STATUS_NONE",
+		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
+		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
+		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
+	};
+
+	if (status < BBDEV_ENQ_STATUS_COUNT)
+		return enq_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid enqueue status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ed528b8..b7ecf94 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
 /**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consumed all requested operations
+ * In case of multiple reasons the latter superdes a previous one
+ */
+enum rte_bbdev_enqueue_status {
+	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
+	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
+	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
+	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
+	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
+};
+
+/**
  * Flags indicate the status of the device
  */
 enum rte_bbdev_device_status {
@@ -246,6 +259,12 @@ struct rte_bbdev_stats {
 	uint64_t enqueue_err_count;
 	/** Total error count on operations dequeued */
 	uint64_t dequeue_err_count;
+	/** Total warning count on operations enqueued */
+	uint64_t enqueue_warn_count;
+	/** Total warning count on operations dequeued */
+	uint64_t dequeue_warn_count;
+	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
+	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
 	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
 	 *  the enqueue request to its internal queues.
 	 *  - For a HW device this is the cycles consumed in MMIO write
@@ -386,6 +405,7 @@ struct rte_bbdev_queue_data {
 	void *queue_private;  /**< Driver-specific per-queue data */
 	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
 	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
+	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
 	bool started;  /**< Queue state */
 };
 
@@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 const char*
 rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
 
+/**
+ * Converts queue status from enum to string
+ *
+ * @param status
+ *   Queue status as enum
+ *
+ * @returns
+ *  Queue status as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index efae50b..1c06738 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,6 +44,7 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_status_str;
 	rte_bbdev_enqueue_fft_ops;
 	rte_bbdev_dequeue_fft_ops;
 	rte_bbdev_fft_op_alloc_bulk;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (5 preceding siblings ...)
  2022-07-06  0:23         ` [PATCH v4 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-07-06  0:23         ` Nicolas Chautru
  2022-07-06 19:01           ` Tom Rix
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06  0:23 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Locking is not explicitly required but can be valuable
in case the application cannot guarantee to be thread-safe,
or specifically is at risk of using the same queue from multiple threads.
This is an option for PMD to use this.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b7ecf94..8e7ca86 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -407,6 +407,8 @@ struct rte_bbdev_queue_data {
 	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
 	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
 	bool started;  /**< Queue state */
+	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
+	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */
 };
 
 /** @internal Enqueue encode operations for processing on queue of a device. */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 1/7] bbdev: allow operation type enum for growth
  2022-07-06  0:23         ` [PATCH v4 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-07-06 12:50           ` Tom Rix
  2022-07-06 21:20             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 12:50 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Updating the enum for rte_bbdev_op_type
> to allow to keep ABI compatible for enum insertion
> while adding padded maximum value for array need.
> Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
> RTE_BBDEV_OP_TYPE_PADDED_MAX.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   app/test-bbdev/test_bbdev.c      | 2 +-
>   app/test-bbdev/test_bbdev_perf.c | 4 ++--
>   examples/bbdev_app/main.c        | 2 +-
>   lib/bbdev/rte_bbdev.c            | 9 +++++----
>   lib/bbdev/rte_bbdev_op.h         | 2 +-
>   5 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
> index ac06d73..1063f6e 100644
> --- a/app/test-bbdev/test_bbdev.c
> +++ b/app/test-bbdev/test_bbdev.c
> @@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
>   	rte_mempool_free(mp);
>   
>   	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
> -			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
> +			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
>   			"Failed test for rte_bbdev_op_pool_create: "
>   			"returned value is not NULL for invalid type");
>   
> diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
> index fad3b1e..1abda2d 100644
> --- a/app/test-bbdev/test_bbdev_perf.c
> +++ b/app/test-bbdev/test_bbdev_perf.c
> @@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
>   
>   	/* Find capabilities */
>   	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
> -	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
> +	do {
>   		if (cap->type == test_vector.op_type) {
>   			capabilities = cap;
>   			break;
>   		}
>   		cap++;
> -	}
> +	} while (cap->type != RTE_BBDEV_OP_NONE);
>   	TEST_ASSERT_NOT_NULL(capabilities,
>   			"Couldn't find capabilities");
>   
> diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
> index fc7e8b8..ef0ba76 100644
> --- a/examples/bbdev_app/main.c
> +++ b/examples/bbdev_app/main.c
> @@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
>   	void *sigret;
>   	struct app_config_params app_params = def_app_config;
>   	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
> -	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
> +	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>   	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
>   	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
>   	struct stats_lcore_params stats_lcore;
> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index aaee7b7..22bd894 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -23,6 +23,8 @@
>   
>   #define DEV_NAME "BBDEV"
>   
> +/* Number of supported operation types */
> +#define BBDEV_OP_TYPE_COUNT 5
>   
>   /* BBDev library logging ID */
>   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
> @@ -890,10 +892,10 @@ struct rte_mempool *
>   		return NULL;
>   	}
>   
> -	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
> +	if (type >= BBDEV_OP_TYPE_COUNT) {
>   		rte_bbdev_log(ERR,
>   				"Invalid op type (%u), should be less than %u",
> -				type, RTE_BBDEV_OP_TYPE_COUNT);
> +				type, BBDEV_OP_TYPE_COUNT);
>   		return NULL;
>   	}
>   
> @@ -1122,10 +1124,9 @@ struct rte_mempool *
>   		"RTE_BBDEV_OP_TURBO_DEC",
>   		"RTE_BBDEV_OP_TURBO_ENC",
>   		"RTE_BBDEV_OP_LDPC_DEC",
> -		"RTE_BBDEV_OP_LDPC_ENC",
>   	};
>   
> -	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
> +	if (op_type < BBDEV_OP_TYPE_COUNT)
>   		return op_types[op_type];
>   
>   	rte_bbdev_log(ERR, "Invalid operation type");
> diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
> index 6d56133..cd82418 100644
> --- a/lib/bbdev/rte_bbdev_op.h
> +++ b/lib/bbdev/rte_bbdev_op.h
> @@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
>   	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
>   	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
>   	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
> -	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */

Why not keep this enum so you don't have to make the BBDEV_OP_TYPE_COUNT 
#define ?

Tom

> +	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
>   };
>   
>   /** Bit indexes of possible errors reported through status field */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 2/7] bbdev: add device status info
  2022-07-06  0:23         ` [PATCH v4 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-07-06 15:38           ` Tom Rix
  2022-07-06 21:16             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 15:38 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Added device status information, so that the PMD can
> expose information related to the underlying accelerator device status.
> Minor order change in structure to fit into padding hole.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>   drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>   drivers/baseband/null/bbdev_null.c                 |  1 +
>   drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>   lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
>   lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>   lib/bbdev/version.map                              |  6 ++++
>   9 files changed, 69 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index de7e4bc..17ba798 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -1060,6 +1060,7 @@
>   
>   	/* Read and save the populated config from ACC100 registers */
>   	fetch_acc100_config(dev);
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* This isn't ideal because it reports the maximum number of queues but
>   	 * does not provide info on how many can be uplink/downlink or different
> diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> index 82ae6ba..57b12af 100644
> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> @@ -369,6 +369,7 @@
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* Calculates number of queues assigned to device */
>   	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> index 21d3529..2a330c4 100644
> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* Calculates number of queues assigned to device */
>   	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
> index 4d1bd16..c1f88c6 100644
> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->min_alignment = 64;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>   }
> diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
> index 248e129..94a1976 100644
> --- a/drivers/baseband/null/bbdev_null.c
> +++ b/drivers/baseband/null/bbdev_null.c
> @@ -82,6 +82,7 @@ struct bbdev_queue {
>   	 * here for code completeness.
>   	 */
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>   }
> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> index af7bc41..dbc5524 100644
> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>   	dev_info->min_alignment = 64;
>   	dev_info->harq_buffer_size = 0;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
>   }
> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index 22bd894..555bda9 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -25,6 +25,8 @@
>   
>   /* Number of supported operation types */
>   #define BBDEV_OP_TYPE_COUNT 5
> +/* Number of supported device status */
> +#define BBDEV_DEV_STATUS_COUNT 9
>   
>   /* BBDev library logging ID */
>   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
> @@ -1132,3 +1134,25 @@ struct rte_mempool *
>   	rte_bbdev_log(ERR, "Invalid operation type");
>   	return NULL;
>   }
> +
> +const char *
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
> +{
> +	static const char * const dev_sta_string[] = {
> +		"RTE_BBDEV_DEV_NOSTATUS",
> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> +		"RTE_BBDEV_DEV_RESET",
> +		"RTE_BBDEV_DEV_CONFIGURED",
> +		"RTE_BBDEV_DEV_ACTIVE",
> +		"RTE_BBDEV_DEV_FATAL_ERR",
> +		"RTE_BBDEV_DEV_RESTART_REQ",
> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> +	};
> +
> +	if (status < BBDEV_DEV_STATUS_COUNT)
> +		return dev_sta_string[status];
> +
> +	rte_bbdev_log(ERR, "Invalid device status");
> +	return NULL;
> +}
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index b88c881..9b1ffa4 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>   int
>   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>   
> +/**
> + * Flags indicate the status of the device
> + */
> +enum rte_bbdev_device_status {
> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
If this was 0, you may not need to explicitly set.
> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
Last patch was padded, do something consistent here.
> +};
> +
>   /** Device statistics. */
>   struct rte_bbdev_stats {
>   	uint64_t enqueued_count;  /**< Count of all operations enqueued */
> @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>   	/** Set if device supports per-queue interrupts */
>   	bool queue_intr_supported;
>   	/** Minimum alignment of buffers, in bytes */
> -	uint16_t min_alignment;
> -	/** HARQ memory available in kB */
> +	/** Device Status */
> +	enum rte_bbdev_device_status device_status;

New elements should be added to the end to improve backward compatibility.

Tom

>   	uint32_t harq_buffer_size;
>   	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
>   	 *  for input/output data
>   	 */
> +	uint16_t min_alignment;
> +	/** HARQ memory available in kB */
>   	uint8_t data_endianness;
>   	/** Default queue configuration used if none is supplied  */
>   	struct rte_bbdev_queue_conf default_queue_conf;
> @@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
>   rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
>   		void *data);
>   
> +/**
> + * Converts device status from enum to string
> + *
> + * @param status
> + *   Device status as enum
> + *
> + * @returns
> + *   Operation type as string or NULL if op_type is invalid
> + *
> + */
> +__rte_experimental
> +const char*
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
> index cce3f3c..9ac3643 100644
> --- a/lib/bbdev/version.map
> +++ b/lib/bbdev/version.map
> @@ -39,3 +39,9 @@ DPDK_22 {
>   
>   	local: *;
>   };
> +
> +EXPERIMENTAL {
> +	global:
> +
> +	rte_bbdev_device_status_str;
> +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-06  0:23         ` [PATCH v4 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-07-06 16:06           ` Tom Rix
  2022-07-06 21:12             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 16:06 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Adding more options in the API to expose the number
> of queues exposed and related priority.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   lib/bbdev/rte_bbdev.h | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index 9b1ffa4..ac941d6 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
>   
>   	/** Maximum number of queues supported by the device */
>   	unsigned int max_num_queues;
> +	/** Maximum number of queues supported per operation type */
> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> +	/** Priority level supported per operation type */
> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];

It is better to add new elements to the end of a structure for better 
backward compatiblity

Tom

>   	/** Queue size limit (queue size must also be power of 2) */
>   	uint32_t queue_size_lim;
>   	/** Set if device off-loads operation to hardware  */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-06  0:23         ` [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-07-06 16:15           ` Tom Rix
  2022-07-06 21:10             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 16:15 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Add support in existing bbdev PMDs for the explicit number of queue
> and priority for each operation type configured on the device.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
>   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
>   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
>   drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
>   drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
>   5 files changed, 51 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 17ba798..d568d0d 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -966,6 +966,7 @@
>   		struct rte_bbdev_driver_info *dev_info)
>   {
>   	struct acc100_device *d = dev->data->dev_private;
> +	int i;
>   
>   	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>   		{
> @@ -1062,19 +1063,23 @@
>   	fetch_acc100_config(dev);
>   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
> -	/* This isn't ideal because it reports the maximum number of queues but
> -	 * does not provide info on how many can be uplink/downlink or different
> -	 * priorities
> -	 */
> -	dev_info->max_num_queues =
> -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> -			d->acc100_conf.q_dl_5g.num_qgroups +
> -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> -			d->acc100_conf.q_ul_5g.num_qgroups +
> -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> -			d->acc100_conf.q_dl_4g.num_qgroups +
> -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> +	/* Expose number of queues */
> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>   			d->acc100_conf.q_ul_4g.num_qgroups;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_4g.num_qgroups;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_ul_5g.num_qgroups;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_5g.num_qgroups;
> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
> +	dev_info->max_num_queues = 0;
> +	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC; i++)

should this be i <=  ?


> +		dev_info->max_num_queues += dev_info->num_queues[i];
>   	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
>   	dev_info->hardware_accelerated = true;
>   	dev_info->max_dl_queue_priority =
> diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> index 57b12af..b4982af 100644
> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> @@ -379,6 +379,14 @@
>   		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>   			dev_info->max_num_queues++;
>   	}
> +	/* Expose number of queue per operation type */
> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>   }
>   
>   /**
> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> index 2a330c4..dc7f479 100644
> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
>   		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>   			dev_info->max_num_queues++;
>   	}
> +	/* Expose number of queue per operation type */
> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
>   }
>   
>   /**
> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
> index c1f88c6..e99ea9a 100644
> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
>   	dev_info->min_alignment = 64;
>   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>   }
>   
> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> index dbc5524..647e706 100644
> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> @@ -256,6 +256,17 @@ struct turbo_sw_queue {
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
> +	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;

Should this be done through dev instead of assigning directly ?

Tom

> +	int num_op_type = 0;
> +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> +		num_op_type++;
> +	op_cap = bbdev_capabilities;
> +	if (num_op_type > 0) {
> +		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
> +		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> +			dev_info->num_queues[op_cap->type] = num_queue_per_type;
> +	}
> +
>   	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
>   }
>   


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-06  0:23         ` [PATCH v4 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-07-06 18:47           ` Tom Rix
  2022-07-06 21:04             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 18:47 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Extension of bbdev operation to support FFT based operations.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
>   doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
>   lib/bbdev/rte_bbdev.c           |  11 ++-
>   lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
>   lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
>   lib/bbdev/version.map           |   4 ++
>   5 files changed, 369 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
> index 70fa01a..4a055b5 100644
> --- a/doc/guides/prog_guide/bbdev.rst
> +++ b/doc/guides/prog_guide/bbdev.rst
> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
>   showing the Turbo decoding of CBs using BBDEV interface in TB-mode
>   is also valid for LDPC decode.
>   
> +BBDEV FFT Operation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
> +These can be used in a modular fashion (using bypass modes) or as a processing pipeline
> +which can be used for FFT-based baseband signal processing.
> +In more details it allows :
> +- to process the data first through an IDFT of adjustable size and padding;
> +- to perform the windowing as a programmable cyclic shift offset of the data followed by a
> +pointwise multiplication by a time domain window;
> +- to process the related data through a DFT of adjustable size and depadding for each such cyclic
> +shift output.
> +
> +A flexible number of Rx antennas are being processed in parallel with the same configuration.
> +The API allows more generally for flexibility in what the PMD may support (cabability flags) and
> +flexibility to adjust some of the parameters of the processing.
> +
> +The operation/capability flags that can be set for each FFT operation are given below.
> +
> +  **NOTE:** The actual operation flags that may be used with a specific
> +  BBDEV PMD are dependent on the driver capabilities as reported via
> +  ``rte_bbdev_info_get()``, and may be a subset of those below.
> +
> ++--------------------------------------------------------------------+
> +|Description of FFT capability flags                                 |
> ++====================================================================+
> +|RTE_BBDEV_FFT_WINDOWING                                             |
> +| Set to enable/support windowing in time domain                     |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
> +| Set to enable/support  the cyclic shift time offset adjustment     |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_DFT_BYPASS                                            |
> +| Set to bypass the DFT and use directly the IDFT as an option       |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
> +| Set to bypass the IDFT and use directly the DFT as an option       |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
> +| Set to bypass the time domain windowing  as an option              |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_POWER_MEAS

Other flags are not truncated, should be

RTE_BBDEV_FFT_POWER_MEASUREMENT

>                                              |
> +| Set to provide an optional power measument of the DFT output       |
> ++--------------------------------------------------------------------+
measurement
> +|RTE_BBDEV_FFT_FP16_INPUT                                            |
> +| Set if the input data shall use FP16 format instead of INT16       |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
> +| Set if the output data shall use FP16 format instead of INT16      |
> ++--------------------------------------------------------------------+
> +
> +The structure passed for each FFT operation is given below,
> +with the operation flags forming a bitmask in the ``op_flags`` field.
> +
> +.. code-block:: c
> +
> +    struct rte_bbdev_op_fft {
> +        struct rte_bbdev_op_data base_input;
> +        struct rte_bbdev_op_data base_output;
> +        struct rte_bbdev_op_data power_meas_output;
similar to above, meas -> measurement
> +        uint32_t op_flags;
> +        uint16_t input_sequence_size;
Could these be future proofed by increasing small int size's to uint32_t ?
> +        uint16_t input_leading_padding;
> +        uint16_t output_sequence_size;
> +        uint16_t output_leading_depadding;
> +        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> +        uint16_t cs_bitmap;
> +        uint8_t num_antennas_log2;
> +        uint8_t idft_log2;
> +        uint8_t dft_log2;
is _log2 needed in variable name if it is documenation ?
> +        int8_t cs_time_adjustment;
> +        int8_t idft_shift;
> +        int8_t dft_shift;
> +        uint16_t ncs_reciprocal;
> +        uint16_t power_shift;
> +        uint16_t fp16_exp_adjust;
> +    };
> +
> +The FFT parameters are set out in the table below.
> +
> ++----------------------+--------------------------------------------------------------+
> +|Parameter             |Description                                                   |
> ++======================+==============================================================+
> +|base_input            |input data                                                    |
> ++----------------------+--------------------------------------------------------------+
> +|base_output           |output data                                                   |
> ++----------------------+--------------------------------------------------------------+
> +|power_meas_output     |optional output data with power measurement on DFT output     |
> ++----------------------+--------------------------------------------------------------+
> +|op_flags              |bitmask of all active operation capabilities                  |
> ++----------------------+--------------------------------------------------------------+
> +|input_sequence_size   |size of the input sequence in 32-bits points per antenna      |
> ++----------------------+--------------------------------------------------------------+
> +|input_leading_padding |number of points padded at the start of input data            |
> ++----------------------+--------------------------------------------------------------+
> +|output_sequence_size  |size of the output sequence per antenna and cyclic shift      |
> ++----------------------+--------------------------------------------------------------+
> +|output_depadding      |number of points depadded at the start of output data         |
> ++----------------------+--------------------------------------------------------------+
output_leading_depadding
> +|window_index          |optional windowing profile index used for each cyclic shift   |
> ++----------------------+--------------------------------------------------------------+
> +|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for index 0) |
> ++----------------------+--------------------------------------------------------------+
> +|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)             |
> ++----------------------+--------------------------------------------------------------+
> +|idft_log2             |iDFT size as a log2                                           |
> ++----------------------+--------------------------------------------------------------+
> +|dft_log2              |DFT size as a log2                                            |
> ++----------------------+--------------------------------------------------------------+
> +|cs_time_adjustment    |adjustment of time position of all the cyclic shift output    |
> ++----------------------+--------------------------------------------------------------+
> +|idft_shift            |shift down of signal level post iDFT                          |
> ++----------------------+--------------------------------------------------------------+
> +|dft_shift             |shift down of signal level post DFT                           |
> ++----------------------+--------------------------------------------------------------+
> +|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
> ++----------------------+--------------------------------------------------------------+
> +|power_shift           |shift down of level of power measurement when enabled         |
> ++----------------------+--------------------------------------------------------------+
> +|fp16_exp_adjust       |value added to FP16 exponent at conversion from INT16         |
> ++----------------------+--------------------------------------------------------------+
> +
> +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
> +incoming data for the processing. Its size may not fit into an actual mbuf, but the
> +structure is used to pass iova address.
> +The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
> +Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
> +supported.
> +The data layout is based on contiguous concatenation of output data first by cyclic shift then
> +by antenna.
>   
>   Sample code
>   -----------
> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index 555bda9..28b105d 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -24,7 +24,7 @@
>   #define DEV_NAME "BBDEV"
>   
>   /* Number of supported operation types */
> -#define BBDEV_OP_TYPE_COUNT 5
> +#define BBDEV_OP_TYPE_COUNT 6
>   /* Number of supported device status */
>   #define BBDEV_DEV_STATUS_COUNT 9
>   
> @@ -854,6 +854,9 @@ struct rte_bbdev *
>   	case RTE_BBDEV_OP_LDPC_ENC:
>   		result = sizeof(struct rte_bbdev_enc_op);
>   		break;
> +	case RTE_BBDEV_OP_FFT:
> +		result = sizeof(struct rte_bbdev_fft_op);
> +		break;
>   	default:
>   		break;
>   	}
> @@ -877,6 +880,10 @@ struct rte_bbdev *
>   		struct rte_bbdev_enc_op *op = element;
>   		memset(op, 0, mempool->elt_size);
>   		op->mempool = mempool;
> +	} else if (type == RTE_BBDEV_OP_FFT) {
> +		struct rte_bbdev_fft_op *op = element;
> +		memset(op, 0, mempool->elt_size);
> +		op->mempool = mempool;
>   	}
>   }
>   
> @@ -1126,6 +1133,8 @@ struct rte_mempool *
>   		"RTE_BBDEV_OP_TURBO_DEC",
>   		"RTE_BBDEV_OP_TURBO_ENC",
>   		"RTE_BBDEV_OP_LDPC_DEC",
> +		"RTE_BBDEV_OP_LDPC_ENC",
Why ldpc_enc line, this is already in codebase ?
> +		"RTE_BBDEV_OP_FFT",
>   	};
>   
>   	if (op_type < BBDEV_OP_TYPE_COUNT)
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index ac941d6..ed528b8 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
>   		struct rte_bbdev_dec_op **ops,
>   		uint16_t num);
>   
> +/** @internal Enqueue fft operations for processing on queue of a device. */
> +typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
> +		struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_fft_op **ops,
> +		uint16_t num);
> +
>   /** @internal Dequeue encode operations from a queue of a device. */
>   typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
>   		struct rte_bbdev_queue_data *q_data,
> @@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
>   		struct rte_bbdev_queue_data *q_data,
>   		struct rte_bbdev_dec_op **ops, uint16_t num);
>   
> +/** @internal Dequeue fft operations from a queue of a device. */
> +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
> +		struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_fft_op **ops, uint16_t num);
> +
>   #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
>   
>   /**
> @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
>   	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
>   	/** Dequeue decode function */
>   	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
> +	/** Enqueue FFT function */
> +	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
> +	/** Dequeue FFT function */
> +	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
>   	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
>   	struct rte_bbdev_data *data;  /**< Pointer to device data */
>   	enum rte_bbdev_state state;  /**< If device is currently used or not */
> @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
>   	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
>   }
>   
> +/**
> + * Enqueue a burst of fft operations to a queue of the device.
> + * This functions only enqueues as many operations as currently possible and
> + * does not block until @p num_ops entries in the queue are available.
> + * This function does not provide any error notification to avoid the
> + * corresponding overhead.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_id
> + *   The index of the queue.
> + * @param ops
> + *   Pointer array containing operations to be enqueued Must have at least
> + *   @p num_ops entries
> + * @param num_ops
> + *   The maximum number of operations to enqueue.
> + *
> + * @return
> + *   The number of operations actually enqueued (this is the number of processed
> + *   entries in the @p ops array).
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
> +{
> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
Who checks the input is valid ?
> +	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
> +	return dev->enqueue_fft_ops(q_data, ops, num_ops);
> +}
>   
>   /**
>    * Dequeue a burst of processed encode operations from a queue of the device.
> @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
>   	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
>   }
>   
> +/**
> + * Dequeue a burst of fft operations from a queue of the device.
> + * This functions returns only the current contents of the queue, and does not
> + * block until @ num_ops is available.
> + * This function does not provide any error notification to avoid the
> + * corresponding overhead.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param queue_id
> + *   The index of the queue.
> + * @param ops
> + *   Pointer array where operations will be dequeued to. Must have at least
> + *   @p num_ops entries
> + * @param num_ops
> + *   The maximum number of operations to dequeue.
> + *
> + * @return
> + *   The number of operations actually dequeued (this is the number of entries
> + *   copied into the @p ops array).
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
> +{
> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
> +	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
> +	return dev->dequeue_fft_ops(q_data, ops, num_ops);
> +}
> +
>   /** Definitions of device event types */
>   enum rte_bbdev_event_type {
>   	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
> diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
> index cd82418..3e46f1d 100644
> --- a/lib/bbdev/rte_bbdev_op.h
> +++ b/lib/bbdev/rte_bbdev_op.h
> @@ -47,6 +47,8 @@
>   #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
>   /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
>   #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
> +/* 12 CS maximum */
> +#define RTE_BBDEV_MAX_CS_2 (6)
>   
>   /** Flags for turbo decoder operation and capability structure */
>   enum rte_bbdev_op_td_flag_bitmasks {
> @@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
>   	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
>   };
>   
> +/** Flags for DFT operation and capability structure */
> +enum rte_bbdev_op_fft_flag_bitmasks {
> +	/** Flexible windowing capability */
> +	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
> +	/** Flexible adjustment of Cyclic Shift time offset */
> +	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
> +	/** Set for bypass the DFT and get directly into iDFT input */
> +	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
> +	/** Set for bypass the IDFT and get directly the DFT output */
> +	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
> +	/** Set for bypass time domain windowing */
> +	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
> +	/** Set for optional power measurement on DFT output */
> +	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
Meas here too, change generally
> +	/** Set if the input data used FP16 format */
> +	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),

What are the other data type(s) ?

The default is not mentioned, or i missed it.

> +	/**  Set if the output data uses FP16 format  */
> +	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
> +};
> +
>   /** Flags for the Code Block/Transport block mode  */
>   enum rte_bbdev_op_cb_mode {
>   	/** One operation is one or fraction of one transport block  */
> @@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
>   	};
>   };
>   
> +/** Operation structure for FFT processing.
> + *
> + * The operation processes the data for multiple antennas in a single call
> + * (.i.e for all the REs belonging to a given SRS sequence for instance)
> + *
> + * The output mbuf data structure is expected to be allocated by the
> + * application with enough room for the output data.
> + */
> +struct rte_bbdev_op_fft {
> +	/** Input data starting from first antenna */
> +	struct rte_bbdev_op_data base_input;
> +	/** Output data starting from first antenna and first cyclic shift */
> +	struct rte_bbdev_op_data base_output;
> +	/** Optional power measurement output data */
> +	struct rte_bbdev_op_data power_meas_output;
> +	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
> +	uint32_t op_flags;
> +	/** Input sequence size in 32-bits points */
> +	uint16_t input_sequence_size;
size is bytes*4 ? how does this work with fp16 ?
> +	/** Padding at the start of the sequence */
> +	uint16_t input_leading_padding;
> +	/** Output sequence size in 32-bits points */
> +	uint16_t output_sequence_size;
> +	/** Depadding at the start of the DFT output */
> +	uint16_t output_leading_depadding;
> +	/** Window index being used for each cyclic shift output */
> +	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> +	/** Bitmap of the cyclic shift output requested */
> +	uint16_t cs_bitmap;
> +	/** Number of antennas as a log2 – 8 to 128 */
> +	uint8_t num_antennas_log2;
> +	/** iDFT size as a log2 - 32 to 2048 */
> +	uint8_t idft_log2;
> +	/** DFT size as a log2 - 8 to 2048 */
> +	uint8_t dft_log2;
> +	/** Adjustment of position of the cyclic shifts - -31 to 31 */
> +	int8_t cs_time_adjustment;
> +	/** iDFT shift down */
> +	int8_t idft_shift;
> +	/** DFT shift down */
> +	int8_t dft_shift;
> +	/** NCS reciprocal factor  */
> +	uint16_t ncs_reciprocal;
> +	/** power measurement out shift down */
> +	uint16_t power_shift;
> +	/** Adjust the FP6 exponent for INT<->FP16 conversion */
> +	uint16_t fp16_exp_adjust;
> +};
> +
>   /** List of the capabilities for the Turbo Decoder */
>   struct rte_bbdev_op_cap_turbo_dec {
>   	/** Flags from rte_bbdev_op_td_flag_bitmasks */
> @@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
>   	uint16_t num_buffers_dst;
>   };
>   
> +/** List of the capabilities for the FFT */
> +struct rte_bbdev_op_cap_fft {
> +	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?
> +	uint32_t capability_flags;
> +	/** Num input code block buffers */
> +	uint16_t num_buffers_src;
> +	/** Num output code block buffers */
> +	uint16_t num_buffers_dst;
> +};
> +
>   /** Different operation types supported by the device */
>   enum rte_bbdev_op_type {
>   	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
> @@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
>   	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
>   	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
>   	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
> +	RTE_BBDEV_OP_FFT,  /**< FFT */
>   	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
>   };
>   
> @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
>   	};
>   };
>   
> +/** Structure specifying a single fft operation */
> +struct rte_bbdev_fft_op {
> +	/** Status of operation that was performed */
> +	int status;
> +	/** Mempool which op instance is in */
> +	struct rte_mempool *mempool;
> +	/** Opaque pointer for user data */
> +	void *opaque_data;
> +	/** Contains turbo decoder specific parameters */
> +	struct rte_bbdev_op_fft fft;
> +};
> +
>   /** Operation capabilities supported by a device */
>   struct rte_bbdev_op_cap {
>   	enum rte_bbdev_op_type type;  /**< Type of operation */
> @@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
>   		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
>   		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
>   		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
> +		struct rte_bbdev_op_cap_fft fft;
>   	} cap;  /**< Operation-type specific capabilities */
>   };
>   
> @@ -918,6 +1013,42 @@ struct rte_mempool *
>   }
>   
>   /**
> + * Bulk allocate fft operations from a mempool with parameter defaults reset.
> + *
> + * @param mempool
> + *   Operation mempool, created by rte_bbdev_op_pool_create().
> + * @param ops
> + *   Output array to place allocated operations
> + * @param num_ops
> + *   Number of operations to allocate
> + *
> + * @returns
> + *   - 0 on success
> + *   - EINVAL if invalid mempool is provided
> + */
> +__rte_experimental
> +static inline int
> +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
> +{
> +	struct rte_bbdev_op_pool_private *priv;
> +	int ret;
> +
> +	/* Check type */
> +	priv = (struct rte_bbdev_op_pool_private *)
> +			rte_mempool_get_priv(mempool);
> +	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
> +		return -EINVAL;
> +
> +	/* Get elements */
> +	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> +	if (unlikely(ret < 0))
> +		return ret;

if-check is not needed, just

return ret;

and drop the next line

Tom

> +
> +	return 0;
> +}
> +
> +/**
>    * Free decode operation structures that were allocated by
>    * rte_bbdev_dec_op_alloc_bulk().
>    * All structures must belong to the same mempool.
> @@ -951,6 +1082,24 @@ struct rte_mempool *
>   		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
>   }
>   
> +/**
> + * Free encode operation structures that were allocated by
> + * rte_bbdev_fft_op_alloc_bulk().
> + * All structures must belong to the same mempool.
> + *
> + * @param ops
> + *   Operation structures
> + * @param num_ops
> + *   Number of structures
> + */
> +__rte_experimental
> +static inline void
> +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
> +{
> +	if (num_ops > 0)
> +		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
> +}
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
> index 9ac3643..efae50b 100644
> --- a/lib/bbdev/version.map
> +++ b/lib/bbdev/version.map
> @@ -44,4 +44,8 @@ EXPERIMENTAL {
>   	global:
>   
>   	rte_bbdev_device_status_str;
> +	rte_bbdev_enqueue_fft_ops;
> +	rte_bbdev_dequeue_fft_ops;
> +	rte_bbdev_fft_op_alloc_bulk;
> +	rte_bbdev_fft_op_free_bulk;
>   };


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 6/7] bbdev: add queue related warning and status information
  2022-07-06  0:23         ` [PATCH v4 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-07-06 18:57           ` Tom Rix
  2022-07-06 20:34             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 18:57 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> This allows to expose more information with regards to any
> queue related failure and warning which cannot be supported
> in existing API.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   app/test-bbdev/test_bbdev_perf.c |  2 ++
>   lib/bbdev/rte_bbdev.c            | 21 +++++++++++++++++++++
>   lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
>   lib/bbdev/version.map            |  1 +
>   4 files changed, 58 insertions(+)
>
> diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
> index 1abda2d..653b21f 100644
> --- a/app/test-bbdev/test_bbdev_perf.c
> +++ b/app/test-bbdev/test_bbdev_perf.c
> @@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
>   	stats->dequeued_count = q_stats->dequeued_count;
>   	stats->enqueue_err_count = q_stats->enqueue_err_count;
>   	stats->dequeue_err_count = q_stats->dequeue_err_count;
> +	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
> +	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
>   	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
>   
>   	return 0;
> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index 28b105d..ddad464 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -27,6 +27,8 @@
>   #define BBDEV_OP_TYPE_COUNT 6
>   /* Number of supported device status */
>   #define BBDEV_DEV_STATUS_COUNT 9
> +/* Number of supported enqueue status */
> +#define BBDEV_ENQ_STATUS_COUNT 4
>   
>   /* BBDev library logging ID */
>   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
> @@ -723,6 +725,8 @@ struct rte_bbdev *
>   		stats->dequeued_count += q_stats->dequeued_count;
>   		stats->enqueue_err_count += q_stats->enqueue_err_count;
>   		stats->dequeue_err_count += q_stats->dequeue_err_count;
> +		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
> +		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
>   	}
>   	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
>   }
> @@ -1165,3 +1169,20 @@ struct rte_mempool *
>   	rte_bbdev_log(ERR, "Invalid device status");
>   	return NULL;
>   }
> +
> +const char *
> +rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
> +{
> +	static const char * const enq_sta_string[] = {
> +		"RTE_BBDEV_ENQ_STATUS_NONE",
> +		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
> +		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
> +		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
> +	};
> +
> +	if (status < BBDEV_ENQ_STATUS_COUNT)
Single use of #define, could just be an array size check and remove the 
#define
> +		return enq_sta_string[status];
> +
> +	rte_bbdev_log(ERR, "Invalid enqueue status");
> +	return NULL;
> +}
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index ed528b8..b7ecf94 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
>   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>   
>   /**
> + * Flags indicate the reason why a previous enqueue may not have
> + * consumed all requested operations
> + * In case of multiple reasons the latter superdes a previous one
> + */
> +enum rte_bbdev_enqueue_status {
> +	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
> +	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
> +	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
> +	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
> +	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
Pad to 8 like the other patch ?
> +};
> +
> +/**
>    * Flags indicate the status of the device
>    */
>   enum rte_bbdev_device_status {
> @@ -246,6 +259,12 @@ struct rte_bbdev_stats {
>   	uint64_t enqueue_err_count;
>   	/** Total error count on operations dequeued */
>   	uint64_t dequeue_err_count;
> +	/** Total warning count on operations enqueued */
> +	uint64_t enqueue_warn_count;
> +	/** Total warning count on operations dequeued */
> +	uint64_t dequeue_warn_count;
> +	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
> +	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
This element is not used in this patch, is it needed ?
>   	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
>   	 *  the enqueue request to its internal queues.
>   	 *  - For a HW device this is the cycles consumed in MMIO write
> @@ -386,6 +405,7 @@ struct rte_bbdev_queue_data {
>   	void *queue_private;  /**< Driver-specific per-queue data */
>   	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
>   	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
> +	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */

This element is not used in this patch, is it needed ?

Tom

>   	bool started;  /**< Queue state */
>   };
>   
> @@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
>   const char*
>   rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
>   
> +/**
> + * Converts queue status from enum to string
> + *
> + * @param status
> + *   Queue status as enum
> + *
> + * @returns
> + *  Queue status as string or NULL if op_type is invalid
> + *
> + */
> +__rte_experimental
> +const char*
> +rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
> index efae50b..1c06738 100644
> --- a/lib/bbdev/version.map
> +++ b/lib/bbdev/version.map
> @@ -44,6 +44,7 @@ EXPERIMENTAL {
>   	global:
>   
>   	rte_bbdev_device_status_str;
> +	rte_bbdev_enqueue_status_str;
>   	rte_bbdev_enqueue_fft_ops;
>   	rte_bbdev_dequeue_fft_ops;
>   	rte_bbdev_fft_op_alloc_bulk;


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-07-06  0:23         ` [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation Nicolas Chautru
@ 2022-07-06 19:01           ` Tom Rix
  2022-07-06 19:20             ` Stephen Hemminger
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-06 19:01 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, bruce.richardson, david.marchand, stephen


On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> Locking is not explicitly required but can be valuable
> in case the application cannot guarantee to be thread-safe,
> or specifically is at risk of using the same queue from multiple threads.
> This is an option for PMD to use this.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   lib/bbdev/rte_bbdev.h | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index b7ecf94..8e7ca86 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -407,6 +407,8 @@ struct rte_bbdev_queue_data {
>   	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
>   	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
>   	bool started;  /**< Queue state */
> +	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
> +	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */

No.

This is a good idea but needs a use before introducing another element, 
particularly a complicated one like locking

Tom

>   };
>   
>   /** @internal Enqueue encode operations for processing on queue of a device. */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-07-06 19:01           ` Tom Rix
@ 2022-07-06 19:20             ` Stephen Hemminger
  2022-07-06 20:21               ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Stephen Hemminger @ 2022-07-06 19:20 UTC (permalink / raw)
  To: Tom Rix
  Cc: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal,
	maxime.coquelin, mdr, bruce.richardson, david.marchand

On Wed, 6 Jul 2022 12:01:19 -0700
Tom Rix <trix@redhat.com> wrote:

> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Locking is not explicitly required but can be valuable
> > in case the application cannot guarantee to be thread-safe,
> > or specifically is at risk of using the same queue from multiple threads.
> > This is an option for PMD to use this.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   lib/bbdev/rte_bbdev.h | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> > index b7ecf94..8e7ca86 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -407,6 +407,8 @@ struct rte_bbdev_queue_data {
> >   	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
> >   	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
> >   	bool started;  /**< Queue state */
> > +	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
> > +	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */  
> 
> No.
> 
> This is a good idea but needs a use before introducing another element, 
> particularly a complicated one like locking
> 
> Tom

Having two locks on same cacheline will create lots of ping/pong false sharing.

Also, unless the reader is holding the lock for a significant fraction of the time
a regular spin lock will be faster.

^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-07-06 19:20             ` Stephen Hemminger
@ 2022-07-06 20:21               ` Chautru, Nicolas
  2022-07-07 12:47                 ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 20:21 UTC (permalink / raw)
  To: Stephen Hemminger, Tom Rix
  Cc: dev, thomas, gakhil, hemant.agrawal, maxime.coquelin, mdr,
	Richardson, Bruce, david.marchand

Hi Stephen, Tom., 

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> 
> On Wed, 6 Jul 2022 12:01:19 -0700
> Tom Rix <trix@redhat.com> wrote:
> 
> > On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > > Locking is not explicitly required but can be valuable in case the
> > > application cannot guarantee to be thread-safe, or specifically is
> > > at risk of using the same queue from multiple threads.
> > > This is an option for PMD to use this.
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > ---
> > >   lib/bbdev/rte_bbdev.h | 2 ++
> > >   1 file changed, 2 insertions(+)
> > >
> > > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > > b7ecf94..8e7ca86 100644
> > > --- a/lib/bbdev/rte_bbdev.h
> > > +++ b/lib/bbdev/rte_bbdev.h
> > > @@ -407,6 +407,8 @@ struct rte_bbdev_queue_data {
> > >   	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
> > >   	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue
> status when op is rejected */
> > >   	bool started;  /**< Queue state */
> > > +	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
> > > +	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */
> >
> > No.
> >
> > This is a good idea but needs a use before introducing another
> > element, particularly a complicated one like locking
> >
> > Tom

The actual usage would be implemented within the PMD. Basically this to prevent the corner case when a queue is being accessed from multiple thread for which there is no protection in DPDK (but application does not necessarily behaves well). 
In normal operation there would never be a case when there is a conflict on the lock.
This is not something which was considered for any other PMD?
From DPDK doc : "If multiple threads are to use the same hardware queue on the same NIC port, then locking, or some other form of mutual exclusion, is necessary."
Basically for AC100 we would purely enforce the lock for any enqueue/dequeue operation for a given queue (distinct lock for enqueue and dequeue, since these would run on different threads).  

> Having two locks on same cacheline will create lots of ping/pong false sharing.

You would recommend to purely spread them within the structure? Or something else? 
 
> Also, unless the reader is holding the lock for a significant fraction of the time a
> regular spin lock will be faster.

OK Thanks. It should in principle never have to wait for the lock for the usage above, only to catch misbehaving application risk. 

Nic



^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 6/7] bbdev: add queue related warning and status information
  2022-07-06 18:57           ` Tom Rix
@ 2022-07-06 20:34             ` Chautru, Nicolas
  0 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 20:34 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > This allows to expose more information with regards to any queue
> > related failure and warning which cannot be supported in existing API.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   app/test-bbdev/test_bbdev_perf.c |  2 ++
> >   lib/bbdev/rte_bbdev.c            | 21 +++++++++++++++++++++
> >   lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
> >   lib/bbdev/version.map            |  1 +
> >   4 files changed, 58 insertions(+)
> >
> > diff --git a/app/test-bbdev/test_bbdev_perf.c
> > b/app/test-bbdev/test_bbdev_perf.c
> > index 1abda2d..653b21f 100644
> > --- a/app/test-bbdev/test_bbdev_perf.c
> > +++ b/app/test-bbdev/test_bbdev_perf.c
> > @@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct
> active_device *ad,
> >   	stats->dequeued_count = q_stats->dequeued_count;
> >   	stats->enqueue_err_count = q_stats->enqueue_err_count;
> >   	stats->dequeue_err_count = q_stats->dequeue_err_count;
> > +	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
> > +	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
> >   	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
> >
> >   	return 0;
> > diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> > 28b105d..ddad464 100644
> > --- a/lib/bbdev/rte_bbdev.c
> > +++ b/lib/bbdev/rte_bbdev.c
> > @@ -27,6 +27,8 @@
> >   #define BBDEV_OP_TYPE_COUNT 6
> >   /* Number of supported device status */
> >   #define BBDEV_DEV_STATUS_COUNT 9
> > +/* Number of supported enqueue status */ #define
> > +BBDEV_ENQ_STATUS_COUNT 4
> >
> >   /* BBDev library logging ID */
> >   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -723,6 +725,8
> @@
> > struct rte_bbdev *
> >   		stats->dequeued_count += q_stats->dequeued_count;
> >   		stats->enqueue_err_count += q_stats->enqueue_err_count;
> >   		stats->dequeue_err_count += q_stats->dequeue_err_count;
> > +		stats->enqueue_warn_count += q_stats-
> >enqueue_warn_count;
> > +		stats->dequeue_warn_count += q_stats-
> >dequeue_warn_count;
> >   	}
> >   	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
> >   }
> > @@ -1165,3 +1169,20 @@ struct rte_mempool *
> >   	rte_bbdev_log(ERR, "Invalid device status");
> >   	return NULL;
> >   }
> > +
> > +const char *
> > +rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status) {
> > +	static const char * const enq_sta_string[] = {
> > +		"RTE_BBDEV_ENQ_STATUS_NONE",
> > +		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
> > +		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
> > +		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
> > +	};
> > +
> > +	if (status < BBDEV_ENQ_STATUS_COUNT)
> Single use of #define, could just be an array size check and remove the #define

Thanks, good point. 

> > +		return enq_sta_string[status];
> > +
> > +	rte_bbdev_log(ERR, "Invalid enqueue status");
> > +	return NULL;
> > +}
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > ed528b8..b7ecf94 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
> >   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >
> >   /**
> > + * Flags indicate the reason why a previous enqueue may not have
> > + * consumed all requested operations
> > + * In case of multiple reasons the latter superdes a previous one  */
> > +enum rte_bbdev_enqueue_status {
> > +	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
> > +	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in
> queue */
> > +	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in
> ring */
> > +	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was
> rejected as invalid */
> > +	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq
> status number including padding */
> Pad to 8 like the other patch ?

It doesn't have to be the same number, just a bit of room for growth. 

> > +};
> > +
> > +/**
> >    * Flags indicate the status of the device
> >    */
> >   enum rte_bbdev_device_status {
> > @@ -246,6 +259,12 @@ struct rte_bbdev_stats {
> >   	uint64_t enqueue_err_count;
> >   	/** Total error count on operations dequeued */
> >   	uint64_t dequeue_err_count;
> > +	/** Total warning count on operations enqueued */
> > +	uint64_t enqueue_warn_count;
> > +	/** Total warning count on operations dequeued */
> > +	uint64_t dequeue_warn_count;
> > +	/** Total enqueue status count based on rte_bbdev_enqueue_status
> enum */
> > +	uint64_t
> enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
> This element is not used in this patch, is it needed ?
> >   	/** CPU cycles consumed by the (HW/SW) accelerator device to
> offload
> >   	 *  the enqueue request to its internal queues.
> >   	 *  - For a HW device this is the cycles consumed in MMIO write @@
> > -386,6 +405,7 @@ struct rte_bbdev_queue_data {
> >   	void *queue_private;  /**< Driver-specific per-queue data */
> >   	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
> >   	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
> > +	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue
> status
> > +when op is rejected */
> 
> This element is not used in this patch, is it needed ?


This is exposed to both PMD and application through this commit at queue granularity. Said otherwise the PMD would then set this based on events detected in the PMD implementation. 


> 
> Tom
> 
> >   	bool started;  /**< Queue state */
> >   };
> >
> > @@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
> >   const char*
> >   rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> >
> > +/**
> > + * Converts queue status from enum to string
> > + *
> > + * @param status
> > + *   Queue status as enum
> > + *
> > + * @returns
> > + *  Queue status as string or NULL if op_type is invalid
> > + *
> > + */
> > +__rte_experimental
> > +const char*
> > +rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
> > +
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> > efae50b..1c06738 100644
> > --- a/lib/bbdev/version.map
> > +++ b/lib/bbdev/version.map
> > @@ -44,6 +44,7 @@ EXPERIMENTAL {
> >   	global:
> >
> >   	rte_bbdev_device_status_str;
> > +	rte_bbdev_enqueue_status_str;
> >   	rte_bbdev_enqueue_fft_ops;
> >   	rte_bbdev_dequeue_fft_ops;
> >   	rte_bbdev_fft_op_alloc_bulk;


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-06 18:47           ` Tom Rix
@ 2022-07-06 21:04             ` Chautru, Nicolas
  2022-07-07 13:09               ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 21:04 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Extension of bbdev operation to support FFT based operations.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> > ---
> >   doc/guides/prog_guide/bbdev.rst | 130
> +++++++++++++++++++++++++++++++++++
> >   lib/bbdev/rte_bbdev.c           |  11 ++-
> >   lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
> >   lib/bbdev/rte_bbdev_op.h        | 149
> ++++++++++++++++++++++++++++++++++++++++
> >   lib/bbdev/version.map           |   4 ++
> >   5 files changed, 369 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/prog_guide/bbdev.rst
> > b/doc/guides/prog_guide/bbdev.rst index 70fa01a..4a055b5 100644
> > --- a/doc/guides/prog_guide/bbdev.rst
> > +++ b/doc/guides/prog_guide/bbdev.rst
> > @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
> >   showing the Turbo decoding of CBs using BBDEV interface in TB-mode
> >   is also valid for LDPC decode.
> >
> > +BBDEV FFT Operation
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +This operation allows to run a combination of DFT and/or IDFT and/or time-
> domain windowing.
> > +These can be used in a modular fashion (using bypass modes) or as a
> > +processing pipeline which can be used for FFT-based baseband signal
> processing.
> > +In more details it allows :
> > +- to process the data first through an IDFT of adjustable size and
> > +padding;
> > +- to perform the windowing as a programmable cyclic shift offset of
> > +the data followed by a pointwise multiplication by a time domain
> > +window;
> > +- to process the related data through a DFT of adjustable size and
> > +depadding for each such cyclic shift output.
> > +
> > +A flexible number of Rx antennas are being processed in parallel with the
> same configuration.
> > +The API allows more generally for flexibility in what the PMD may
> > +support (cabability flags) and flexibility to adjust some of the parameters of
> the processing.
> > +
> > +The operation/capability flags that can be set for each FFT operation are
> given below.
> > +
> > +  **NOTE:** The actual operation flags that may be used with a
> > + specific  BBDEV PMD are dependent on the driver capabilities as
> > + reported via  ``rte_bbdev_info_get()``, and may be a subset of those below.
> > +
> > ++--------------------------------------------------------------------+
> > +|Description of FFT capability flags                                 |
> >
> ++===============================================================
> =====
> > +++
> > +|RTE_BBDEV_FFT_WINDOWING                                             |
> > +| Set to enable/support windowing in time domain                     |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
> > +| Set to enable/support  the cyclic shift time offset adjustment     |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_DFT_BYPASS                                            |
> > +| Set to bypass the DFT and use directly the IDFT as an option       |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
> > +| Set to bypass the IDFT and use directly the DFT as an option       |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
> > +| Set to bypass the time domain windowing  as an option              |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_POWER_MEAS
> 
> Other flags are not truncated, should be
> 
> RTE_BBDEV_FFT_POWER_MEASUREMENT
> 

The intention from DPDK recommendation is for these to be kept shortnames, isn't it?
Above we use many acronyms to keep it short (CS, etc...)
Even in current BBDEV API we use many truncation to keep names short: OUT, ENC/DEC, HQ, RM on top of acronyms. 
I believe this is still super explicit with that name?

> >                                              |
> > +| Set to provide an optional power measument of the DFT output       |
> > ++--------------------------------------------------------------------+
> measurement

OK Thanks

> > +|RTE_BBDEV_FFT_FP16_INPUT                                            |
> > +| Set if the input data shall use FP16 format instead of INT16       |
> > ++--------------------------------------------------------------------+
> > +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
> > +| Set if the output data shall use FP16 format instead of INT16      |
> > ++--------------------------------------------------------------------+
> > +
> > +The structure passed for each FFT operation is given below, with the
> > +operation flags forming a bitmask in the ``op_flags`` field.
> > +
> > +.. code-block:: c
> > +
> > +    struct rte_bbdev_op_fft {
> > +        struct rte_bbdev_op_data base_input;
> > +        struct rte_bbdev_op_data base_output;
> > +        struct rte_bbdev_op_data power_meas_output;
> similar to above, meas -> measurement

See above. Would that really help? I don’t believe there can be any confusion. 

> > +        uint32_t op_flags;
> > +        uint16_t input_sequence_size;
> Could these be future proofed by increasing small int size's to uint32_t ?

It is not possible to be that big for any signal processing relevant to that operation. 

> > +        uint16_t input_leading_padding;
> > +        uint16_t output_sequence_size;
> > +        uint16_t output_leading_depadding;
> > +        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> > +        uint16_t cs_bitmap;
> > +        uint8_t num_antennas_log2;
> > +        uint8_t idft_log2;
> > +        uint8_t dft_log2;
> is _log2 needed in variable name if it is documenation ?

I believe it is a best practice when the variable name may be misleading, ie. this is not the actual dft size as a natural number (2048 for instance) but there is an implied mapping. 

> > +        int8_t cs_time_adjustment;
> > +        int8_t idft_shift;
> > +        int8_t dft_shift;
> > +        uint16_t ncs_reciprocal;
> > +        uint16_t power_shift;
> > +        uint16_t fp16_exp_adjust;
> > +    };
> > +
> > +The FFT parameters are set out in the table below.
> > +
> > ++----------------------+--------------------------------------------------------------+
> > +|Parameter             |Description                                                   |
> >
> ++======================+========================================
> =====
> > ++=================+
> > +|base_input            |input data                                                    |
> > ++----------------------+--------------------------------------------------------------+
> > +|base_output           |output data                                                   |
> > ++----------------------+--------------------------------------------------------------+
> > +|power_meas_output     |optional output data with power measurement
> on DFT output     |
> > ++----------------------+--------------------------------------------------------------+
> > +|op_flags              |bitmask of all active operation capabilities                  |
> > ++----------------------+--------------------------------------------------------------+
> > +|input_sequence_size   |size of the input sequence in 32-bits points per
> antenna      |
> > ++----------------------+--------------------------------------------------------------+
> > +|input_leading_padding |number of points padded at the start of input
> data            |
> > ++----------------------+--------------------------------------------------------------+
> > +|output_sequence_size  |size of the output sequence per antenna and
> cyclic shift      |
> > ++----------------------+--------------------------------------------------------------+
> > +|output_depadding      |number of points depadded at the start of output
> data         |
> > ++----------------------+--------------------------------------------------------------+
> output_leading_depadding

OK Thanks

> > +|window_index          |optional windowing profile index used for each cyclic
> shift   |
> > ++----------------------+--------------------------------------------------------------+
> > +|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for
> index 0) |
> > ++----------------------+--------------------------------------------------------------+
> > +|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)
> |
> > ++----------------------+--------------------------------------------------------------+
> > +|idft_log2             |iDFT size as a log2                                           |
> > ++----------------------+--------------------------------------------------------------+
> > +|dft_log2              |DFT size as a log2                                            |
> > ++----------------------+--------------------------------------------------------------+
> > +|cs_time_adjustment    |adjustment of time position of all the cyclic shift
> output    |
> > ++----------------------+--------------------------------------------------------------+
> > +|idft_shift            |shift down of signal level post iDFT                          |
> > ++----------------------+--------------------------------------------------------------+
> > +|dft_shift             |shift down of signal level post DFT                           |
> > ++----------------------+--------------------------------------------------------------+
> > +|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie.
> 231 for 12)|
> > ++----------------------+--------------------------------------------------------------+
> > +|power_shift           |shift down of level of power measurement when
> enabled         |
> > ++----------------------+--------------------------------------------------------------+
> > +|fp16_exp_adjust       |value added to FP16 exponent at conversion from
> INT16         |
> > ++----------------------+--------------------------------------------------------------+
> > +
> > +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is
> > +the incoming data for the processing. Its size may not fit into an
> > +actual mbuf, but the structure is used to pass iova address.
> > +The mbuf output ``output`` is mandatory and is output of the FFT
> processing chain.
> > +Each point is a complex number of 32bits : either as 2 INT16 or as 2
> > +FP16 based when the option supported.
> > +The data layout is based on contiguous concatenation of output data
> > +first by cyclic shift then by antenna.
> >
> >   Sample code
> >   -----------
> > diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> > 555bda9..28b105d 100644
> > --- a/lib/bbdev/rte_bbdev.c
> > +++ b/lib/bbdev/rte_bbdev.c
> > @@ -24,7 +24,7 @@
> >   #define DEV_NAME "BBDEV"
> >
> >   /* Number of supported operation types */ -#define
> > BBDEV_OP_TYPE_COUNT 5
> > +#define BBDEV_OP_TYPE_COUNT 6
> >   /* Number of supported device status */
> >   #define BBDEV_DEV_STATUS_COUNT 9
> >
> > @@ -854,6 +854,9 @@ struct rte_bbdev *
> >   	case RTE_BBDEV_OP_LDPC_ENC:
> >   		result = sizeof(struct rte_bbdev_enc_op);
> >   		break;
> > +	case RTE_BBDEV_OP_FFT:
> > +		result = sizeof(struct rte_bbdev_fft_op);
> > +		break;
> >   	default:
> >   		break;
> >   	}
> > @@ -877,6 +880,10 @@ struct rte_bbdev *
> >   		struct rte_bbdev_enc_op *op = element;
> >   		memset(op, 0, mempool->elt_size);
> >   		op->mempool = mempool;
> > +	} else if (type == RTE_BBDEV_OP_FFT) {
> > +		struct rte_bbdev_fft_op *op = element;
> > +		memset(op, 0, mempool->elt_size);
> > +		op->mempool = mempool;
> >   	}
> >   }
> >
> > @@ -1126,6 +1133,8 @@ struct rte_mempool *
> >   		"RTE_BBDEV_OP_TURBO_DEC",
> >   		"RTE_BBDEV_OP_TURBO_ENC",
> >   		"RTE_BBDEV_OP_LDPC_DEC",
> > +		"RTE_BBDEV_OP_LDPC_ENC",
> Why ldpc_enc line, this is already in codebase ?
> > +		"RTE_BBDEV_OP_FFT",

Thanks, there this is a rebase issue in previous commit


> >   	};
> >
> >   	if (op_type < BBDEV_OP_TYPE_COUNT)
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > ac941d6..ed528b8 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -401,6 +401,12 @@ typedef uint16_t
> (*rte_bbdev_enqueue_dec_ops_t)(
> >   		struct rte_bbdev_dec_op **ops,
> >   		uint16_t num);
> >
> > +/** @internal Enqueue fft operations for processing on queue of a
> > +device. */ typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
> > +		struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_fft_op **ops,
> > +		uint16_t num);
> > +
> >   /** @internal Dequeue encode operations from a queue of a device. */
> >   typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
> >   		struct rte_bbdev_queue_data *q_data, @@ -411,6 +417,11
> @@ typedef
> > uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
> >   		struct rte_bbdev_queue_data *q_data,
> >   		struct rte_bbdev_dec_op **ops, uint16_t num);
> >
> > +/** @internal Dequeue fft operations from a queue of a device. */
> > +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
> > +		struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_fft_op **ops, uint16_t num);
> > +
> >   #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name
> > */
> >
> >   /**
> > @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
> >   	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
> >   	/** Dequeue decode function */
> >   	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
> > +	/** Enqueue FFT function */
> > +	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
> > +	/** Dequeue FFT function */
> > +	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
> >   	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by
> PMD */
> >   	struct rte_bbdev_data *data;  /**< Pointer to device data */
> >   	enum rte_bbdev_state state;  /**< If device is currently used or
> > not */ @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
> >   	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
> >   }
> >
> > +/**
> > + * Enqueue a burst of fft operations to a queue of the device.
> > + * This functions only enqueues as many operations as currently
> > +possible and
> > + * does not block until @p num_ops entries in the queue are available.
> > + * This function does not provide any error notification to avoid the
> > + * corresponding overhead.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param queue_id
> > + *   The index of the queue.
> > + * @param ops
> > + *   Pointer array containing operations to be enqueued Must have at least
> > + *   @p num_ops entries
> > + * @param num_ops
> > + *   The maximum number of operations to enqueue.
> > + *
> > + * @return
> > + *   The number of operations actually enqueued (this is the number of
> processed
> > + *   entries in the @p ops array).
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> > +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> > +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
> Who checks the input is valid ?
> > +	struct rte_bbdev_queue_data *q_data = &dev->data-
> >queues[queue_id];
> > +	return dev->enqueue_fft_ops(q_data, ops, num_ops); }
> >
> >   /**
> >    * Dequeue a burst of processed encode operations from a queue of the
> device.
> > @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
> >   	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
> >   }
> >
> > +/**
> > + * Dequeue a burst of fft operations from a queue of the device.
> > + * This functions returns only the current contents of the queue, and
> > +does not
> > + * block until @ num_ops is available.
> > + * This function does not provide any error notification to avoid the
> > + * corresponding overhead.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param queue_id
> > + *   The index of the queue.
> > + * @param ops
> > + *   Pointer array where operations will be dequeued to. Must have at least
> > + *   @p num_ops entries
> > + * @param num_ops
> > + *   The maximum number of operations to dequeue.
> > + *
> > + * @return
> > + *   The number of operations actually dequeued (this is the number of
> entries
> > + *   copied into the @p ops array).
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> > +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> > +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
> > +	struct rte_bbdev_queue_data *q_data = &dev->data-
> >queues[queue_id];
> > +	return dev->dequeue_fft_ops(q_data, ops, num_ops); }
> > +
> >   /** Definitions of device event types */
> >   enum rte_bbdev_event_type {
> >   	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */ diff --git
> > a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index
> > cd82418..3e46f1d 100644
> > --- a/lib/bbdev/rte_bbdev_op.h
> > +++ b/lib/bbdev/rte_bbdev_op.h
> > @@ -47,6 +47,8 @@
> >   #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
> >   /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
> >   #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
> > +/* 12 CS maximum */
> > +#define RTE_BBDEV_MAX_CS_2 (6)
> >
> >   /** Flags for turbo decoder operation and capability structure */
> >   enum rte_bbdev_op_td_flag_bitmasks { @@ -211,6 +213,26 @@ enum
> > rte_bbdev_op_ldpcenc_flag_bitmasks {
> >   	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
> >   };
> >
> > +/** Flags for DFT operation and capability structure */ enum
> > +rte_bbdev_op_fft_flag_bitmasks {
> > +	/** Flexible windowing capability */
> > +	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
> > +	/** Flexible adjustment of Cyclic Shift time offset */
> > +	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
> > +	/** Set for bypass the DFT and get directly into iDFT input */
> > +	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
> > +	/** Set for bypass the IDFT and get directly the DFT output */
> > +	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
> > +	/** Set for bypass time domain windowing */
> > +	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
> > +	/** Set for optional power measurement on DFT output */
> > +	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
> Meas here too, change generally
> > +	/** Set if the input data used FP16 format */
> > +	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
> 
> What are the other data type(s) ?
> 
> The default is not mentioned, or i missed it.
> 
> > +	/**  Set if the output data uses FP16 format  */
> > +	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7) };
> > +
> >   /** Flags for the Code Block/Transport block mode  */
> >   enum rte_bbdev_op_cb_mode {
> >   	/** One operation is one or fraction of one transport block  */ @@
> > -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
> >   	};
> >   };
> >
> > +/** Operation structure for FFT processing.
> > + *
> > + * The operation processes the data for multiple antennas in a single
> > +call
> > + * (.i.e for all the REs belonging to a given SRS sequence for
> > +instance)
> > + *
> > + * The output mbuf data structure is expected to be allocated by the
> > + * application with enough room for the output data.
> > + */
> > +struct rte_bbdev_op_fft {
> > +	/** Input data starting from first antenna */
> > +	struct rte_bbdev_op_data base_input;
> > +	/** Output data starting from first antenna and first cyclic shift */
> > +	struct rte_bbdev_op_data base_output;
> > +	/** Optional power measurement output data */
> > +	struct rte_bbdev_op_data power_meas_output;
> > +	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
> > +	uint32_t op_flags;
> > +	/** Input sequence size in 32-bits points */
> > +	uint16_t input_sequence_size;
> size is bytes*4 ? how does this work with fp16 ?
> > +	/** Padding at the start of the sequence */
> > +	uint16_t input_leading_padding;
> > +	/** Output sequence size in 32-bits points */
> > +	uint16_t output_sequence_size;
> > +	/** Depadding at the start of the DFT output */
> > +	uint16_t output_leading_depadding;
> > +	/** Window index being used for each cyclic shift output */
> > +	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> > +	/** Bitmap of the cyclic shift output requested */
> > +	uint16_t cs_bitmap;
> > +	/** Number of antennas as a log2 – 8 to 128 */
> > +	uint8_t num_antennas_log2;
> > +	/** iDFT size as a log2 - 32 to 2048 */
> > +	uint8_t idft_log2;
> > +	/** DFT size as a log2 - 8 to 2048 */
> > +	uint8_t dft_log2;
> > +	/** Adjustment of position of the cyclic shifts - -31 to 31 */
> > +	int8_t cs_time_adjustment;
> > +	/** iDFT shift down */
> > +	int8_t idft_shift;
> > +	/** DFT shift down */
> > +	int8_t dft_shift;
> > +	/** NCS reciprocal factor  */
> > +	uint16_t ncs_reciprocal;
> > +	/** power measurement out shift down */
> > +	uint16_t power_shift;
> > +	/** Adjust the FP6 exponent for INT<->FP16 conversion */
> > +	uint16_t fp16_exp_adjust;
> > +};
> > +
> >   /** List of the capabilities for the Turbo Decoder */
> >   struct rte_bbdev_op_cap_turbo_dec {
> >   	/** Flags from rte_bbdev_op_td_flag_bitmasks */ @@ -741,6 +812,16
> > @@ struct rte_bbdev_op_cap_ldpc_enc {
> >   	uint16_t num_buffers_dst;
> >   };
> >
> > +/** List of the capabilities for the FFT */ struct
> > +rte_bbdev_op_cap_fft {
> > +	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
> you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?
> > +	uint32_t capability_flags;
> > +	/** Num input code block buffers */
> > +	uint16_t num_buffers_src;
> > +	/** Num output code block buffers */
> > +	uint16_t num_buffers_dst;
> > +};
> > +
> >   /** Different operation types supported by the device */
> >   enum rte_bbdev_op_type {
> >   	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
> @@
> > -748,6 +829,7 @@ enum rte_bbdev_op_type {
> >   	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
> >   	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
> >   	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
> > +	RTE_BBDEV_OP_FFT,  /**< FFT */
> >   	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type
> number including padding */
> >   };
> >
> > @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
> >   	};
> >   };
> >
> > +/** Structure specifying a single fft operation */ struct
> > +rte_bbdev_fft_op {
> > +	/** Status of operation that was performed */
> > +	int status;
> > +	/** Mempool which op instance is in */
> > +	struct rte_mempool *mempool;
> > +	/** Opaque pointer for user data */
> > +	void *opaque_data;
> > +	/** Contains turbo decoder specific parameters */
> > +	struct rte_bbdev_op_fft fft;
> > +};
> > +
> >   /** Operation capabilities supported by a device */
> >   struct rte_bbdev_op_cap {
> >   	enum rte_bbdev_op_type type;  /**< Type of operation */ @@ -799,6
> > +893,7 @@ struct rte_bbdev_op_cap {
> >   		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
> >   		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
> >   		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
> > +		struct rte_bbdev_op_cap_fft fft;
> >   	} cap;  /**< Operation-type specific capabilities */
> >   };
> >
> > @@ -918,6 +1013,42 @@ struct rte_mempool *
> >   }
> >
> >   /**
> > + * Bulk allocate fft operations from a mempool with parameter defaults
> reset.
> > + *
> > + * @param mempool
> > + *   Operation mempool, created by rte_bbdev_op_pool_create().
> > + * @param ops
> > + *   Output array to place allocated operations
> > + * @param num_ops
> > + *   Number of operations to allocate
> > + *
> > + * @returns
> > + *   - 0 on success
> > + *   - EINVAL if invalid mempool is provided
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
> > +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> > +	struct rte_bbdev_op_pool_private *priv;
> > +	int ret;
> > +
> > +	/* Check type */
> > +	priv = (struct rte_bbdev_op_pool_private *)
> > +			rte_mempool_get_priv(mempool);
> > +	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
> > +		return -EINVAL;
> > +
> > +	/* Get elements */
> > +	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> > +	if (unlikely(ret < 0))
> > +		return ret;
> 
> if-check is not needed, just
> 
> return ret;
> 
> and drop the next line
> 
> Tom
> 
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> >    * Free decode operation structures that were allocated by
> >    * rte_bbdev_dec_op_alloc_bulk().
> >    * All structures must belong to the same mempool.
> > @@ -951,6 +1082,24 @@ struct rte_mempool *
> >   		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
> num_ops);
> >   }
> >
> > +/**
> > + * Free encode operation structures that were allocated by
> > + * rte_bbdev_fft_op_alloc_bulk().
> > + * All structures must belong to the same mempool.
> > + *
> > + * @param ops
> > + *   Operation structures
> > + * @param num_ops
> > + *   Number of structures
> > + */
> > +__rte_experimental
> > +static inline void
> > +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned
> > +int num_ops) {
> > +	if (num_ops > 0)
> > +		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
> num_ops); }
> > +
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> > 9ac3643..efae50b 100644
> > --- a/lib/bbdev/version.map
> > +++ b/lib/bbdev/version.map
> > @@ -44,4 +44,8 @@ EXPERIMENTAL {
> >   	global:
> >
> >   	rte_bbdev_device_status_str;
> > +	rte_bbdev_enqueue_fft_ops;
> > +	rte_bbdev_dequeue_fft_ops;
> > +	rte_bbdev_fft_op_alloc_bulk;
> > +	rte_bbdev_fft_op_free_bulk;
> >   };


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-06 16:15           ` Tom Rix
@ 2022-07-06 21:10             ` Chautru, Nicolas
  2022-07-07 13:20               ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 21:10 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Wednesday, July 6, 2022 9:15 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue
> per operation
> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Add support in existing bbdev PMDs for the explicit number of queue
> > and priority for each operation type configured on the device.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++------
> ---
> >   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
> >   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
> >   drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
> >   drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
> >   5 files changed, 51 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 17ba798..d568d0d 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -966,6 +966,7 @@
> >   		struct rte_bbdev_driver_info *dev_info)
> >   {
> >   	struct acc100_device *d = dev->data->dev_private;
> > +	int i;
> >
> >   	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >   		{
> > @@ -1062,19 +1063,23 @@
> >   	fetch_acc100_config(dev);
> >   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> > -	/* This isn't ideal because it reports the maximum number of queues
> but
> > -	 * does not provide info on how many can be uplink/downlink or
> different
> > -	 * priorities
> > -	 */
> > -	dev_info->max_num_queues =
> > -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> > -			d->acc100_conf.q_dl_5g.num_qgroups +
> > -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> > -			d->acc100_conf.q_ul_5g.num_qgroups +
> > -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> > -			d->acc100_conf.q_dl_4g.num_qgroups +
> > -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> > +	/* Expose number of queues */
> > +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
> > +d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> >   			d->acc100_conf.q_ul_4g.num_qgroups;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d-
> >acc100_conf.q_dl_4g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_4g.num_qgroups;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d-
> >acc100_conf.q_ul_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_ul_5g.num_qgroups;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d-
> >acc100_conf.q_dl_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_5g.num_qgroups;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d-
> >acc100_conf.q_ul_4g.num_qgroups;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d-
> >acc100_conf.q_dl_4g.num_qgroups;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d-
> >acc100_conf.q_ul_5g.num_qgroups;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d-
> >acc100_conf.q_dl_5g.num_qgroups;
> > +	dev_info->max_num_queues = 0;
> > +	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC;
> i++)
> 
> should this be i <=  ?
>

Thanks

> 
> > +		dev_info->max_num_queues += dev_info->num_queues[i];
> >   	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> >   	dev_info->hardware_accelerated = true;
> >   	dev_info->max_dl_queue_priority =
> > diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > index 57b12af..b4982af 100644
> > --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > @@ -379,6 +379,14 @@
> >   		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >   			dev_info->max_num_queues++;
> >   	}
> > +	/* Expose number of queue per operation type */
> > +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info-
> >max_num_queues / 2;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info-
> >max_num_queues / 2;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >   }
> >
> >   /**
> > diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > index 2a330c4..dc7f479 100644
> > --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
> >   		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >   			dev_info->max_num_queues++;
> >   	}
> > +	/* Expose number of queue per operation type */
> > +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info-
> >max_num_queues / 2;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info-
> >max_num_queues / 2;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
> >   }
> >
> >   /**
> > diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> > b/drivers/baseband/la12xx/bbdev_la12xx.c
> > index c1f88c6..e99ea9a 100644
> > --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> > +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> > @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
> >   	dev_info->min_alignment = 64;
> >   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> > +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
> LA12XX_MAX_QUEUES / 2;
> > +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
> LA12XX_MAX_QUEUES / 2;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> > +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >   }
> >
> > diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > index dbc5524..647e706 100644
> > --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > @@ -256,6 +256,17 @@ struct turbo_sw_queue {
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >   	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> > +	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
> 
> Should this be done through dev instead of assigning directly ?

I am not sure I follow your suggestion. Do you mind clarifying?

> 
> Tom
> 
> > +	int num_op_type = 0;
> > +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> > +		num_op_type++;
> > +	op_cap = bbdev_capabilities;
> > +	if (num_op_type > 0) {
> > +		int num_queue_per_type = dev_info->max_num_queues /
> num_op_type;
> > +		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> > +			dev_info->num_queues[op_cap->type] =
> num_queue_per_type;
> > +	}
> > +
> >   	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >dev_id);
> >   }
> >


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-06 16:06           ` Tom Rix
@ 2022-07-06 21:12             ` Chautru, Nicolas
  2022-07-07 13:34               ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 21:12 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Adding more options in the API to expose the number of queues exposed
> > and related priority.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   lib/bbdev/rte_bbdev.h | 4 ++++
> >   1 file changed, 4 insertions(+)
> >
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > 9b1ffa4..ac941d6 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
> >
> >   	/** Maximum number of queues supported by the device */
> >   	unsigned int max_num_queues;
> > +	/** Maximum number of queues supported per operation type */
> > +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> > +	/** Priority level supported per operation type */
> > +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> 
> It is better to add new elements to the end of a structure for better backward
> compatibility

All that serie is not ABI compatible (sizes change etc...). I don’t believe there is such a recommendation, is there?

> 
> Tom
> 
> >   	/** Queue size limit (queue size must also be power of 2) */
> >   	uint32_t queue_size_lim;
> >   	/** Set if device off-loads operation to hardware  */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 2/7] bbdev: add device status info
  2022-07-06 15:38           ` Tom Rix
@ 2022-07-06 21:16             ` Chautru, Nicolas
  2022-07-07 13:37               ` Tom Rix
  2022-08-25 14:08               ` Maxime Coquelin
  0 siblings, 2 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 21:16 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Added device status information, so that the PMD can expose
> > information related to the underlying accelerator device status.
> > Minor order change in structure to fit into padding hole.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
> >   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
> >   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
> >   drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
> >   drivers/baseband/null/bbdev_null.c                 |  1 +
> >   drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
> >   lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
> >   lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
> >   lib/bbdev/version.map                              |  6 ++++
> >   9 files changed, 69 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index de7e4bc..17ba798 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -1060,6 +1060,7 @@
> >
> >   	/* Read and save the populated config from ACC100 registers */
> >   	fetch_acc100_config(dev);
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* This isn't ideal because it reports the maximum number of queues
> but
> >   	 * does not provide info on how many can be uplink/downlink or
> > different diff --git
> > a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > index 82ae6ba..57b12af 100644
> > --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > @@ -369,6 +369,7 @@
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* Calculates number of queues assigned to device */
> >   	dev_info->max_num_queues = 0;
> > diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > index 21d3529..2a330c4 100644
> > --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* Calculates number of queues assigned to device */
> >   	dev_info->max_num_queues = 0;
> > diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> > b/drivers/baseband/la12xx/bbdev_la12xx.c
> > index 4d1bd16..c1f88c6 100644
> > --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> > +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> > @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->min_alignment = 64;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >   }
> > diff --git a/drivers/baseband/null/bbdev_null.c
> > b/drivers/baseband/null/bbdev_null.c
> > index 248e129..94a1976 100644
> > --- a/drivers/baseband/null/bbdev_null.c
> > +++ b/drivers/baseband/null/bbdev_null.c
> > @@ -82,6 +82,7 @@ struct bbdev_queue {
> >   	 * here for code completeness.
> >   	 */
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >   }
> > diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > index af7bc41..dbc5524 100644
> > --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > @@ -254,6 +254,7 @@ struct turbo_sw_queue {
> >   	dev_info->min_alignment = 64;
> >   	dev_info->harq_buffer_size = 0;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >dev_id);
> >   }
> > diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> > 22bd894..555bda9 100644
> > --- a/lib/bbdev/rte_bbdev.c
> > +++ b/lib/bbdev/rte_bbdev.c
> > @@ -25,6 +25,8 @@
> >
> >   /* Number of supported operation types */
> >   #define BBDEV_OP_TYPE_COUNT 5
> > +/* Number of supported device status */ #define
> > +BBDEV_DEV_STATUS_COUNT 9
> >
> >   /* BBDev library logging ID */
> >   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -1132,3
> +1134,25
> > @@ struct rte_mempool *
> >   	rte_bbdev_log(ERR, "Invalid operation type");
> >   	return NULL;
> >   }
> > +
> > +const char *
> > +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
> > +	static const char * const dev_sta_string[] = {
> > +		"RTE_BBDEV_DEV_NOSTATUS",
> > +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> > +		"RTE_BBDEV_DEV_RESET",
> > +		"RTE_BBDEV_DEV_CONFIGURED",
> > +		"RTE_BBDEV_DEV_ACTIVE",
> > +		"RTE_BBDEV_DEV_FATAL_ERR",
> > +		"RTE_BBDEV_DEV_RESTART_REQ",
> > +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> > +		"RTE_BBDEV_DEV_CORRECT_ERR",
> > +	};
> > +
> > +	if (status < BBDEV_DEV_STATUS_COUNT)
> > +		return dev_sta_string[status];
> > +
> > +	rte_bbdev_log(ERR, "Invalid device status");
> > +	return NULL;
> > +}
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > b88c881..9b1ffa4 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
> >   int
> >   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >
> > +/**
> > + * Flags indicate the status of the device  */ enum
> > +rte_bbdev_device_status {
> > +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> > +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
> supported on the PMD */
> If this was 0, you may not need to explicitly set.

This helps to have the lack of status being equivalent to a cleared register.

> > +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured
> state */
> > +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
> ready to use */
> > +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
> being used */
> > +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> uncorrectable error */
> > +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to
> restart */
> > +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application
> to reconfigure queues */
> > +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
> error event happened */
> Last patch was padded, do something consistent here.

We only pad if we have to. Here there is no array whose size would be dimensioned by the size of that enum.

> > +};
> > +
> >   /** Device statistics. */
> >   struct rte_bbdev_stats {
> >   	uint64_t enqueued_count;  /**< Count of all operations enqueued */
> > @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
> >   	/** Set if device supports per-queue interrupts */
> >   	bool queue_intr_supported;
> >   	/** Minimum alignment of buffers, in bytes */
> > -	uint16_t min_alignment;
> > -	/** HARQ memory available in kB */
> > +	/** Device Status */
> > +	enum rte_bbdev_device_status device_status;
> 
> New elements should be added to the end to improve backward compatibility.

Same comment in different patch. I would like to know if there is a real recommendation from DPDK on this. I have heard opposite view as well.
In that very case we are breaking the ABI in that new serie for 22.11 (sizes and offsets are changing). 

> 
> Tom
> 
> >   	uint32_t harq_buffer_size;
> >   	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
> supported
> >   	 *  for input/output data
> >   	 */
> > +	uint16_t min_alignment;
> > +	/** HARQ memory available in kB */
> >   	uint8_t data_endianness;
> >   	/** Default queue configuration used if none is supplied  */
> >   	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
> +844,20
> > @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
> >   rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int
> op,
> >   		void *data);
> >
> > +/**
> > + * Converts device status from enum to string
> > + *
> > + * @param status
> > + *   Device status as enum
> > + *
> > + * @returns
> > + *   Operation type as string or NULL if op_type is invalid
> > + *
> > + */
> > +__rte_experimental
> > +const char*
> > +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> > +
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> > cce3f3c..9ac3643 100644
> > --- a/lib/bbdev/version.map
> > +++ b/lib/bbdev/version.map
> > @@ -39,3 +39,9 @@ DPDK_22 {
> >
> >   	local: *;
> >   };
> > +
> > +EXPERIMENTAL {
> > +	global:
> > +
> > +	rte_bbdev_device_status_str;
> > +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 1/7] bbdev: allow operation type enum for growth
  2022-07-06 12:50           ` Tom Rix
@ 2022-07-06 21:20             ` Chautru, Nicolas
  0 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-06 21:20 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Subject: Re: [PATCH v4 1/7] bbdev: allow operation type enum for growth
> 
> 
> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> > Updating the enum for rte_bbdev_op_type to allow to keep ABI
> > compatible for enum insertion while adding padded maximum value for
> > array need.
> > Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
> > RTE_BBDEV_OP_TYPE_PADDED_MAX.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   app/test-bbdev/test_bbdev.c      | 2 +-
> >   app/test-bbdev/test_bbdev_perf.c | 4 ++--
> >   examples/bbdev_app/main.c        | 2 +-
> >   lib/bbdev/rte_bbdev.c            | 9 +++++----
> >   lib/bbdev/rte_bbdev_op.h         | 2 +-
> >   5 files changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
> > index ac06d73..1063f6e 100644
> > --- a/app/test-bbdev/test_bbdev.c
> > +++ b/app/test-bbdev/test_bbdev.c
> > @@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
> >   	rte_mempool_free(mp);
> >
> >   	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
> > -			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) ==
> NULL,
> > +			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size,
> 0)) == NULL,
> >   			"Failed test for rte_bbdev_op_pool_create: "
> >   			"returned value is not NULL for invalid type");
> >
> > diff --git a/app/test-bbdev/test_bbdev_perf.c
> > b/app/test-bbdev/test_bbdev_perf.c
> > index fad3b1e..1abda2d 100644
> > --- a/app/test-bbdev/test_bbdev_perf.c
> > +++ b/app/test-bbdev/test_bbdev_perf.c
> > @@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct
> > active_device *ad,
> >
> >   	/* Find capabilities */
> >   	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
> > -	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
> > +	do {
> >   		if (cap->type == test_vector.op_type) {
> >   			capabilities = cap;
> >   			break;
> >   		}
> >   		cap++;
> > -	}
> > +	} while (cap->type != RTE_BBDEV_OP_NONE);
> >   	TEST_ASSERT_NOT_NULL(capabilities,
> >   			"Couldn't find capabilities");
> >
> > diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
> > index fc7e8b8..ef0ba76 100644
> > --- a/examples/bbdev_app/main.c
> > +++ b/examples/bbdev_app/main.c
> > @@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
> >   	void *sigret;
> >   	struct app_config_params app_params = def_app_config;
> >   	struct rte_mempool *ethdev_mbuf_mempool,
> *bbdev_mbuf_mempool;
> > -	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
> > +	struct rte_mempool
> *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> >   	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
> >   	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
> >   	struct stats_lcore_params stats_lcore; diff --git
> > a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index aaee7b7..22bd894
> > 100644
> > --- a/lib/bbdev/rte_bbdev.c
> > +++ b/lib/bbdev/rte_bbdev.c
> > @@ -23,6 +23,8 @@
> >
> >   #define DEV_NAME "BBDEV"
> >
> > +/* Number of supported operation types */ #define
> BBDEV_OP_TYPE_COUNT
> > +5
> >
> >   /* BBDev library logging ID */
> >   RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -890,10
> +892,10
> > @@ struct rte_mempool *
> >   		return NULL;
> >   	}
> >
> > -	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
> > +	if (type >= BBDEV_OP_TYPE_COUNT) {
> >   		rte_bbdev_log(ERR,
> >   				"Invalid op type (%u), should be less than %u",
> > -				type, RTE_BBDEV_OP_TYPE_COUNT);
> > +				type, BBDEV_OP_TYPE_COUNT);
> >   		return NULL;
> >   	}
> >
> > @@ -1122,10 +1124,9 @@ struct rte_mempool *
> >   		"RTE_BBDEV_OP_TURBO_DEC",
> >   		"RTE_BBDEV_OP_TURBO_ENC",
> >   		"RTE_BBDEV_OP_LDPC_DEC",
> > -		"RTE_BBDEV_OP_LDPC_ENC",
> >   	};
> >
> > -	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
> > +	if (op_type < BBDEV_OP_TYPE_COUNT)
> >   		return op_types[op_type];
> >
> >   	rte_bbdev_log(ERR, "Invalid operation type"); diff --git
> > a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index
> > 6d56133..cd82418 100644
> > --- a/lib/bbdev/rte_bbdev_op.h
> > +++ b/lib/bbdev/rte_bbdev_op.h
> > @@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
> >   	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
> >   	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
> >   	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
> > -	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
> 
> Why not keep this enum so you don't have to make the
> BBDEV_OP_TYPE_COUNT #define ?

We are announcing that we are deprecating that enum. We want to make sure this is not being used by application, only the PADDED one should be used moving forward
But I think I will use your suggestion in other commit not to use #define for this and instead just check for array size.

Thanks
Nic

> 
> Tom
> 
> > +	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type
> number
> > +including padding */
> >   };
> >
> >   /** Bit indexes of possible errors reported through status field */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 0/7]  bbdev changes for 22.11
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
  2022-06-28  1:35       ` [PATCH v3 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-07-06  0:23       ` [PATCH v4 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-07-06 23:28       ` Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
                           ` (7 more replies)
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
                         ` (6 subsequent siblings)
  9 siblings, 8 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

v5: update base on review from Tom Rix. Number of typos reported and resolved,
removed the commit related to rw_lock for now, added a commit for
code clean up from review, resolved one rebase issue between 2 commits, used size of array for some bound check implementation. Thanks. 
v4: update to the last 2 commits to include function to print the queue status and a fix to the rte_lock within the wrong structure
v3: update to device status info to also use padded size for the related array.
Adding also 2 additionals commits to allow the API struc to expose more information related to queues corner cases/warning as well as an optional rw lock.
Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack early is possible to get this applied earlier and due to time off this summer.
Thanks
Nic

-- 

Hi,

Agregating together in a single serie a number of bbdev api changes previously submitted over the last few months and all targeted for 22.11 (4 different series detailed below). Related deprecation notice being pushed in 22.07 in parallel. 
* bbdev: add device status info
* bbdev: add new operation for FFT processing
* bbdev: add device info on queue topology
* bbdev: allow operation type enum for growth

v2: Update to the RTE_BBDEV_COUNT removal based on feedback from Thomas/Stephen : rejecting out of range op type and adjusting the new name for the padded maximum value used for fixed size arrays. 

---

Previous cover letters agregated below:

* bbdev: add device status info
https://patches.dpdk.org/project/dpdk/list/?series=23367

The updated structure will allow PMDs to expose through info_get what be may the status of the underlying accelerator, notably in case an HW error event having happened.

* bbdev: add new operation for FFT processing
https://patches.dpdk.org/project/dpdk/list/?series=22111

This contribution adds a new operation type to the existing ones already supported by the bbdev PMDs.
This set of operation is FFT-based processing for 5GNR baseband processing acceleration. This operates in the same lookaside fashion as other existing bbdev operation with a dedicated set of capabilities and parameters (marked as experimental).

I plan to also include a new PMD supporting this operation (and most of the related capabilities) in the next couple of months (either in 22.06 or 22.09) as well as extending the related bbdev-test.

* bbdev: add device info on queue topology
https://patches.dpdk.org/project/dpdk/list/?series=22076

Addressing an historical concern that the device info struct only imperfectly captured what queues are available on the device (number of operation and priority). This ended up being an iterative process for application to find each queue could be configured.

ie. the gap was captured as technical debt previously  in comments
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
 * priorities
 */

This is now being exposed explictly based on the what the device actually supports using the existing info_get api

* bbdev: allow operation type enum for growth
https://patches.dpdk.org/project/dpdk/list/?series=23509

This is related to the general intent to remove using MAX value for enums. There is consensus that we should avoid this for a while notably for future-proofed ABI concerns https://patches.dpdk.org/project/dpdk/patch/20200130142003.2645765-1-ferruh.yigit@intel.com/.
But still there is arguably not yet an explicit best recommendation to handle this especially when we actualy need to expose array whose index is such an enum.
As a specific example here I am refering to RTE_BBDEV_OP_TYPE_COUNT in enum rte_bbdev_op_type which is being extended for new operation type being support in bbdev (such as https://patches.dpdk.org/project/dpdk/patch/1646956157-245769-2-git-send-email-nicolas.chautru@intel.com/ adding new FFT operation)

There is also the intent to be able to expose information for each operation type through the bbdev api such as dynamically configured queues information per such operation type https://patches.dpdk.org/project/dpdk/patch/1646785355-168133-2-git-send-email-nicolas.chautru@intel.com/

Basically we are considering best way to accomodate for this, notably based on discussions with Ray Kinsella and Bruce Richardson, to handle such a case moving forward: specifically for the example with RTE_BBDEV_OP_TYPE_COUNT and also more generally.

One possible option is captured in that patchset and is basically based on the simple principle to allow for growth and prevent ABI breakage. Ie. the last value of the enum is set with a higher value than required so that to allow insertion of new enum outside of the major ABI versions.
In that case the RTE_BBDEV_OP_TYPE_COUNT is still present and can be exposed and used while still allowing for addition thanks to the implicit padding-like room. As an alternate variant, instead of using that last enum value, that extended size could be exposed as an #define outside of the enum but would be fundamentally the same (public).

Another option would be to avoid array alltogether and use each time this a new dedicated API function (operation type enum being an input argument instead of an index to an array in an existing structure so that to get access to structure related to a given operation type enum) but that is arguably not well scalable within DPDK to use such a scheme for each enums and keep an uncluttered and clean API. In that very example that would be very odd indeed not to get this simply from info_get().

Some pros and cons, arguably the simple option in that patchset is a valid compromise option and a step in the right direction but we would like to know your view wrt best recommendation, or any other thought. 



Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: remove unnecessary if-check

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   6 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 +++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  57 +++++++-
 lib/bbdev/rte_bbdev.h                              | 149 +++++++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 156 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  11 ++
 14 files changed, 555 insertions(+), 29 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 1/7] bbdev: allow operation type enum for growth
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-08-25 13:54           ` Maxime Coquelin
  2022-07-06 23:28         ` [PATCH v5 2/7] bbdev: add device status info Nicolas Chautru
                           ` (6 subsequent siblings)
  7 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 8 +++++---
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..4da8047 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1125,7 +1127,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 2/7] bbdev: add device status info
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-08-25 14:18           ` Maxime Coquelin
  2022-07-06 23:28         ` [PATCH v5 3/7] bbdev: add device info on queue topology Nicolas Chautru
                           ` (5 subsequent siblings)
  7 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  6 ++++
 9 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..17ba798 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..57b12af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..2a330c4 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..c1f88c6 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..94a1976 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..dbc5524 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 4da8047..38630a2 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1133,3 +1133,25 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_NOT_SUPPORTED",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < sizeof(dev_sta_string) / sizeof(char *))
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..9b1ffa4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
+	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..9ac3643 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,9 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 3/7] bbdev: add device info on queue topology
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-08-25 15:23           ` Maxime Coquelin
  2022-07-06 23:28         ` [PATCH v5 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
                           ` (4 subsequent siblings)
  7 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9b1ffa4..ac941d6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (2 preceding siblings ...)
  2022-07-06 23:28         ` [PATCH v5 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 17ba798..f967e3f 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i <= RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 57b12af..b4982af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 2a330c4..dc7f479 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index c1f88c6..e99ea9a 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index dbc5524..647e706 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 5/7] bbdev: add new operation for FFT processing
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (3 preceding siblings ...)
  2022-07-06 23:28         ` [PATCH v5 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 6/7] bbdev: add queue related warning and status information Nicolas Chautru
                           ` (2 subsequent siblings)
  7 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=a, Size: 22110 bytes --]

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  10 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 368 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..150161b 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and depadding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (cabability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measurement of the DFT output     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++-------------------------+--------------------------------------------------------------+
+|Parameter                |Description                                                   |
++=========================+==============================================================+
+|base_input               |input data                                                    |
++-------------------------+--------------------------------------------------------------+
+|base_output              |output data                                                   |
++-------------------------+--------------------------------------------------------------+
+|power_meas_output        |optional output data with power measurement on DFT output     |
++-------------------------+--------------------------------------------------------------+
+|op_flags                 |bitmask of all active operation capabilities                  |
++-------------------------+--------------------------------------------------------------+
+|input_sequence_size      |size of the input sequence in 32-bits points per antenna      |
++-------------------------+--------------------------------------------------------------+
+|input_leading_padding    |number of points padded at the start of input data            |
++-------------------------+--------------------------------------------------------------+
+|output_sequence_size     |size of the output sequence per antenna and cyclic shift      |
++-------------------------+--------------------------------------------------------------+
+|output_leading_depadding |number of points depadded at the start of output data         |
++-------------------------+--------------------------------------------------------------+
+|window_index             |optional windowing profile index used for each cyclic shift   |
++-------------------------+--------------------------------------------------------------+
+|cs_bitmap                |bitmap of the cyclic shift output requested (LSB for index 0) |
++-------------------------+--------------------------------------------------------------+
+|num_antennas_log2        |number of antennas as a log2 (10 maps to 1024...)             |
++-------------------------+--------------------------------------------------------------+
+|idft_log2                |iDFT size as a log2                                           |
++-------------------------+--------------------------------------------------------------+
+|dft_log2                 |DFT size as a log2                                            |
++-------------------------+--------------------------------------------------------------+
+|cs_time_adjustment       |adjustment of time position of all the cyclic shift output    |
++-------------------------+--------------------------------------------------------------+
+|idft_shift               |shift down of signal level post iDFT                          |
++-------------------------+--------------------------------------------------------------+
+|dft_shift                |shift down of signal level post DFT                           |
++-------------------------+--------------------------------------------------------------+
+|ncs_reciprocal           |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++-------------------------+--------------------------------------------------------------+
+|power_shift              |shift down of level of power measurement when enabled         |
++-------------------------+--------------------------------------------------------------+
+|fp16_exp_adjust          |value added to FP16 exponent at conversion from INT16         |
++-------------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 38630a2..9d65ba8 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -852,6 +852,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -875,6 +878,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1125,6 +1132,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
 		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ac941d6..ed528b8 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..afa1a71 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 9ac3643..efae50b 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,4 +44,8 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 6/7] bbdev: add queue related warning and status information
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (4 preceding siblings ...)
  2022-07-06 23:28         ` [PATCH v5 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-07-06 23:28         ` [PATCH v5 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
  2022-08-15 17:54         ` [PATCH v5 0/7] bbdev changes for 22.11 Chautru, Nicolas
  7 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

This allows to expose more information with regards to any
queue related failure and warning which cannot be supported
in existing API.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c |  2 ++
 lib/bbdev/rte_bbdev.c            | 19 +++++++++++++++++++
 lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map            |  1 +
 4 files changed, 56 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 1abda2d..653b21f 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
 	stats->dequeued_count = q_stats->dequeued_count;
 	stats->enqueue_err_count = q_stats->enqueue_err_count;
 	stats->dequeue_err_count = q_stats->dequeue_err_count;
+	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
+	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
 	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
 	return 0;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 9d65ba8..bdd7c2f 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -721,6 +721,8 @@ struct rte_bbdev *
 		stats->dequeued_count += q_stats->dequeued_count;
 		stats->enqueue_err_count += q_stats->enqueue_err_count;
 		stats->dequeue_err_count += q_stats->dequeue_err_count;
+		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
 	}
 	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
@@ -1163,3 +1165,20 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid device status");
 	return NULL;
 }
+
+const char *
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
+{
+	static const char * const enq_sta_string[] = {
+		"RTE_BBDEV_ENQ_STATUS_NONE",
+		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
+		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
+		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
+	};
+
+	if (status < sizeof(enq_sta_string) / sizeof(char *))
+		return enq_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid enqueue status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ed528b8..b7ecf94 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
 /**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consumed all requested operations
+ * In case of multiple reasons the latter superdes a previous one
+ */
+enum rte_bbdev_enqueue_status {
+	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
+	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
+	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
+	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
+	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
+};
+
+/**
  * Flags indicate the status of the device
  */
 enum rte_bbdev_device_status {
@@ -246,6 +259,12 @@ struct rte_bbdev_stats {
 	uint64_t enqueue_err_count;
 	/** Total error count on operations dequeued */
 	uint64_t dequeue_err_count;
+	/** Total warning count on operations enqueued */
+	uint64_t enqueue_warn_count;
+	/** Total warning count on operations dequeued */
+	uint64_t dequeue_warn_count;
+	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
+	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
 	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
 	 *  the enqueue request to its internal queues.
 	 *  - For a HW device this is the cycles consumed in MMIO write
@@ -386,6 +405,7 @@ struct rte_bbdev_queue_data {
 	void *queue_private;  /**< Driver-specific per-queue data */
 	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
 	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
+	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
 	bool started;  /**< Queue state */
 };
 
@@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 const char*
 rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
 
+/**
+ * Converts queue status from enum to string
+ *
+ * @param status
+ *   Queue status as enum
+ *
+ * @returns
+ *  Queue status as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index efae50b..1c06738 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -44,6 +44,7 @@ EXPERIMENTAL {
 	global:
 
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_status_str;
 	rte_bbdev_enqueue_fft_ops;
 	rte_bbdev_dequeue_fft_ops;
 	rte_bbdev_fft_op_alloc_bulk;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v5 7/7] bbdev: remove unnecessary if-check
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (5 preceding siblings ...)
  2022-07-06 23:28         ` [PATCH v5 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-07-06 23:28         ` Nicolas Chautru
  2022-08-15 17:54         ` [PATCH v5 0/7] bbdev changes for 22.11 Chautru, Nicolas
  7 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-07-06 23:28 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Code clean up due to if-check not required

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev_op.h | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index afa1a71..386eed8 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -970,10 +970,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1006,10 +1004,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1035,17 +1031,14 @@ struct rte_mempool *
 	int ret;
 
 	/* Check type */
-	priv = (struct rte_bbdev_op_pool_private *)
-			rte_mempool_get_priv(mempool);
+	priv = (struct rte_bbdev_op_pool_private *) rte_mempool_get_priv(mempool);
 	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
 		return -EINVAL;
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 7/7] bbdev: add a lock option for enqueue/dequeue operation
  2022-07-06 20:21               ` Chautru, Nicolas
@ 2022-07-07 12:47                 ` Tom Rix
  0 siblings, 0 replies; 174+ messages in thread
From: Tom Rix @ 2022-07-07 12:47 UTC (permalink / raw)
  To: Chautru, Nicolas, Stephen Hemminger
  Cc: dev, thomas, gakhil, hemant.agrawal, maxime.coquelin, mdr,
	Richardson, Bruce, david.marchand


On 7/6/22 1:21 PM, Chautru, Nicolas wrote:
> Hi Stephen, Tom.,
>
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>>
>> On Wed, 6 Jul 2022 12:01:19 -0700
>> Tom Rix <trix@redhat.com> wrote:
>>
>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>>> Locking is not explicitly required but can be valuable in case the
>>>> application cannot guarantee to be thread-safe, or specifically is
>>>> at risk of using the same queue from multiple threads.
>>>> This is an option for PMD to use this.
>>>>
>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>> ---
>>>>    lib/bbdev/rte_bbdev.h | 2 ++
>>>>    1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>> b7ecf94..8e7ca86 100644
>>>> --- a/lib/bbdev/rte_bbdev.h
>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>> @@ -407,6 +407,8 @@ struct rte_bbdev_queue_data {
>>>>    	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
>>>>    	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue
>> status when op is rejected */
>>>>    	bool started;  /**< Queue state */
>>>> +	rte_rwlock_t lock_enq; /**< lock protection for the Enqueue */
>>>> +	rte_rwlock_t lock_deq; /**< lock protection for the Dequeue */
>>> No.
>>>
>>> This is a good idea but needs a use before introducing another
>>> element, particularly a complicated one like locking
>>>
>>> Tom
> The actual usage would be implemented within the PMD. Basically this to prevent the corner case when a queue is being accessed from multiple thread for which there is no protection in DPDK (but application does not necessarily behaves well).
> In normal operation there would never be a case when there is a conflict on the lock.
> This is not something which was considered for any other PMD?
>  From DPDK doc : "If multiple threads are to use the same hardware queue on the same NIC port, then locking, or some other form of mutual exclusion, is necessary."
> Basically for AC100 we would purely enforce the lock for any enqueue/dequeue operation for a given queue (distinct lock for enqueue and dequeue, since these would run on different threads).

I am fine with locking, just have to use them.

For me, this would mean adding them to every public interface so the 
changes would be involved.

This is a big change and if pressed to get this patchset into 22.11, 
then defer this patch to later.

Tom

>
>> Having two locks on same cacheline will create lots of ping/pong false sharing.
> You would recommend to purely spread them within the structure? Or something else?
>   
>> Also, unless the reader is holding the lock for a significant fraction of the time a
>> regular spin lock will be faster.
> OK Thanks. It should in principle never have to wait for the lock for the usage above, only to catch misbehaving application risk.
>
> Nic
>
>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-06 21:04             ` Chautru, Nicolas
@ 2022-07-07 13:09               ` Tom Rix
  2022-07-07 16:57                 ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-07 13:09 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Nic,

Not all my comments were addressed.

The one I am most interested in is the default type / size and how it 
interacts with fp16.

Please see the others below

On 7/6/22 2:04 PM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>>
>>
>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>> Extension of bbdev operation to support FFT based operations.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>>> ---
>>>    doc/guides/prog_guide/bbdev.rst | 130
>> +++++++++++++++++++++++++++++++++++
>>>    lib/bbdev/rte_bbdev.c           |  11 ++-
>>>    lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
>>>    lib/bbdev/rte_bbdev_op.h        | 149
>> ++++++++++++++++++++++++++++++++++++++++
>>>    lib/bbdev/version.map           |   4 ++
>>>    5 files changed, 369 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/doc/guides/prog_guide/bbdev.rst
>>> b/doc/guides/prog_guide/bbdev.rst index 70fa01a..4a055b5 100644
>>> --- a/doc/guides/prog_guide/bbdev.rst
>>> +++ b/doc/guides/prog_guide/bbdev.rst
>>> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
>>>    showing the Turbo decoding of CBs using BBDEV interface in TB-mode
>>>    is also valid for LDPC decode.
>>>
>>> +BBDEV FFT Operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +This operation allows to run a combination of DFT and/or IDFT and/or time-
>> domain windowing.
>>> +These can be used in a modular fashion (using bypass modes) or as a
>>> +processing pipeline which can be used for FFT-based baseband signal
>> processing.
>>> +In more details it allows :
>>> +- to process the data first through an IDFT of adjustable size and
>>> +padding;
>>> +- to perform the windowing as a programmable cyclic shift offset of
>>> +the data followed by a pointwise multiplication by a time domain
>>> +window;
>>> +- to process the related data through a DFT of adjustable size and
>>> +depadding for each such cyclic shift output.
>>> +
>>> +A flexible number of Rx antennas are being processed in parallel with the
>> same configuration.
>>> +The API allows more generally for flexibility in what the PMD may
>>> +support (cabability flags) and flexibility to adjust some of the parameters of
>> the processing.
>>> +
>>> +The operation/capability flags that can be set for each FFT operation are
>> given below.
>>> +
>>> +  **NOTE:** The actual operation flags that may be used with a
>>> + specific  BBDEV PMD are dependent on the driver capabilities as
>>> + reported via  ``rte_bbdev_info_get()``, and may be a subset of those below.
>>> +
>>> ++--------------------------------------------------------------------+
>>> +|Description of FFT capability flags                                 |
>>>
>> ++===============================================================
>> =====
>>> +++
>>> +|RTE_BBDEV_FFT_WINDOWING                                             |
>>> +| Set to enable/support windowing in time domain                     |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
>>> +| Set to enable/support  the cyclic shift time offset adjustment     |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_DFT_BYPASS                                            |
>>> +| Set to bypass the DFT and use directly the IDFT as an option       |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
>>> +| Set to bypass the IDFT and use directly the DFT as an option       |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
>>> +| Set to bypass the time domain windowing  as an option              |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_POWER_MEAS
>> Other flags are not truncated, should be
>>
>> RTE_BBDEV_FFT_POWER_MEASUREMENT
>>
> The intention from DPDK recommendation is for these to be kept shortnames, isn't it?
> Above we use many acronyms to keep it short (CS, etc...)
> Even in current BBDEV API we use many truncation to keep names short: OUT, ENC/DEC, HQ, RM on top of acronyms.
> I believe this is still super explicit with that name?

Some of other identifier have longer names than this.

If you wanted to keep things short, drop the last _<word>

Generally the use of acronyms should be avoided because they add a layer 
of jargon that makes the code less readable to all but writer.


>
>>>                                               |
>>> +| Set to provide an optional power measument of the DFT output       |
>>> ++--------------------------------------------------------------------+
>> measurement
> OK Thanks
>
>>> +|RTE_BBDEV_FFT_FP16_INPUT                                            |
>>> +| Set if the input data shall use FP16 format instead of INT16       |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
>>> +| Set if the output data shall use FP16 format instead of INT16      |
>>> ++--------------------------------------------------------------------+
>>> +
>>> +The structure passed for each FFT operation is given below, with the
>>> +operation flags forming a bitmask in the ``op_flags`` field.
>>> +
>>> +.. code-block:: c
>>> +
>>> +    struct rte_bbdev_op_fft {
>>> +        struct rte_bbdev_op_data base_input;
>>> +        struct rte_bbdev_op_data base_output;
>>> +        struct rte_bbdev_op_data power_meas_output;
>> similar to above, meas -> measurement
> See above. Would that really help? I don’t believe there can be any confusion.

Naming is hard.

How about dropping the _meas_ and go with power_output

>
>>> +        uint32_t op_flags;
>>> +        uint16_t input_sequence_size;
>> Could these be future proofed by increasing small int size's to uint32_t ?
> It is not possible to be that big for any signal processing relevant to that operation.
>
>>> +        uint16_t input_leading_padding;
>>> +        uint16_t output_sequence_size;
>>> +        uint16_t output_leading_depadding;
>>> +        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>> +        uint16_t cs_bitmap;
>>> +        uint8_t num_antennas_log2;
>>> +        uint8_t idft_log2;
>>> +        uint8_t dft_log2;
>> is _log2 needed in variable name if it is documenation ?
> I believe it is a best practice when the variable name may be misleading, ie. this is not the actual dft size as a natural number (2048 for instance) but there is an implied mapping.
>
>>> +        int8_t cs_time_adjustment;
>>> +        int8_t idft_shift;
>>> +        int8_t dft_shift;
>>> +        uint16_t ncs_reciprocal;
>>> +        uint16_t power_shift;
>>> +        uint16_t fp16_exp_adjust;
>>> +    };
>>> +
>>> +The FFT parameters are set out in the table below.
>>> +
>>> ++----------------------+--------------------------------------------------------------+
>>> +|Parameter             |Description                                                   |
>>>
>> ++======================+========================================
>> =====
>>> ++=================+
>>> +|base_input            |input data                                                    |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|base_output           |output data                                                   |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|power_meas_output     |optional output data with power measurement
>> on DFT output     |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|op_flags              |bitmask of all active operation capabilities                  |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|input_sequence_size   |size of the input sequence in 32-bits points per
>> antenna      |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|input_leading_padding |number of points padded at the start of input
>> data            |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|output_sequence_size  |size of the output sequence per antenna and
>> cyclic shift      |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|output_depadding      |number of points depadded at the start of output
>> data         |
>>> ++----------------------+--------------------------------------------------------------+
>> output_leading_depadding
> OK Thanks
>
>>> +|window_index          |optional windowing profile index used for each cyclic
>> shift   |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for
>> index 0) |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|num_antennas_log2     |number of antennas as a log2 (10 maps to 1024...)
>> |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|idft_log2             |iDFT size as a log2                                           |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|dft_log2              |DFT size as a log2                                            |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|cs_time_adjustment    |adjustment of time position of all the cyclic shift
>> output    |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|idft_shift            |shift down of signal level post iDFT                          |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|dft_shift             |shift down of signal level post DFT                           |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie.
>> 231 for 12)|
>>> ++----------------------+--------------------------------------------------------------+
>>> +|power_shift           |shift down of level of power measurement when
>> enabled         |
>>> ++----------------------+--------------------------------------------------------------+
>>> +|fp16_exp_adjust       |value added to FP16 exponent at conversion from
>> INT16         |
>>> ++----------------------+--------------------------------------------------------------+
>>> +
>>> +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is
>>> +the incoming data for the processing. Its size may not fit into an
>>> +actual mbuf, but the structure is used to pass iova address.
>>> +The mbuf output ``output`` is mandatory and is output of the FFT
>> processing chain.
>>> +Each point is a complex number of 32bits : either as 2 INT16 or as 2
>>> +FP16 based when the option supported.
>>> +The data layout is based on contiguous concatenation of output data
>>> +first by cyclic shift then by antenna.
>>>
>>>    Sample code
>>>    -----------
>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>> 555bda9..28b105d 100644
>>> --- a/lib/bbdev/rte_bbdev.c
>>> +++ b/lib/bbdev/rte_bbdev.c
>>> @@ -24,7 +24,7 @@
>>>    #define DEV_NAME "BBDEV"
>>>
>>>    /* Number of supported operation types */ -#define
>>> BBDEV_OP_TYPE_COUNT 5
>>> +#define BBDEV_OP_TYPE_COUNT 6
>>>    /* Number of supported device status */
>>>    #define BBDEV_DEV_STATUS_COUNT 9
>>>
>>> @@ -854,6 +854,9 @@ struct rte_bbdev *
>>>    	case RTE_BBDEV_OP_LDPC_ENC:
>>>    		result = sizeof(struct rte_bbdev_enc_op);
>>>    		break;
>>> +	case RTE_BBDEV_OP_FFT:
>>> +		result = sizeof(struct rte_bbdev_fft_op);
>>> +		break;
>>>    	default:
>>>    		break;
>>>    	}
>>> @@ -877,6 +880,10 @@ struct rte_bbdev *
>>>    		struct rte_bbdev_enc_op *op = element;
>>>    		memset(op, 0, mempool->elt_size);
>>>    		op->mempool = mempool;
>>> +	} else if (type == RTE_BBDEV_OP_FFT) {
>>> +		struct rte_bbdev_fft_op *op = element;
>>> +		memset(op, 0, mempool->elt_size);
>>> +		op->mempool = mempool;
>>>    	}
>>>    }
>>>
>>> @@ -1126,6 +1133,8 @@ struct rte_mempool *
>>>    		"RTE_BBDEV_OP_TURBO_DEC",
>>>    		"RTE_BBDEV_OP_TURBO_ENC",
>>>    		"RTE_BBDEV_OP_LDPC_DEC",
>>> +		"RTE_BBDEV_OP_LDPC_ENC",
>> Why ldpc_enc line, this is already in codebase ?
>>> +		"RTE_BBDEV_OP_FFT",
> Thanks, there this is a rebase issue in previous commit
>
>
>>>    	};
>>>
>>>    	if (op_type < BBDEV_OP_TYPE_COUNT)
>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>> ac941d6..ed528b8 100644
>>> --- a/lib/bbdev/rte_bbdev.h
>>> +++ b/lib/bbdev/rte_bbdev.h
>>> @@ -401,6 +401,12 @@ typedef uint16_t
>> (*rte_bbdev_enqueue_dec_ops_t)(
>>>    		struct rte_bbdev_dec_op **ops,
>>>    		uint16_t num);
>>>
>>> +/** @internal Enqueue fft operations for processing on queue of a
>>> +device. */ typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
>>> +		struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_fft_op **ops,
>>> +		uint16_t num);
>>> +
>>>    /** @internal Dequeue encode operations from a queue of a device. */
>>>    typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
>>>    		struct rte_bbdev_queue_data *q_data, @@ -411,6 +417,11
>> @@ typedef
>>> uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
>>>    		struct rte_bbdev_queue_data *q_data,
>>>    		struct rte_bbdev_dec_op **ops, uint16_t num);
>>>
>>> +/** @internal Dequeue fft operations from a queue of a device. */
>>> +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
>>> +		struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_fft_op **ops, uint16_t num);
>>> +
>>>    #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name
>>> */
>>>
>>>    /**
>>> @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
>>>    	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
>>>    	/** Dequeue decode function */
>>>    	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
>>> +	/** Enqueue FFT function */
>>> +	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
>>> +	/** Dequeue FFT function */
>>> +	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
>>>    	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by
>> PMD */
>>>    	struct rte_bbdev_data *data;  /**< Pointer to device data */
>>>    	enum rte_bbdev_state state;  /**< If device is currently used or
>>> not */ @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
>>>    	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
>>>    }
>>>
>>> +/**
>>> + * Enqueue a burst of fft operations to a queue of the device.
>>> + * This functions only enqueues as many operations as currently
>>> +possible and
>>> + * does not block until @p num_ops entries in the queue are available.
>>> + * This function does not provide any error notification to avoid the
>>> + * corresponding overhead.
>>> + *
>>> + * @param dev_id
>>> + *   The identifier of the device.
>>> + * @param queue_id
>>> + *   The index of the queue.
>>> + * @param ops
>>> + *   Pointer array containing operations to be enqueued Must have at least
>>> + *   @p num_ops entries
>>> + * @param num_ops
>>> + *   The maximum number of operations to enqueue.
>>> + *
>>> + * @return
>>> + *   The number of operations actually enqueued (this is the number of
>> processed
>>> + *   entries in the @p ops array).
>>> + */
>>> +__rte_experimental
>>> +static inline uint16_t
>>> +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
>> Who checks the input is valid ?

Who checks the input is valid ?


>>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
>>> queues[queue_id];
>>> +	return dev->enqueue_fft_ops(q_data, ops, num_ops); }
>>>
>>>    /**
>>>     * Dequeue a burst of processed encode operations from a queue of the
>> device.
>>> @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
>>>    	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
>>>    }
>>>
>>> +/**
>>> + * Dequeue a burst of fft operations from a queue of the device.
>>> + * This functions returns only the current contents of the queue, and
>>> +does not
>>> + * block until @ num_ops is available.
>>> + * This function does not provide any error notification to avoid the
>>> + * corresponding overhead.
>>> + *
>>> + * @param dev_id
>>> + *   The identifier of the device.
>>> + * @param queue_id
>>> + *   The index of the queue.
>>> + * @param ops
>>> + *   Pointer array where operations will be dequeued to. Must have at least
>>> + *   @p num_ops entries
>>> + * @param num_ops
>>> + *   The maximum number of operations to dequeue.
>>> + *
>>> + * @return
>>> + *   The number of operations actually dequeued (this is the number of
>> entries
>>> + *   copied into the @p ops array).
>>> + */
>>> +__rte_experimental
>>> +static inline uint16_t
>>> +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
>>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
>>> queues[queue_id];
>>> +	return dev->dequeue_fft_ops(q_data, ops, num_ops); }
>>> +
>>>    /** Definitions of device event types */
>>>    enum rte_bbdev_event_type {
>>>    	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */ diff --git
>>> a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index
>>> cd82418..3e46f1d 100644
>>> --- a/lib/bbdev/rte_bbdev_op.h
>>> +++ b/lib/bbdev/rte_bbdev_op.h
>>> @@ -47,6 +47,8 @@
>>>    #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
>>>    /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
>>>    #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
>>> +/* 12 CS maximum */
>>> +#define RTE_BBDEV_MAX_CS_2 (6)
>>>
>>>    /** Flags for turbo decoder operation and capability structure */
>>>    enum rte_bbdev_op_td_flag_bitmasks { @@ -211,6 +213,26 @@ enum
>>> rte_bbdev_op_ldpcenc_flag_bitmasks {
>>>    	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
>>>    };
>>>
>>> +/** Flags for DFT operation and capability structure */ enum
>>> +rte_bbdev_op_fft_flag_bitmasks {
>>> +	/** Flexible windowing capability */
>>> +	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
>>> +	/** Flexible adjustment of Cyclic Shift time offset */
>>> +	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
>>> +	/** Set for bypass the DFT and get directly into iDFT input */
>>> +	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
>>> +	/** Set for bypass the IDFT and get directly the DFT output */
>>> +	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
>>> +	/** Set for bypass time domain windowing */
>>> +	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
>>> +	/** Set for optional power measurement on DFT output */
>>> +	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
>> Meas here too, change generally
>>> +	/** Set if the input data used FP16 format */
>>> +	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
>> What are the other data type(s) ?
>>
>> The default is not mentioned, or i missed it.
?
>>
>>> +	/**  Set if the output data uses FP16 format  */
>>> +	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7) };
>>> +
>>>    /** Flags for the Code Block/Transport block mode  */
>>>    enum rte_bbdev_op_cb_mode {
>>>    	/** One operation is one or fraction of one transport block  */ @@
>>> -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
>>>    	};
>>>    };
>>>
>>> +/** Operation structure for FFT processing.
>>> + *
>>> + * The operation processes the data for multiple antennas in a single
>>> +call
>>> + * (.i.e for all the REs belonging to a given SRS sequence for
>>> +instance)
>>> + *
>>> + * The output mbuf data structure is expected to be allocated by the
>>> + * application with enough room for the output data.
>>> + */
>>> +struct rte_bbdev_op_fft {
>>> +	/** Input data starting from first antenna */
>>> +	struct rte_bbdev_op_data base_input;
>>> +	/** Output data starting from first antenna and first cyclic shift */
>>> +	struct rte_bbdev_op_data base_output;
>>> +	/** Optional power measurement output data */
>>> +	struct rte_bbdev_op_data power_meas_output;
>>> +	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
>>> +	uint32_t op_flags;
>>> +	/** Input sequence size in 32-bits points */
>>> +	uint16_t input_sequence_size;
>> size is bytes*4 ? how does this work with fp16 ?
?
>>> +	/** Padding at the start of the sequence */
>>> +	uint16_t input_leading_padding;
>>> +	/** Output sequence size in 32-bits points */
>>> +	uint16_t output_sequence_size;
>>> +	/** Depadding at the start of the DFT output */
>>> +	uint16_t output_leading_depadding;
>>> +	/** Window index being used for each cyclic shift output */
>>> +	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>> +	/** Bitmap of the cyclic shift output requested */
>>> +	uint16_t cs_bitmap;
>>> +	/** Number of antennas as a log2 – 8 to 128 */
>>> +	uint8_t num_antennas_log2;
>>> +	/** iDFT size as a log2 - 32 to 2048 */
>>> +	uint8_t idft_log2;
>>> +	/** DFT size as a log2 - 8 to 2048 */
>>> +	uint8_t dft_log2;
>>> +	/** Adjustment of position of the cyclic shifts - -31 to 31 */
>>> +	int8_t cs_time_adjustment;
>>> +	/** iDFT shift down */
>>> +	int8_t idft_shift;
>>> +	/** DFT shift down */
>>> +	int8_t dft_shift;
>>> +	/** NCS reciprocal factor  */
>>> +	uint16_t ncs_reciprocal;
>>> +	/** power measurement out shift down */
>>> +	uint16_t power_shift;
>>> +	/** Adjust the FP6 exponent for INT<->FP16 conversion */
>>> +	uint16_t fp16_exp_adjust;
>>> +};
>>> +
>>>    /** List of the capabilities for the Turbo Decoder */
>>>    struct rte_bbdev_op_cap_turbo_dec {
>>>    	/** Flags from rte_bbdev_op_td_flag_bitmasks */ @@ -741,6 +812,16
>>> @@ struct rte_bbdev_op_cap_ldpc_enc {
>>>    	uint16_t num_buffers_dst;
>>>    };
>>>
>>> +/** List of the capabilities for the FFT */ struct
>>> +rte_bbdev_op_cap_fft {
>>> +	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
>> you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?
?
>>> +	uint32_t capability_flags;
>>> +	/** Num input code block buffers */
>>> +	uint16_t num_buffers_src;
>>> +	/** Num output code block buffers */
>>> +	uint16_t num_buffers_dst;
>>> +};
>>> +
>>>    /** Different operation types supported by the device */
>>>    enum rte_bbdev_op_type {
>>>    	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
>> @@
>>> -748,6 +829,7 @@ enum rte_bbdev_op_type {
>>>    	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
>>>    	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
>>>    	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
>>> +	RTE_BBDEV_OP_FFT,  /**< FFT */
>>>    	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type
>> number including padding */
>>>    };
>>>
>>> @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
>>>    	};
>>>    };
>>>
>>> +/** Structure specifying a single fft operation */ struct
>>> +rte_bbdev_fft_op {
>>> +	/** Status of operation that was performed */
>>> +	int status;
>>> +	/** Mempool which op instance is in */
>>> +	struct rte_mempool *mempool;
>>> +	/** Opaque pointer for user data */
>>> +	void *opaque_data;
>>> +	/** Contains turbo decoder specific parameters */
>>> +	struct rte_bbdev_op_fft fft;
>>> +};
>>> +
>>>    /** Operation capabilities supported by a device */
>>>    struct rte_bbdev_op_cap {
>>>    	enum rte_bbdev_op_type type;  /**< Type of operation */ @@ -799,6
>>> +893,7 @@ struct rte_bbdev_op_cap {
>>>    		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
>>>    		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
>>>    		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
>>> +		struct rte_bbdev_op_cap_fft fft;
>>>    	} cap;  /**< Operation-type specific capabilities */
>>>    };
>>>
>>> @@ -918,6 +1013,42 @@ struct rte_mempool *
>>>    }
>>>
>>>    /**
>>> + * Bulk allocate fft operations from a mempool with parameter defaults
>> reset.
>>> + *
>>> + * @param mempool
>>> + *   Operation mempool, created by rte_bbdev_op_pool_create().
>>> + * @param ops
>>> + *   Output array to place allocated operations
>>> + * @param num_ops
>>> + *   Number of operations to allocate
>>> + *
>>> + * @returns
>>> + *   - 0 on success
>>> + *   - EINVAL if invalid mempool is provided
>>> + */
>>> +__rte_experimental
>>> +static inline int
>>> +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>> +	struct rte_bbdev_op_pool_private *priv;
>>> +	int ret;
>>> +
>>> +	/* Check type */
>>> +	priv = (struct rte_bbdev_op_pool_private *)
>>> +			rte_mempool_get_priv(mempool);
>>> +	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
>>> +		return -EINVAL;
>>> +
>>> +	/* Get elements */
>>> +	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
>>> +	if (unlikely(ret < 0))
>>> +		return ret;
>> if-check is not needed, just
>>
>> return ret;
>>
>> and drop the next line
?
>>
>> Tom
>>
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/**
>>>     * Free decode operation structures that were allocated by
>>>     * rte_bbdev_dec_op_alloc_bulk().
>>>     * All structures must belong to the same mempool.
>>> @@ -951,6 +1082,24 @@ struct rte_mempool *
>>>    		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
>> num_ops);
>>>    }
>>>
>>> +/**
>>> + * Free encode operation structures that were allocated by
>>> + * rte_bbdev_fft_op_alloc_bulk().
>>> + * All structures must belong to the same mempool.
>>> + *
>>> + * @param ops
>>> + *   Operation structures
>>> + * @param num_ops
>>> + *   Number of structures
>>> + */
>>> +__rte_experimental
>>> +static inline void
>>> +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned
>>> +int num_ops) {
>>> +	if (num_ops > 0)
>>> +		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
>> num_ops); }
>>> +
>>>    #ifdef __cplusplus
>>>    }
>>>    #endif
>>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
>>> 9ac3643..efae50b 100644
>>> --- a/lib/bbdev/version.map
>>> +++ b/lib/bbdev/version.map
>>> @@ -44,4 +44,8 @@ EXPERIMENTAL {
>>>    	global:
>>>
>>>    	rte_bbdev_device_status_str;
>>> +	rte_bbdev_enqueue_fft_ops;
>>> +	rte_bbdev_dequeue_fft_ops;
>>> +	rte_bbdev_fft_op_alloc_bulk;
>>> +	rte_bbdev_fft_op_free_bulk;
>>>    };


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-06 21:10             ` Chautru, Nicolas
@ 2022-07-07 13:20               ` Tom Rix
  2022-07-07 17:19                 ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-07 13:20 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/6/22 2:10 PM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Wednesday, July 6, 2022 9:15 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue
>> per operation
>>
>>
>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>> Add support in existing bbdev PMDs for the explicit number of queue
>>> and priority for each operation type configured on the device.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++------
>> ---
>>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
>>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
>>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
>>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
>>>    5 files changed, 51 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 17ba798..d568d0d 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -966,6 +966,7 @@
>>>    		struct rte_bbdev_driver_info *dev_info)
>>>    {
>>>    	struct acc100_device *d = dev->data->dev_private;
>>> +	int i;
>>>
>>>    	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>>>    		{
>>> @@ -1062,19 +1063,23 @@
>>>    	fetch_acc100_config(dev);
>>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>> -	/* This isn't ideal because it reports the maximum number of queues
>> but
>>> -	 * does not provide info on how many can be uplink/downlink or
>> different
>>> -	 * priorities
>>> -	 */
>>> -	dev_info->max_num_queues =
>>> -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
>>> -			d->acc100_conf.q_dl_5g.num_qgroups +
>>> -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
>>> -			d->acc100_conf.q_ul_5g.num_qgroups +
>>> -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
>>> -			d->acc100_conf.q_dl_4g.num_qgroups +
>>> -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>>> +	/* Expose number of queues */
>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
>>> +d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>>>    			d->acc100_conf.q_ul_4g.num_qgroups;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d-
>>> acc100_conf.q_dl_4g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_dl_4g.num_qgroups;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d-
>>> acc100_conf.q_ul_5g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_ul_5g.num_qgroups;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d-
>>> acc100_conf.q_dl_5g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_dl_5g.num_qgroups;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d-
>>> acc100_conf.q_ul_4g.num_qgroups;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d-
>>> acc100_conf.q_dl_4g.num_qgroups;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d-
>>> acc100_conf.q_ul_5g.num_qgroups;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d-
>>> acc100_conf.q_dl_5g.num_qgroups;
>>> +	dev_info->max_num_queues = 0;
>>> +	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC;
>> i++)
>>
>> should this be i <=  ?
>>
> Thanks
>
>>> +		dev_info->max_num_queues += dev_info->num_queues[i];
>>>    	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
>>>    	dev_info->hardware_accelerated = true;
>>>    	dev_info->max_dl_queue_priority =
>>> diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> index 57b12af..b4982af 100644
>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> @@ -379,6 +379,14 @@
>>>    		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>>>    			dev_info->max_num_queues++;
>>>    	}
>>> +	/* Expose number of queue per operation type */
>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info-
>>> max_num_queues / 2;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info-
>>> max_num_queues / 2;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>>>    }
>>>
>>>    /**
>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> index 2a330c4..dc7f479 100644
>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
>>>    		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>>>    			dev_info->max_num_queues++;
>>>    	}
>>> +	/* Expose number of queue per operation type */
>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info-
>>> max_num_queues / 2;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info-
>>> max_num_queues / 2;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
>>>    }
>>>
>>>    /**
>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> index c1f88c6..e99ea9a 100644
>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
>>>    	dev_info->min_alignment = 64;
>>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
>> LA12XX_MAX_QUEUES / 2;
>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
>> LA12XX_MAX_QUEUES / 2;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>    }
>>>
>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> index dbc5524..647e706 100644
>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> @@ -256,6 +256,17 @@ struct turbo_sw_queue {
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>> +	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
>> Should this be done through dev instead of assigning directly ?
> I am not sure I follow your suggestion. Do you mind clarifying?

bbdev_capabilites is a const defined in this function, do you really 
need to loop over it to find information that is constant ?

Tom

>
>> Tom
>>
>>> +	int num_op_type = 0;
>>> +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
>>> +		num_op_type++;
>>> +	op_cap = bbdev_capabilities;
>>> +	if (num_op_type > 0) {
>>> +		int num_queue_per_type = dev_info->max_num_queues /
>> num_op_type;
>>> +		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
>>> +			dev_info->num_queues[op_cap->type] =
>> num_queue_per_type;
>>> +	}
>>> +
>>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>> dev_id);
>>>    }
>>>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-06 21:12             ` Chautru, Nicolas
@ 2022-07-07 13:34               ` Tom Rix
  2022-07-07 17:13                 ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-07 13:34 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/6/22 2:12 PM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
>>
>>
>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>> Adding more options in the API to expose the number of queues exposed
>>> and related priority.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    lib/bbdev/rte_bbdev.h | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>> 9b1ffa4..ac941d6 100644
>>> --- a/lib/bbdev/rte_bbdev.h
>>> +++ b/lib/bbdev/rte_bbdev.h
>>> @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
>>>
>>>    	/** Maximum number of queues supported by the device */
>>>    	unsigned int max_num_queues;
>>> +	/** Maximum number of queues supported per operation type */
>>> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>>> +	/** Priority level supported per operation type */
>>> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>> It is better to add new elements to the end of a structure for better backward
>> compatibility
> All that serie is not ABI compatible (sizes change etc...). I don’t believe there is such a recommendation, is there?

Depends on what users expect, a dynamically linked old application would 
at best core here.  If the elements were added to the end, yes the size 
would change but the old dynamically linked application would not use 
them.  Dynamically linking is nice because problems in the library can 
be fixed and shipped without forcing the user recompile.  Though the 
user may not realize  it, this change forces them to recompile.

Tom

>
>> Tom
>>
>>>    	/** Queue size limit (queue size must also be power of 2) */
>>>    	uint32_t queue_size_lim;
>>>    	/** Set if device off-loads operation to hardware  */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 2/7] bbdev: add device status info
  2022-07-06 21:16             ` Chautru, Nicolas
@ 2022-07-07 13:37               ` Tom Rix
  2022-07-07 17:15                 ` Chautru, Nicolas
  2022-08-25 14:08               ` Maxime Coquelin
  1 sibling, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-07 13:37 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/6/22 2:16 PM, Chautru, Nicolas wrote:
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
>>
>>
>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>> Added device status information, so that the PMD can expose
>>> information related to the underlying accelerator device status.
>>> Minor order change in structure to fit into padding hole.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>>>    drivers/baseband/null/bbdev_null.c                 |  1 +
>>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>>>    lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
>>>    lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>>>    lib/bbdev/version.map                              |  6 ++++
>>>    9 files changed, 69 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index de7e4bc..17ba798 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -1060,6 +1060,7 @@
>>>
>>>    	/* Read and save the populated config from ACC100 registers */
>>>    	fetch_acc100_config(dev);
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* This isn't ideal because it reports the maximum number of queues
>> but
>>>    	 * does not provide info on how many can be uplink/downlink or
>>> different diff --git
>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> index 82ae6ba..57b12af 100644
>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> @@ -369,6 +369,7 @@
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* Calculates number of queues assigned to device */
>>>    	dev_info->max_num_queues = 0;
>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> index 21d3529..2a330c4 100644
>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* Calculates number of queues assigned to device */
>>>    	dev_info->max_num_queues = 0;
>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> index 4d1bd16..c1f88c6 100644
>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->min_alignment = 64;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>    }
>>> diff --git a/drivers/baseband/null/bbdev_null.c
>>> b/drivers/baseband/null/bbdev_null.c
>>> index 248e129..94a1976 100644
>>> --- a/drivers/baseband/null/bbdev_null.c
>>> +++ b/drivers/baseband/null/bbdev_null.c
>>> @@ -82,6 +82,7 @@ struct bbdev_queue {
>>>    	 * here for code completeness.
>>>    	 */
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>    }
>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> index af7bc41..dbc5524 100644
>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>>>    	dev_info->min_alignment = 64;
>>>    	dev_info->harq_buffer_size = 0;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>> dev_id);
>>>    }
>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>> 22bd894..555bda9 100644
>>> --- a/lib/bbdev/rte_bbdev.c
>>> +++ b/lib/bbdev/rte_bbdev.c
>>> @@ -25,6 +25,8 @@
>>>
>>>    /* Number of supported operation types */
>>>    #define BBDEV_OP_TYPE_COUNT 5
>>> +/* Number of supported device status */ #define
>>> +BBDEV_DEV_STATUS_COUNT 9
>>>
>>>    /* BBDev library logging ID */
>>>    RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -1132,3
>> +1134,25
>>> @@ struct rte_mempool *
>>>    	rte_bbdev_log(ERR, "Invalid operation type");
>>>    	return NULL;
>>>    }
>>> +
>>> +const char *
>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
>>> +	static const char * const dev_sta_string[] = {
>>> +		"RTE_BBDEV_DEV_NOSTATUS",
>>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
>>> +		"RTE_BBDEV_DEV_RESET",
>>> +		"RTE_BBDEV_DEV_CONFIGURED",
>>> +		"RTE_BBDEV_DEV_ACTIVE",
>>> +		"RTE_BBDEV_DEV_FATAL_ERR",
>>> +		"RTE_BBDEV_DEV_RESTART_REQ",
>>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
>>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
>>> +	};
>>> +
>>> +	if (status < BBDEV_DEV_STATUS_COUNT)
>>> +		return dev_sta_string[status];
>>> +
>>> +	rte_bbdev_log(ERR, "Invalid device status");
>>> +	return NULL;
>>> +}
>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>> b88c881..9b1ffa4 100644
>>> --- a/lib/bbdev/rte_bbdev.h
>>> +++ b/lib/bbdev/rte_bbdev.h
>>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>>>    int
>>>    rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>>>
>>> +/**
>>> + * Flags indicate the status of the device  */ enum
>>> +rte_bbdev_device_status {
>>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
>>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
>> supported on the PMD */
>> If this was 0, you may not need to explicitly set.
> This helps to have the lack of status being equivalent to a cleared register.

NOSTATUS is fine, just change

NOT_SUPPORTED = 0,

Tom

>
>>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured
>> state */
>>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
>> ready to use */
>>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
>> being used */
>>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
>> uncorrectable error */
>>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to
>> restart */
>>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application
>> to reconfigure queues */
>>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
>> error event happened */
>> Last patch was padded, do something consistent here.
> We only pad if we have to. Here there is no array whose size would be dimensioned by the size of that enum.
>
>>> +};
>>> +
>>>    /** Device statistics. */
>>>    struct rte_bbdev_stats {
>>>    	uint64_t enqueued_count;  /**< Count of all operations enqueued */
>>> @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>>>    	/** Set if device supports per-queue interrupts */
>>>    	bool queue_intr_supported;
>>>    	/** Minimum alignment of buffers, in bytes */
>>> -	uint16_t min_alignment;
>>> -	/** HARQ memory available in kB */
>>> +	/** Device Status */
>>> +	enum rte_bbdev_device_status device_status;
>> New elements should be added to the end to improve backward compatibility.
> Same comment in different patch. I would like to know if there is a real recommendation from DPDK on this. I have heard opposite view as well.
> In that very case we are breaking the ABI in that new serie for 22.11 (sizes and offsets are changing).
>
>> Tom
>>
>>>    	uint32_t harq_buffer_size;
>>>    	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
>> supported
>>>    	 *  for input/output data
>>>    	 */
>>> +	uint16_t min_alignment;
>>> +	/** HARQ memory available in kB */
>>>    	uint8_t data_endianness;
>>>    	/** Default queue configuration used if none is supplied  */
>>>    	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
>> +844,20
>>> @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
>>>    rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int
>> op,
>>>    		void *data);
>>>
>>> +/**
>>> + * Converts device status from enum to string
>>> + *
>>> + * @param status
>>> + *   Device status as enum
>>> + *
>>> + * @returns
>>> + *   Operation type as string or NULL if op_type is invalid
>>> + *
>>> + */
>>> +__rte_experimental
>>> +const char*
>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
>>> +
>>>    #ifdef __cplusplus
>>>    }
>>>    #endif
>>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
>>> cce3f3c..9ac3643 100644
>>> --- a/lib/bbdev/version.map
>>> +++ b/lib/bbdev/version.map
>>> @@ -39,3 +39,9 @@ DPDK_22 {
>>>
>>>    	local: *;
>>>    };
>>> +
>>> +EXPERIMENTAL {
>>> +	global:
>>> +
>>> +	rte_bbdev_device_status_str;
>>> +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-07 13:09               ` Tom Rix
@ 2022-07-07 16:57                 ` Chautru, Nicolas
  2022-07-18 22:38                   ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-07 16:57 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> 
> Nic,
> 
> Not all my comments were addressed.
> 
> The one I am most interested in is the default type / size and how it interacts
> with fp16.

My bad, I had replied to all that (and fixed some of them in the new version) but I must have NOT sent the latest draft of that email by mistake. Let me go through it again below and let me know if unclear. 

> 
> Please see the others below
> 
> On 7/6/22 2:04 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>>
> >>
> >> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> >>> Extension of bbdev operation to support FFT based operations.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> >>> ---
> >>>    doc/guides/prog_guide/bbdev.rst | 130
> >> +++++++++++++++++++++++++++++++++++
> >>>    lib/bbdev/rte_bbdev.c           |  11 ++-
> >>>    lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
> >>>    lib/bbdev/rte_bbdev_op.h        | 149
> >> ++++++++++++++++++++++++++++++++++++++++
> >>>    lib/bbdev/version.map           |   4 ++
> >>>    5 files changed, 369 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/doc/guides/prog_guide/bbdev.rst
> >>> b/doc/guides/prog_guide/bbdev.rst index 70fa01a..4a055b5 100644
> >>> --- a/doc/guides/prog_guide/bbdev.rst
> >>> +++ b/doc/guides/prog_guide/bbdev.rst
> >>> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode`
> above
> >>>    showing the Turbo decoding of CBs using BBDEV interface in TB-mode
> >>>    is also valid for LDPC decode.
> >>>
> >>> +BBDEV FFT Operation
> >>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> +
> >>> +This operation allows to run a combination of DFT and/or IDFT
> >>> +and/or time-
> >> domain windowing.
> >>> +These can be used in a modular fashion (using bypass modes) or as a
> >>> +processing pipeline which can be used for FFT-based baseband signal
> >> processing.
> >>> +In more details it allows :
> >>> +- to process the data first through an IDFT of adjustable size and
> >>> +padding;
> >>> +- to perform the windowing as a programmable cyclic shift offset of
> >>> +the data followed by a pointwise multiplication by a time domain
> >>> +window;
> >>> +- to process the related data through a DFT of adjustable size and
> >>> +depadding for each such cyclic shift output.
> >>> +
> >>> +A flexible number of Rx antennas are being processed in parallel
> >>> +with the
> >> same configuration.
> >>> +The API allows more generally for flexibility in what the PMD may
> >>> +support (cabability flags) and flexibility to adjust some of the
> >>> +parameters of
> >> the processing.
> >>> +
> >>> +The operation/capability flags that can be set for each FFT
> >>> +operation are
> >> given below.
> >>> +
> >>> +  **NOTE:** The actual operation flags that may be used with a
> >>> + specific  BBDEV PMD are dependent on the driver capabilities as
> >>> + reported via  ``rte_bbdev_info_get()``, and may be a subset of those
> below.
> >>> +
> >>> ++--------------------------------------------------------------------+
> >>> +|Description of FFT capability flags                                 |
> >>>
> >>
> ++===============================================================
> >> =====
> >>> +++
> >>> +|RTE_BBDEV_FFT_WINDOWING                                             |
> >>> +| Set to enable/support windowing in time domain                     |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
> >>> +| Set to enable/support  the cyclic shift time offset adjustment     |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_DFT_BYPASS                                            |
> >>> +| Set to bypass the DFT and use directly the IDFT as an option       |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
> >>> +| Set to bypass the IDFT and use directly the DFT as an option       |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
> >>> +| Set to bypass the time domain windowing  as an option              |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_POWER_MEAS
> >> Other flags are not truncated, should be
> >>
> >> RTE_BBDEV_FFT_POWER_MEASUREMENT
> >>
> > The intention from DPDK recommendation is for these to be kept
> shortnames, isn't it?
> > Above we use many acronyms to keep it short (CS, etc...) Even in
> > current BBDEV API we use many truncation to keep names short: OUT,
> ENC/DEC, HQ, RM on top of acronyms.
> > I believe this is still super explicit with that name?
> 
> Some of other identifier have longer names than this.
> 
> If you wanted to keep things short, drop the last _<word>
> 
> Generally the use of acronyms should be avoided because they add a layer of
> jargon that makes the code less readable to all but writer.

To be totally honest usage for acronym is ubiquitous in such L1 signal
processing (and captured in 3GPP specs explicitly, everyone knows what HARQ or LDPC or FFT stands for, etc...).
I believe this is currently striking the right balance in
being explicit to developers familiar with related processing while not being unduly long names which create mess when trying to fit to 100 cols. 

> 
> 
> >
> >>>                                               |
> >>> +| Set to provide an optional power measument of the DFT output       |
> >>> ++--------------------------------------------------------------------+
> >> measurement
> > OK Thanks
> >
> >>> +|RTE_BBDEV_FFT_FP16_INPUT                                            |
> >>> +| Set if the input data shall use FP16 format instead of INT16       |
> >>> ++--------------------------------------------------------------------+
> >>> +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
> >>> +| Set if the output data shall use FP16 format instead of INT16      |
> >>> ++--------------------------------------------------------------------+
> >>> +
> >>> +The structure passed for each FFT operation is given below, with the
> >>> +operation flags forming a bitmask in the ``op_flags`` field.
> >>> +
> >>> +.. code-block:: c
> >>> +
> >>> +    struct rte_bbdev_op_fft {
> >>> +        struct rte_bbdev_op_data base_input;
> >>> +        struct rte_bbdev_op_data base_output;
> >>> +        struct rte_bbdev_op_data power_meas_output;
> >> similar to above, meas -> measurement
> > See above. Would that really help? I don’t believe there can be any
> confusion.
> 
> Naming is hard.
> 
> How about dropping the _meas_ and go with power_output

I agree that naming can be tricky. But in that case I believe this is the right balance as mentioned above.

> 
> >
> >>> +        uint32_t op_flags;
> >>> +        uint16_t input_sequence_size;
> >> Could these be future proofed by increasing small int size's to uint32_t ?
> > It is not possible to be that big for any signal processing relevant to that
> operation.
> >
> >>> +        uint16_t input_leading_padding;
> >>> +        uint16_t output_sequence_size;
> >>> +        uint16_t output_leading_depadding;
> >>> +        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> >>> +        uint16_t cs_bitmap;
> >>> +        uint8_t num_antennas_log2;
> >>> +        uint8_t idft_log2;
> >>> +        uint8_t dft_log2;
> >> is _log2 needed in variable name if it is documenation ?
> > I believe it is a best practice when the variable name may be misleading, ie.
> this is not the actual dft size as a natural number (2048 for instance) but there
> is an implied mapping.
> >
> >>> +        int8_t cs_time_adjustment;
> >>> +        int8_t idft_shift;
> >>> +        int8_t dft_shift;
> >>> +        uint16_t ncs_reciprocal;
> >>> +        uint16_t power_shift;
> >>> +        uint16_t fp16_exp_adjust;
> >>> +    };
> >>> +
> >>> +The FFT parameters are set out in the table below.
> >>> +
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|Parameter             |Description                                                   |
> >>>
> >>
> ++======================+========================================
> >> =====
> >>> ++=================+
> >>> +|base_input            |input data                                                    |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|base_output           |output data                                                   |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|power_meas_output     |optional output data with power measurement
> >> on DFT output     |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|op_flags              |bitmask of all active operation capabilities                  |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|input_sequence_size   |size of the input sequence in 32-bits points per
> >> antenna      |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|input_leading_padding |number of points padded at the start of input
> >> data            |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|output_sequence_size  |size of the output sequence per antenna and
> >> cyclic shift      |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|output_depadding      |number of points depadded at the start of
> output
> >> data         |
> >>> ++----------------------+--------------------------------------------------------------+
> >> output_leading_depadding
> > OK Thanks
> >
> >>> +|window_index          |optional windowing profile index used for each
> cyclic
> >> shift   |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for
> >> index 0) |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|num_antennas_log2     |number of antennas as a log2 (10 maps to
> 1024...)
> >> |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|idft_log2             |iDFT size as a log2                                           |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|dft_log2              |DFT size as a log2                                            |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|cs_time_adjustment    |adjustment of time position of all the cyclic shift
> >> output    |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|idft_shift            |shift down of signal level post iDFT                          |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|dft_shift             |shift down of signal level post DFT                           |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie.
> >> 231 for 12)|
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|power_shift           |shift down of level of power measurement when
> >> enabled         |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +|fp16_exp_adjust       |value added to FP16 exponent at conversion from
> >> INT16         |
> >>> ++----------------------+--------------------------------------------------------------+
> >>> +
> >>> +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is
> >>> +the incoming data for the processing. Its size may not fit into an
> >>> +actual mbuf, but the structure is used to pass iova address.
> >>> +The mbuf output ``output`` is mandatory and is output of the FFT
> >> processing chain.
> >>> +Each point is a complex number of 32bits : either as 2 INT16 or as 2
> >>> +FP16 based when the option supported.
> >>> +The data layout is based on contiguous concatenation of output data
> >>> +first by cyclic shift then by antenna.
> >>>
> >>>    Sample code
> >>>    -----------
> >>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> >>> 555bda9..28b105d 100644
> >>> --- a/lib/bbdev/rte_bbdev.c
> >>> +++ b/lib/bbdev/rte_bbdev.c
> >>> @@ -24,7 +24,7 @@
> >>>    #define DEV_NAME "BBDEV"
> >>>
> >>>    /* Number of supported operation types */ -#define
> >>> BBDEV_OP_TYPE_COUNT 5
> >>> +#define BBDEV_OP_TYPE_COUNT 6
> >>>    /* Number of supported device status */
> >>>    #define BBDEV_DEV_STATUS_COUNT 9
> >>>
> >>> @@ -854,6 +854,9 @@ struct rte_bbdev *
> >>>    	case RTE_BBDEV_OP_LDPC_ENC:
> >>>    		result = sizeof(struct rte_bbdev_enc_op);
> >>>    		break;
> >>> +	case RTE_BBDEV_OP_FFT:
> >>> +		result = sizeof(struct rte_bbdev_fft_op);
> >>> +		break;
> >>>    	default:
> >>>    		break;
> >>>    	}
> >>> @@ -877,6 +880,10 @@ struct rte_bbdev *
> >>>    		struct rte_bbdev_enc_op *op = element;
> >>>    		memset(op, 0, mempool->elt_size);
> >>>    		op->mempool = mempool;
> >>> +	} else if (type == RTE_BBDEV_OP_FFT) {
> >>> +		struct rte_bbdev_fft_op *op = element;
> >>> +		memset(op, 0, mempool->elt_size);
> >>> +		op->mempool = mempool;
> >>>    	}
> >>>    }
> >>>
> >>> @@ -1126,6 +1133,8 @@ struct rte_mempool *
> >>>    		"RTE_BBDEV_OP_TURBO_DEC",
> >>>    		"RTE_BBDEV_OP_TURBO_ENC",
> >>>    		"RTE_BBDEV_OP_LDPC_DEC",
> >>> +		"RTE_BBDEV_OP_LDPC_ENC",
> >> Why ldpc_enc line, this is already in codebase ?
> >>> +		"RTE_BBDEV_OP_FFT",
> > Thanks, there this is a rebase issue in previous commit
> >
> >
> >>>    	};
> >>>
> >>>    	if (op_type < BBDEV_OP_TYPE_COUNT)
> >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>> ac941d6..ed528b8 100644
> >>> --- a/lib/bbdev/rte_bbdev.h
> >>> +++ b/lib/bbdev/rte_bbdev.h
> >>> @@ -401,6 +401,12 @@ typedef uint16_t
> >> (*rte_bbdev_enqueue_dec_ops_t)(
> >>>    		struct rte_bbdev_dec_op **ops,
> >>>    		uint16_t num);
> >>>
> >>> +/** @internal Enqueue fft operations for processing on queue of a
> >>> +device. */ typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
> >>> +		struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_fft_op **ops,
> >>> +		uint16_t num);
> >>> +
> >>>    /** @internal Dequeue encode operations from a queue of a device. */
> >>>    typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
> >>>    		struct rte_bbdev_queue_data *q_data, @@ -411,6 +417,11
> >> @@ typedef
> >>> uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
> >>>    		struct rte_bbdev_queue_data *q_data,
> >>>    		struct rte_bbdev_dec_op **ops, uint16_t num);
> >>>
> >>> +/** @internal Dequeue fft operations from a queue of a device. */
> >>> +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
> >>> +		struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_fft_op **ops, uint16_t num);
> >>> +
> >>>    #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device
> name
> >>> */
> >>>
> >>>    /**
> >>> @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
> >>>    	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
> >>>    	/** Dequeue decode function */
> >>>    	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
> >>> +	/** Enqueue FFT function */
> >>> +	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
> >>> +	/** Dequeue FFT function */
> >>> +	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
> >>>    	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by
> >> PMD */
> >>>    	struct rte_bbdev_data *data;  /**< Pointer to device data */
> >>>    	enum rte_bbdev_state state;  /**< If device is currently used or
> >>> not */ @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
> >>>    	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
> >>>    }
> >>>
> >>> +/**
> >>> + * Enqueue a burst of fft operations to a queue of the device.
> >>> + * This functions only enqueues as many operations as currently
> >>> +possible and
> >>> + * does not block until @p num_ops entries in the queue are available.
> >>> + * This function does not provide any error notification to avoid the
> >>> + * corresponding overhead.
> >>> + *
> >>> + * @param dev_id
> >>> + *   The identifier of the device.
> >>> + * @param queue_id
> >>> + *   The index of the queue.
> >>> + * @param ops
> >>> + *   Pointer array containing operations to be enqueued Must have at
> least
> >>> + *   @p num_ops entries
> >>> + * @param num_ops
> >>> + *   The maximum number of operations to enqueue.
> >>> + *
> >>> + * @return
> >>> + *   The number of operations actually enqueued (this is the number of
> >> processed
> >>> + *   entries in the @p ops array).
> >>> + */
> >>> +__rte_experimental
> >>> +static inline uint16_t
> >>> +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> >>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> >>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
> >> Who checks the input is valid ?
> 
> Who checks the input is valid ?
> 

This is not specific to that commit but to any operation. This is there for years and see the comment above
 * This function does not provide any error notification to avoid the
 * corresponding overhead.

> 
> >>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
> >>> queues[queue_id];
> >>> +	return dev->enqueue_fft_ops(q_data, ops, num_ops); }
> >>>
> >>>    /**
> >>>     * Dequeue a burst of processed encode operations from a queue of the
> >> device.
> >>> @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
> >>>    	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
> >>>    }
> >>>
> >>> +/**
> >>> + * Dequeue a burst of fft operations from a queue of the device.
> >>> + * This functions returns only the current contents of the queue, and
> >>> +does not
> >>> + * block until @ num_ops is available.
> >>> + * This function does not provide any error notification to avoid the
> >>> + * corresponding overhead.
> >>> + *
> >>> + * @param dev_id
> >>> + *   The identifier of the device.
> >>> + * @param queue_id
> >>> + *   The index of the queue.
> >>> + * @param ops
> >>> + *   Pointer array where operations will be dequeued to. Must have at
> least
> >>> + *   @p num_ops entries
> >>> + * @param num_ops
> >>> + *   The maximum number of operations to dequeue.
> >>> + *
> >>> + * @return
> >>> + *   The number of operations actually dequeued (this is the number of
> >> entries
> >>> + *   copied into the @p ops array).
> >>> + */
> >>> +__rte_experimental
> >>> +static inline uint16_t
> >>> +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
> >>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> >>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
> >>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
> >>> queues[queue_id];
> >>> +	return dev->dequeue_fft_ops(q_data, ops, num_ops); }
> >>> +
> >>>    /** Definitions of device event types */
> >>>    enum rte_bbdev_event_type {
> >>>    	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */ diff --git
> >>> a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index
> >>> cd82418..3e46f1d 100644
> >>> --- a/lib/bbdev/rte_bbdev_op.h
> >>> +++ b/lib/bbdev/rte_bbdev_op.h
> >>> @@ -47,6 +47,8 @@
> >>>    #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
> >>>    /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
> >>>    #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
> >>> +/* 12 CS maximum */
> >>> +#define RTE_BBDEV_MAX_CS_2 (6)
> >>>
> >>>    /** Flags for turbo decoder operation and capability structure */
> >>>    enum rte_bbdev_op_td_flag_bitmasks { @@ -211,6 +213,26 @@ enum
> >>> rte_bbdev_op_ldpcenc_flag_bitmasks {
> >>>    	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
> >>>    };
> >>>
> >>> +/** Flags for DFT operation and capability structure */ enum
> >>> +rte_bbdev_op_fft_flag_bitmasks {
> >>> +	/** Flexible windowing capability */
> >>> +	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
> >>> +	/** Flexible adjustment of Cyclic Shift time offset */
> >>> +	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
> >>> +	/** Set for bypass the DFT and get directly into iDFT input */
> >>> +	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
> >>> +	/** Set for bypass the IDFT and get directly the DFT output */
> >>> +	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
> >>> +	/** Set for bypass time domain windowing */
> >>> +	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
> >>> +	/** Set for optional power measurement on DFT output */
> >>> +	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
> >> Meas here too, change generally
> >>> +	/** Set if the input data used FP16 format */
> >>> +	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
> >> What are the other data type(s) ?
> >>
> >> The default is not mentioned, or i missed it.
> ?

Default type is INT16 as captured in doc above

+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+



> >>
> >>> +	/**  Set if the output data uses FP16 format  */
> >>> +	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7) };
> >>> +
> >>>    /** Flags for the Code Block/Transport block mode  */
> >>>    enum rte_bbdev_op_cb_mode {
> >>>    	/** One operation is one or fraction of one transport block  */ @@
> >>> -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
> >>>    	};
> >>>    };
> >>>
> >>> +/** Operation structure for FFT processing.
> >>> + *
> >>> + * The operation processes the data for multiple antennas in a single
> >>> +call
> >>> + * (.i.e for all the REs belonging to a given SRS sequence for
> >>> +instance)
> >>> + *
> >>> + * The output mbuf data structure is expected to be allocated by the
> >>> + * application with enough room for the output data.
> >>> + */
> >>> +struct rte_bbdev_op_fft {
> >>> +	/** Input data starting from first antenna */
> >>> +	struct rte_bbdev_op_data base_input;
> >>> +	/** Output data starting from first antenna and first cyclic shift */
> >>> +	struct rte_bbdev_op_data base_output;
> >>> +	/** Optional power measurement output data */
> >>> +	struct rte_bbdev_op_data power_meas_output;
> >>> +	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
> >>> +	uint32_t op_flags;
> >>> +	/** Input sequence size in 32-bits points */
> >>> +	uint16_t input_sequence_size;
> >> size is bytes*4 ? how does this work with fp16 ?
> ?

This is IQ data, hence a complex number using either int16 or fp6 would always be 32 bits.


> >>> +	/** Padding at the start of the sequence */
> >>> +	uint16_t input_leading_padding;
> >>> +	/** Output sequence size in 32-bits points */
> >>> +	uint16_t output_sequence_size;
> >>> +	/** Depadding at the start of the DFT output */
> >>> +	uint16_t output_leading_depadding;
> >>> +	/** Window index being used for each cyclic shift output */
> >>> +	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
> >>> +	/** Bitmap of the cyclic shift output requested */
> >>> +	uint16_t cs_bitmap;
> >>> +	/** Number of antennas as a log2 – 8 to 128 */
> >>> +	uint8_t num_antennas_log2;
> >>> +	/** iDFT size as a log2 - 32 to 2048 */
> >>> +	uint8_t idft_log2;
> >>> +	/** DFT size as a log2 - 8 to 2048 */
> >>> +	uint8_t dft_log2;
> >>> +	/** Adjustment of position of the cyclic shifts - -31 to 31 */
> >>> +	int8_t cs_time_adjustment;
> >>> +	/** iDFT shift down */
> >>> +	int8_t idft_shift;
> >>> +	/** DFT shift down */
> >>> +	int8_t dft_shift;
> >>> +	/** NCS reciprocal factor  */
> >>> +	uint16_t ncs_reciprocal;
> >>> +	/** power measurement out shift down */
> >>> +	uint16_t power_shift;
> >>> +	/** Adjust the FP6 exponent for INT<->FP16 conversion */
> >>> +	uint16_t fp16_exp_adjust;
> >>> +};
> >>> +
> >>>    /** List of the capabilities for the Turbo Decoder */
> >>>    struct rte_bbdev_op_cap_turbo_dec {
> >>>    	/** Flags from rte_bbdev_op_td_flag_bitmasks */ @@ -741,6 +812,16
> >>> @@ struct rte_bbdev_op_cap_ldpc_enc {
> >>>    	uint16_t num_buffers_dst;
> >>>    };
> >>>
> >>> +/** List of the capabilities for the FFT */ struct
> >>> +rte_bbdev_op_cap_fft {
> >>> +	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
> >> you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?
> ?

Thanks, fixed in new commit

> >>> +	uint32_t capability_flags;
> >>> +	/** Num input code block buffers */
> >>> +	uint16_t num_buffers_src;
> >>> +	/** Num output code block buffers */
> >>> +	uint16_t num_buffers_dst;
> >>> +};
> >>> +
> >>>    /** Different operation types supported by the device */
> >>>    enum rte_bbdev_op_type {
> >>>    	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
> >> @@
> >>> -748,6 +829,7 @@ enum rte_bbdev_op_type {
> >>>    	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
> >>>    	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
> >>>    	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
> >>> +	RTE_BBDEV_OP_FFT,  /**< FFT */
> >>>    	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type
> >> number including padding */
> >>>    };
> >>>
> >>> @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
> >>>    	};
> >>>    };
> >>>
> >>> +/** Structure specifying a single fft operation */ struct
> >>> +rte_bbdev_fft_op {
> >>> +	/** Status of operation that was performed */
> >>> +	int status;
> >>> +	/** Mempool which op instance is in */
> >>> +	struct rte_mempool *mempool;
> >>> +	/** Opaque pointer for user data */
> >>> +	void *opaque_data;
> >>> +	/** Contains turbo decoder specific parameters */
> >>> +	struct rte_bbdev_op_fft fft;
> >>> +};
> >>> +
> >>>    /** Operation capabilities supported by a device */
> >>>    struct rte_bbdev_op_cap {
> >>>    	enum rte_bbdev_op_type type;  /**< Type of operation */ @@ -799,6
> >>> +893,7 @@ struct rte_bbdev_op_cap {
> >>>    		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
> >>>    		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
> >>>    		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
> >>> +		struct rte_bbdev_op_cap_fft fft;
> >>>    	} cap;  /**< Operation-type specific capabilities */
> >>>    };
> >>>
> >>> @@ -918,6 +1013,42 @@ struct rte_mempool *
> >>>    }
> >>>
> >>>    /**
> >>> + * Bulk allocate fft operations from a mempool with parameter defaults
> >> reset.
> >>> + *
> >>> + * @param mempool
> >>> + *   Operation mempool, created by rte_bbdev_op_pool_create().
> >>> + * @param ops
> >>> + *   Output array to place allocated operations
> >>> + * @param num_ops
> >>> + *   Number of operations to allocate
> >>> + *
> >>> + * @returns
> >>> + *   - 0 on success
> >>> + *   - EINVAL if invalid mempool is provided
> >>> + */
> >>> +__rte_experimental
> >>> +static inline int
> >>> +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
> >>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
> >>> +	struct rte_bbdev_op_pool_private *priv;
> >>> +	int ret;
> >>> +
> >>> +	/* Check type */
> >>> +	priv = (struct rte_bbdev_op_pool_private *)
> >>> +			rte_mempool_get_priv(mempool);
> >>> +	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
> >>> +		return -EINVAL;
> >>> +
> >>> +	/* Get elements */
> >>> +	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> >>> +	if (unlikely(ret < 0))
> >>> +		return ret;
> >> if-check is not needed, just
> >>
> >> return ret;
> >>
> >> and drop the next line
> ?

Fixed through a new commit in new version

> >>
> >> Tom
> >>
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +/**
> >>>     * Free decode operation structures that were allocated by
> >>>     * rte_bbdev_dec_op_alloc_bulk().
> >>>     * All structures must belong to the same mempool.
> >>> @@ -951,6 +1082,24 @@ struct rte_mempool *
> >>>    		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
> >> num_ops);
> >>>    }
> >>>
> >>> +/**
> >>> + * Free encode operation structures that were allocated by
> >>> + * rte_bbdev_fft_op_alloc_bulk().
> >>> + * All structures must belong to the same mempool.
> >>> + *
> >>> + * @param ops
> >>> + *   Operation structures
> >>> + * @param num_ops
> >>> + *   Number of structures
> >>> + */
> >>> +__rte_experimental
> >>> +static inline void
> >>> +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned
> >>> +int num_ops) {
> >>> +	if (num_ops > 0)
> >>> +		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
> >> num_ops); }
> >>> +
> >>>    #ifdef __cplusplus
> >>>    }
> >>>    #endif
> >>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> >>> 9ac3643..efae50b 100644
> >>> --- a/lib/bbdev/version.map
> >>> +++ b/lib/bbdev/version.map
> >>> @@ -44,4 +44,8 @@ EXPERIMENTAL {
> >>>    	global:
> >>>
> >>>    	rte_bbdev_device_status_str;
> >>> +	rte_bbdev_enqueue_fft_ops;
> >>> +	rte_bbdev_dequeue_fft_ops;
> >>> +	rte_bbdev_fft_op_alloc_bulk;
> >>> +	rte_bbdev_fft_op_free_bulk;
> >>>    };


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-07 13:34               ` Tom Rix
@ 2022-07-07 17:13                 ` Chautru, Nicolas
  2022-07-18 13:04                   ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-07 17:13 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Thursday, July 7, 2022 6:34 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
> 
> 
> On 7/6/22 2:12 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
> >>
> >>
> >> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> >>> Adding more options in the API to expose the number of queues
> >>> exposed and related priority.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    lib/bbdev/rte_bbdev.h | 4 ++++
> >>>    1 file changed, 4 insertions(+)
> >>>
> >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>> 9b1ffa4..ac941d6 100644
> >>> --- a/lib/bbdev/rte_bbdev.h
> >>> +++ b/lib/bbdev/rte_bbdev.h
> >>> @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
> >>>
> >>>    	/** Maximum number of queues supported by the device */
> >>>    	unsigned int max_num_queues;
> >>> +	/** Maximum number of queues supported per operation type */
> >>> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> >>> +	/** Priority level supported per operation type */
> >>> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> >> It is better to add new elements to the end of a structure for better
> >> backward compatibility
> > All that serie is not ABI compatible (sizes change etc...). I don’t believe there
> is such a recommendation, is there?
> 
> Depends on what users expect, a dynamically linked old application would at
> best core here.  If the elements were added to the end, yes the size would
> change but the old dynamically linked application would not use
> them.  Dynamically linking is nice because problems in the library can be fixed
> and shipped without forcing the user recompile.  Though the user may not
> realize  it, this change forces them to recompile.
> 
> Tom

Thanks Tom. In that very context, the change are big enough not to have any form of compatibility. This a new ABI version, and user knows they will have to recompile. 
Still it would be great to capture a recommendation in DPDK coding guideline in case there is such a BKM, I have heard multiple arguments for different preference, if we want to harmonize such things let's capture in coding guide lines, it would not hurt. Maybe one for Thomas?

> 
> >
> >> Tom
> >>
> >>>    	/** Queue size limit (queue size must also be power of 2) */
> >>>    	uint32_t queue_size_lim;
> >>>    	/** Set if device off-loads operation to hardware  */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 2/7] bbdev: add device status info
  2022-07-07 13:37               ` Tom Rix
@ 2022-07-07 17:15                 ` Chautru, Nicolas
  2022-07-18 13:09                   ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-07 17:15 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Thursday, July 7, 2022 6:37 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
> 
> 
> On 7/6/22 2:16 PM, Chautru, Nicolas wrote:
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
> >>
> >>
> >> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> >>> Added device status information, so that the PMD can expose
> >>> information related to the underlying accelerator device status.
> >>> Minor order change in structure to fit into padding hole.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
> >>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
> >>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
> >>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
> >>>    drivers/baseband/null/bbdev_null.c                 |  1 +
> >>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
> >>>    lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
> >>>    lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
> >>>    lib/bbdev/version.map                              |  6 ++++
> >>>    9 files changed, 69 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index de7e4bc..17ba798 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -1060,6 +1060,7 @@
> >>>
> >>>    	/* Read and save the populated config from ACC100 registers */
> >>>    	fetch_acc100_config(dev);
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* This isn't ideal because it reports the maximum number of
> >>> queues
> >> but
> >>>    	 * does not provide info on how many can be uplink/downlink or
> >>> different diff --git
> >>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> index 82ae6ba..57b12af 100644
> >>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> @@ -369,6 +369,7 @@
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* Calculates number of queues assigned to device */
> >>>    	dev_info->max_num_queues = 0;
> >>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> index 21d3529..2a330c4 100644
> >>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* Calculates number of queues assigned to device */
> >>>    	dev_info->max_num_queues = 0;
> >>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> index 4d1bd16..c1f88c6 100644
> >>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->min_alignment = 64;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >>>    }
> >>> diff --git a/drivers/baseband/null/bbdev_null.c
> >>> b/drivers/baseband/null/bbdev_null.c
> >>> index 248e129..94a1976 100644
> >>> --- a/drivers/baseband/null/bbdev_null.c
> >>> +++ b/drivers/baseband/null/bbdev_null.c
> >>> @@ -82,6 +82,7 @@ struct bbdev_queue {
> >>>    	 * here for code completeness.
> >>>    	 */
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >>>    }
> >>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> index af7bc41..dbc5524 100644
> >>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
> >>>    	dev_info->min_alignment = 64;
> >>>    	dev_info->harq_buffer_size = 0;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >>> dev_id);
> >>>    }
> >>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> >>> 22bd894..555bda9 100644
> >>> --- a/lib/bbdev/rte_bbdev.c
> >>> +++ b/lib/bbdev/rte_bbdev.c
> >>> @@ -25,6 +25,8 @@
> >>>
> >>>    /* Number of supported operation types */
> >>>    #define BBDEV_OP_TYPE_COUNT 5
> >>> +/* Number of supported device status */ #define
> >>> +BBDEV_DEV_STATUS_COUNT 9
> >>>
> >>>    /* BBDev library logging ID */
> >>>    RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -1132,3
> >> +1134,25
> >>> @@ struct rte_mempool *
> >>>    	rte_bbdev_log(ERR, "Invalid operation type");
> >>>    	return NULL;
> >>>    }
> >>> +
> >>> +const char *
> >>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
> >>> +	static const char * const dev_sta_string[] = {
> >>> +		"RTE_BBDEV_DEV_NOSTATUS",
> >>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> >>> +		"RTE_BBDEV_DEV_RESET",
> >>> +		"RTE_BBDEV_DEV_CONFIGURED",
> >>> +		"RTE_BBDEV_DEV_ACTIVE",
> >>> +		"RTE_BBDEV_DEV_FATAL_ERR",
> >>> +		"RTE_BBDEV_DEV_RESTART_REQ",
> >>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> >>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> >>> +	};
> >>> +
> >>> +	if (status < BBDEV_DEV_STATUS_COUNT)
> >>> +		return dev_sta_string[status];
> >>> +
> >>> +	rte_bbdev_log(ERR, "Invalid device status");
> >>> +	return NULL;
> >>> +}
> >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>> b88c881..9b1ffa4 100644
> >>> --- a/lib/bbdev/rte_bbdev.h
> >>> +++ b/lib/bbdev/rte_bbdev.h
> >>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
> >>>    int
> >>>    rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >>>
> >>> +/**
> >>> + * Flags indicate the status of the device  */ enum
> >>> +rte_bbdev_device_status {
> >>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> >>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
> >> supported on the PMD */
> >> If this was 0, you may not need to explicitly set.
> > This helps to have the lack of status being equivalent to a cleared register.
> 
> NOSTATUS is fine, just change
> 
> NOT_SUPPORTED = 0,

Let me rephrase. Currently RTE_BBDEV_DEV_NOSTATUS is zero explicitly which can be valuable to match a clear register.
RTE_BBDEV_DEV_NOT_SUPPORTED would not be zero.
Are you suggesting I should put explictly a RTE_BBDEV_DEV_NOSTATUS = 0? Isn't it implicit for any compiler that the first enum starts from zero?

> 
> Tom
> 
> >
> >>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured
> >> state */
> >>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
> >> ready to use */
> >>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
> >> being used */
> >>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> >> uncorrectable error */
> >>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to
> >> restart */
> >>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application
> >> to reconfigure queues */
> >>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
> >> error event happened */
> >> Last patch was padded, do something consistent here.
> > We only pad if we have to. Here there is no array whose size would be
> dimensioned by the size of that enum.
> >
> >>> +};
> >>> +
> >>>    /** Device statistics. */
> >>>    struct rte_bbdev_stats {
> >>>    	uint64_t enqueued_count;  /**< Count of all operations enqueued
> >>> */ @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
> >>>    	/** Set if device supports per-queue interrupts */
> >>>    	bool queue_intr_supported;
> >>>    	/** Minimum alignment of buffers, in bytes */
> >>> -	uint16_t min_alignment;
> >>> -	/** HARQ memory available in kB */
> >>> +	/** Device Status */
> >>> +	enum rte_bbdev_device_status device_status;
> >> New elements should be added to the end to improve backward
> compatibility.
> > Same comment in different patch. I would like to know if there is a real
> recommendation from DPDK on this. I have heard opposite view as well.
> > In that very case we are breaking the ABI in that new serie for 22.11 (sizes
> and offsets are changing).
> >
> >> Tom
> >>
> >>>    	uint32_t harq_buffer_size;
> >>>    	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
> >> supported
> >>>    	 *  for input/output data
> >>>    	 */
> >>> +	uint16_t min_alignment;
> >>> +	/** HARQ memory available in kB */
> >>>    	uint8_t data_endianness;
> >>>    	/** Default queue configuration used if none is supplied  */
> >>>    	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
> >> +844,20
> >>> @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
> >>>    rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int
> >>> epfd, int
> >> op,
> >>>    		void *data);
> >>>
> >>> +/**
> >>> + * Converts device status from enum to string
> >>> + *
> >>> + * @param status
> >>> + *   Device status as enum
> >>> + *
> >>> + * @returns
> >>> + *   Operation type as string or NULL if op_type is invalid
> >>> + *
> >>> + */
> >>> +__rte_experimental
> >>> +const char*
> >>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> >>> +
> >>>    #ifdef __cplusplus
> >>>    }
> >>>    #endif
> >>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> >>> cce3f3c..9ac3643 100644
> >>> --- a/lib/bbdev/version.map
> >>> +++ b/lib/bbdev/version.map
> >>> @@ -39,3 +39,9 @@ DPDK_22 {
> >>>
> >>>    	local: *;
> >>>    };
> >>> +
> >>> +EXPERIMENTAL {
> >>> +	global:
> >>> +
> >>> +	rte_bbdev_device_status_str;
> >>> +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-07 13:20               ` Tom Rix
@ 2022-07-07 17:19                 ` Chautru, Nicolas
  2022-07-18 13:21                   ` Tom Rix
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-07-07 17:19 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Thursday, July 7, 2022 6:21 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue
> per operation
> 
> 
> On 7/6/22 2:10 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Sent: Wednesday, July 6, 2022 9:15 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> >> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >> stephen@networkplumber.org
> >> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose
> >> queue per operation
> >>
> >>
> >> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> >>> Add support in existing bbdev PMDs for the explicit number of queue
> >>> and priority for each operation type configured on the device.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++--
> ----
> >> ---
> >>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
> >>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
> >>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
> >>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
> >>>    5 files changed, 51 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index 17ba798..d568d0d 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -966,6 +966,7 @@
> >>>    		struct rte_bbdev_driver_info *dev_info)
> >>>    {
> >>>    	struct acc100_device *d = dev->data->dev_private;
> >>> +	int i;
> >>>
> >>>    	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >>>    		{
> >>> @@ -1062,19 +1063,23 @@
> >>>    	fetch_acc100_config(dev);
> >>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>> -	/* This isn't ideal because it reports the maximum number of queues
> >> but
> >>> -	 * does not provide info on how many can be uplink/downlink or
> >> different
> >>> -	 * priorities
> >>> -	 */
> >>> -	dev_info->max_num_queues =
> >>> -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> >>> -			d->acc100_conf.q_dl_5g.num_qgroups +
> >>> -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> >>> -			d->acc100_conf.q_ul_5g.num_qgroups +
> >>> -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> >>> -			d->acc100_conf.q_dl_4g.num_qgroups +
> >>> -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> >>> +	/* Expose number of queues */
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
> >>> +d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> >>>    			d->acc100_conf.q_ul_4g.num_qgroups;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d-
> >>> acc100_conf.q_dl_4g.num_aqs_per_groups *
> >>> +			d->acc100_conf.q_dl_4g.num_qgroups;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d-
> >>> acc100_conf.q_ul_5g.num_aqs_per_groups *
> >>> +			d->acc100_conf.q_ul_5g.num_qgroups;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d-
> >>> acc100_conf.q_dl_5g.num_aqs_per_groups *
> >>> +			d->acc100_conf.q_dl_5g.num_qgroups;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d-
> >>> acc100_conf.q_ul_4g.num_qgroups;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d-
> >>> acc100_conf.q_dl_4g.num_qgroups;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d-
> >>> acc100_conf.q_ul_5g.num_qgroups;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d-
> >>> acc100_conf.q_dl_5g.num_qgroups;
> >>> +	dev_info->max_num_queues = 0;
> >>> +	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC;
> >> i++)
> >>
> >> should this be i <=  ?
> >>
> > Thanks
> >
> >>> +		dev_info->max_num_queues += dev_info->num_queues[i];
> >>>    	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> >>>    	dev_info->hardware_accelerated = true;
> >>>    	dev_info->max_dl_queue_priority = diff --git
> >>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> index 57b12af..b4982af 100644
> >>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> @@ -379,6 +379,14 @@
> >>>    		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >>>    			dev_info->max_num_queues++;
> >>>    	}
> >>> +	/* Expose number of queue per operation type */
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info-
> >>> max_num_queues / 2;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info-
> >>> max_num_queues / 2;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >>>    }
> >>>
> >>>    /**
> >>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> index 2a330c4..dc7f479 100644
> >>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
> >>>    		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >>>    			dev_info->max_num_queues++;
> >>>    	}
> >>> +	/* Expose number of queue per operation type */
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info-
> >>> max_num_queues / 2;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info-
> >>> max_num_queues / 2;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
> >>>    }
> >>>
> >>>    /**
> >>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> index c1f88c6..e99ea9a 100644
> >>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
> >>>    	dev_info->min_alignment = 64;
> >>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
> >> LA12XX_MAX_QUEUES / 2;
> >>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
> >> LA12XX_MAX_QUEUES / 2;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> >>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >>>    	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
> >>>    }
> >>>
> >>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> index dbc5524..647e706 100644
> >>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> @@ -256,6 +256,17 @@ struct turbo_sw_queue {
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>    	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>> +	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
> >> Should this be done through dev instead of assigning directly ?
> > I am not sure I follow your suggestion. Do you mind clarifying?
> 
> bbdev_capabilites is a const defined in this function, do you really need to loop
> over it to find information that is constant ?

I still miss your point. Note that this constant is not always the same at build time (based on what SDK it can links to).
What would suggest?

Thanks
Nic


> 
> Tom
> 
> >
> >> Tom
> >>
> >>> +	int num_op_type = 0;
> >>> +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> >>> +		num_op_type++;
> >>> +	op_cap = bbdev_capabilities;
> >>> +	if (num_op_type > 0) {
> >>> +		int num_queue_per_type = dev_info->max_num_queues /
> >> num_op_type;
> >>> +		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> >>> +			dev_info->num_queues[op_cap->type] =
> >> num_queue_per_type;
> >>> +	}
> >>> +
> >>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >>> dev_id);
> >>>    }
> >>>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 3/7] bbdev: add device info on queue topology
  2022-07-07 17:13                 ` Chautru, Nicolas
@ 2022-07-18 13:04                   ` Tom Rix
  0 siblings, 0 replies; 174+ messages in thread
From: Tom Rix @ 2022-07-18 13:04 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/7/22 10:13 AM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Thursday, July 7, 2022 6:34 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
>>
>>
>> On 7/6/22 2:12 PM, Chautru, Nicolas wrote:
>>> Hi Tom,
>>>
>>>> -----Original Message-----
>>>> From: Tom Rix <trix@redhat.com>
>>>> Subject: Re: [PATCH v4 3/7] bbdev: add device info on queue topology
>>>>
>>>>
>>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>>>> Adding more options in the API to expose the number of queues
>>>>> exposed and related priority.
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>     lib/bbdev/rte_bbdev.h | 4 ++++
>>>>>     1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>> 9b1ffa4..ac941d6 100644
>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>> @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
>>>>>
>>>>>     	/** Maximum number of queues supported by the device */
>>>>>     	unsigned int max_num_queues;
>>>>> +	/** Maximum number of queues supported per operation type */
>>>>> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>>>>> +	/** Priority level supported per operation type */
>>>>> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>>>> It is better to add new elements to the end of a structure for better
>>>> backward compatibility
>>> All that serie is not ABI compatible (sizes change etc...). I don’t believe there
>> is such a recommendation, is there?
>>
>> Depends on what users expect, a dynamically linked old application would at
>> best core here.  If the elements were added to the end, yes the size would
>> change but the old dynamically linked application would not use
>> them.  Dynamically linking is nice because problems in the library can be fixed
>> and shipped without forcing the user recompile.  Though the user may not
>> realize  it, this change forces them to recompile.
>>
>> Tom
> Thanks Tom. In that very context, the change are big enough not to have any form of compatibility. This a new ABI version, and user knows they will have to recompile.
> Still it would be great to capture a recommendation in DPDK coding guideline in case there is such a BKM, I have heard multiple arguments for different preference, if we want to harmonize such things let's capture in coding guide lines, it would not hurt. Maybe one for Thomas?

When sw is deployed, how would a user know ?

For that matter, how would a developer know without a deep reading of 
header files ?

I am not asking for a compatibility testsuite here, just the placement 
of new elements (the same code) at the end of structures.  As a library 
writer, please consider the users of the library.  Your improvements are 
amplified by all of the library's users.  The user's code quality is 
based on this library's code quality.

My expectation is a new ABI introduces new functionality without 
breaking old binaries. Or if it does, it is for a good reason.

There is no good reason for putting new elements into the middle of an 
existing structure.

Tom

>
>>>> Tom
>>>>
>>>>>     	/** Queue size limit (queue size must also be power of 2) */
>>>>>     	uint32_t queue_size_lim;
>>>>>     	/** Set if device off-loads operation to hardware  */


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 2/7] bbdev: add device status info
  2022-07-07 17:15                 ` Chautru, Nicolas
@ 2022-07-18 13:09                   ` Tom Rix
  0 siblings, 0 replies; 174+ messages in thread
From: Tom Rix @ 2022-07-18 13:09 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/7/22 10:15 AM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Thursday, July 7, 2022 6:37 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
>>
>>
>> On 7/6/22 2:16 PM, Chautru, Nicolas wrote:
>>>> -----Original Message-----
>>>> From: Tom Rix <trix@redhat.com>
>>>> Subject: Re: [PATCH v4 2/7] bbdev: add device status info
>>>>
>>>>
>>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>>>> Added device status information, so that the PMD can expose
>>>>> information related to the underlying accelerator device status.
>>>>> Minor order change in structure to fit into padding hole.
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>     drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>>>>>     drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>>>>>     drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>>>>>     drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>>>>>     drivers/baseband/null/bbdev_null.c                 |  1 +
>>>>>     drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>>>>>     lib/bbdev/rte_bbdev.c                              | 24 +++++++++++++++
>>>>>     lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>>>>>     lib/bbdev/version.map                              |  6 ++++
>>>>>     9 files changed, 69 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> index de7e4bc..17ba798 100644
>>>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> @@ -1060,6 +1060,7 @@
>>>>>
>>>>>     	/* Read and save the populated config from ACC100 registers */
>>>>>     	fetch_acc100_config(dev);
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* This isn't ideal because it reports the maximum number of
>>>>> queues
>>>> but
>>>>>     	 * does not provide info on how many can be uplink/downlink or
>>>>> different diff --git
>>>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> index 82ae6ba..57b12af 100644
>>>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> @@ -369,6 +369,7 @@
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* Calculates number of queues assigned to device */
>>>>>     	dev_info->max_num_queues = 0;
>>>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> index 21d3529..2a330c4 100644
>>>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* Calculates number of queues assigned to device */
>>>>>     	dev_info->max_num_queues = 0;
>>>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> index 4d1bd16..c1f88c6 100644
>>>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->min_alignment = 64;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>>>     }
>>>>> diff --git a/drivers/baseband/null/bbdev_null.c
>>>>> b/drivers/baseband/null/bbdev_null.c
>>>>> index 248e129..94a1976 100644
>>>>> --- a/drivers/baseband/null/bbdev_null.c
>>>>> +++ b/drivers/baseband/null/bbdev_null.c
>>>>> @@ -82,6 +82,7 @@ struct bbdev_queue {
>>>>>     	 * here for code completeness.
>>>>>     	 */
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>>>     }
>>>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> index af7bc41..dbc5524 100644
>>>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>>>>>     	dev_info->min_alignment = 64;
>>>>>     	dev_info->harq_buffer_size = 0;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>>>> dev_id);
>>>>>     }
>>>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>>>> 22bd894..555bda9 100644
>>>>> --- a/lib/bbdev/rte_bbdev.c
>>>>> +++ b/lib/bbdev/rte_bbdev.c
>>>>> @@ -25,6 +25,8 @@
>>>>>
>>>>>     /* Number of supported operation types */
>>>>>     #define BBDEV_OP_TYPE_COUNT 5
>>>>> +/* Number of supported device status */ #define
>>>>> +BBDEV_DEV_STATUS_COUNT 9
>>>>>
>>>>>     /* BBDev library logging ID */
>>>>>     RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE); @@ -1132,3
>>>> +1134,25
>>>>> @@ struct rte_mempool *
>>>>>     	rte_bbdev_log(ERR, "Invalid operation type");
>>>>>     	return NULL;
>>>>>     }
>>>>> +
>>>>> +const char *
>>>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
>>>>> +	static const char * const dev_sta_string[] = {
>>>>> +		"RTE_BBDEV_DEV_NOSTATUS",
>>>>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
>>>>> +		"RTE_BBDEV_DEV_RESET",
>>>>> +		"RTE_BBDEV_DEV_CONFIGURED",
>>>>> +		"RTE_BBDEV_DEV_ACTIVE",
>>>>> +		"RTE_BBDEV_DEV_FATAL_ERR",
>>>>> +		"RTE_BBDEV_DEV_RESTART_REQ",
>>>>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
>>>>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
>>>>> +	};
>>>>> +
>>>>> +	if (status < BBDEV_DEV_STATUS_COUNT)
>>>>> +		return dev_sta_string[status];
>>>>> +
>>>>> +	rte_bbdev_log(ERR, "Invalid device status");
>>>>> +	return NULL;
>>>>> +}
>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>> b88c881..9b1ffa4 100644
>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>>>>>     int
>>>>>     rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>>>>>
>>>>> +/**
>>>>> + * Flags indicate the status of the device  */ enum
>>>>> +rte_bbdev_device_status {
>>>>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
>>>>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
>>>> supported on the PMD */
>>>> If this was 0, you may not need to explicitly set.
>>> This helps to have the lack of status being equivalent to a cleared register.
>> NOSTATUS is fine, just change
>>
>> NOT_SUPPORTED = 0,
> Let me rephrase. Currently RTE_BBDEV_DEV_NOSTATUS is zero explicitly which can be valuable to match a clear register.
> RTE_BBDEV_DEV_NOT_SUPPORTED would not be zero.
> Are you suggesting I should put explictly a RTE_BBDEV_DEV_NOSTATUS = 0? Isn't it implicit for any compiler that the first enum starts from zero?

However you want to do it, try taking advantage of zero-ed memory.  By 
choosing for this enum to be non-zero, it has to be explicitly set. If 
it was 0 it would be implicitly set, assuming dev is zero-ed.

Tom

>
>> Tom
>>
>>>>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured
>>>> state */
>>>>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
>>>> ready to use */
>>>>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
>>>> being used */
>>>>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
>>>> uncorrectable error */
>>>>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to
>>>> restart */
>>>>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application
>>>> to reconfigure queues */
>>>>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
>>>> error event happened */
>>>> Last patch was padded, do something consistent here.
>>> We only pad if we have to. Here there is no array whose size would be
>> dimensioned by the size of that enum.
>>>>> +};
>>>>> +
>>>>>     /** Device statistics. */
>>>>>     struct rte_bbdev_stats {
>>>>>     	uint64_t enqueued_count;  /**< Count of all operations enqueued
>>>>> */ @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>>>>>     	/** Set if device supports per-queue interrupts */
>>>>>     	bool queue_intr_supported;
>>>>>     	/** Minimum alignment of buffers, in bytes */
>>>>> -	uint16_t min_alignment;
>>>>> -	/** HARQ memory available in kB */
>>>>> +	/** Device Status */
>>>>> +	enum rte_bbdev_device_status device_status;
>>>> New elements should be added to the end to improve backward
>> compatibility.
>>> Same comment in different patch. I would like to know if there is a real
>> recommendation from DPDK on this. I have heard opposite view as well.
>>> In that very case we are breaking the ABI in that new serie for 22.11 (sizes
>> and offsets are changing).
>>>> Tom
>>>>
>>>>>     	uint32_t harq_buffer_size;
>>>>>     	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
>>>> supported
>>>>>     	 *  for input/output data
>>>>>     	 */
>>>>> +	uint16_t min_alignment;
>>>>> +	/** HARQ memory available in kB */
>>>>>     	uint8_t data_endianness;
>>>>>     	/** Default queue configuration used if none is supplied  */
>>>>>     	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
>>>> +844,20
>>>>> @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
>>>>>     rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int
>>>>> epfd, int
>>>> op,
>>>>>     		void *data);
>>>>>
>>>>> +/**
>>>>> + * Converts device status from enum to string
>>>>> + *
>>>>> + * @param status
>>>>> + *   Device status as enum
>>>>> + *
>>>>> + * @returns
>>>>> + *   Operation type as string or NULL if op_type is invalid
>>>>> + *
>>>>> + */
>>>>> +__rte_experimental
>>>>> +const char*
>>>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
>>>>> +
>>>>>     #ifdef __cplusplus
>>>>>     }
>>>>>     #endif
>>>>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
>>>>> cce3f3c..9ac3643 100644
>>>>> --- a/lib/bbdev/version.map
>>>>> +++ b/lib/bbdev/version.map
>>>>> @@ -39,3 +39,9 @@ DPDK_22 {
>>>>>
>>>>>     	local: *;
>>>>>     };
>>>>> +
>>>>> +EXPERIMENTAL {
>>>>> +	global:
>>>>> +
>>>>> +	rte_bbdev_device_status_str;
>>>>> +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-07 17:19                 ` Chautru, Nicolas
@ 2022-07-18 13:21                   ` Tom Rix
  2022-08-15 17:28                     ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Tom Rix @ 2022-07-18 13:21 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/7/22 10:19 AM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Thursday, July 7, 2022 6:21 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue
>> per operation
>>
>>
>> On 7/6/22 2:10 PM, Chautru, Nicolas wrote:
>>> Hi Tom,
>>>
>>>> -----Original Message-----
>>>> From: Tom Rix <trix@redhat.com>
>>>> Sent: Wednesday, July 6, 2022 9:15 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>>>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>>>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>>>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>>>> stephen@networkplumber.org
>>>> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose
>>>> queue per operation
>>>>
>>>>
>>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>>>> Add support in existing bbdev PMDs for the explicit number of queue
>>>>> and priority for each operation type configured on the device.
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>     drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++--
>> ----
>>>> ---
>>>>>     drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
>>>>>     drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
>>>>>     drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
>>>>>     drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
>>>>>     5 files changed, 51 insertions(+), 12 deletions(-)
>>>>>
>>>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> index 17ba798..d568d0d 100644
>>>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> @@ -966,6 +966,7 @@
>>>>>     		struct rte_bbdev_driver_info *dev_info)
>>>>>     {
>>>>>     	struct acc100_device *d = dev->data->dev_private;
>>>>> +	int i;
>>>>>
>>>>>     	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>>>>>     		{
>>>>> @@ -1062,19 +1063,23 @@
>>>>>     	fetch_acc100_config(dev);
>>>>>     	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>> -	/* This isn't ideal because it reports the maximum number of queues
>>>> but
>>>>> -	 * does not provide info on how many can be uplink/downlink or
>>>> different
>>>>> -	 * priorities
>>>>> -	 */
>>>>> -	dev_info->max_num_queues =
>>>>> -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
>>>>> -			d->acc100_conf.q_dl_5g.num_qgroups +
>>>>> -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
>>>>> -			d->acc100_conf.q_ul_5g.num_qgroups +
>>>>> -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
>>>>> -			d->acc100_conf.q_dl_4g.num_qgroups +
>>>>> -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>>>>> +	/* Expose number of queues */
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
>>>>> +d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>>>>>     			d->acc100_conf.q_ul_4g.num_qgroups;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d-
>>>>> acc100_conf.q_dl_4g.num_aqs_per_groups *
>>>>> +			d->acc100_conf.q_dl_4g.num_qgroups;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d-
>>>>> acc100_conf.q_ul_5g.num_aqs_per_groups *
>>>>> +			d->acc100_conf.q_ul_5g.num_qgroups;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d-
>>>>> acc100_conf.q_dl_5g.num_aqs_per_groups *
>>>>> +			d->acc100_conf.q_dl_5g.num_qgroups;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d-
>>>>> acc100_conf.q_ul_4g.num_qgroups;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d-
>>>>> acc100_conf.q_dl_4g.num_qgroups;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d-
>>>>> acc100_conf.q_ul_5g.num_qgroups;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d-
>>>>> acc100_conf.q_dl_5g.num_qgroups;
>>>>> +	dev_info->max_num_queues = 0;
>>>>> +	for (i = RTE_BBDEV_OP_TURBO_DEC; i < RTE_BBDEV_OP_LDPC_ENC;
>>>> i++)
>>>>
>>>> should this be i <=  ?
>>>>
>>> Thanks
>>>
>>>>> +		dev_info->max_num_queues += dev_info->num_queues[i];
>>>>>     	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
>>>>>     	dev_info->hardware_accelerated = true;
>>>>>     	dev_info->max_dl_queue_priority = diff --git
>>>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> index 57b12af..b4982af 100644
>>>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> @@ -379,6 +379,14 @@
>>>>>     		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>>>>>     			dev_info->max_num_queues++;
>>>>>     	}
>>>>> +	/* Expose number of queue per operation type */
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info-
>>>>> max_num_queues / 2;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info-
>>>>> max_num_queues / 2;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>>>>>     }
>>>>>
>>>>>     /**
>>>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> index 2a330c4..dc7f479 100644
>>>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
>>>>>     		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
>>>>>     			dev_info->max_num_queues++;
>>>>>     	}
>>>>> +	/* Expose number of queue per operation type */
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info-
>>>>> max_num_queues / 2;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info-
>>>>> max_num_queues / 2;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
>>>>>     }
>>>>>
>>>>>     /**
>>>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> index c1f88c6..e99ea9a 100644
>>>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
>>>>>     	dev_info->min_alignment = 64;
>>>>>     	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
>>>> LA12XX_MAX_QUEUES / 2;
>>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
>>>> LA12XX_MAX_QUEUES / 2;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
>>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
>>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>>>>>     }
>>>>>
>>>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> index dbc5524..647e706 100644
>>>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> @@ -256,6 +256,17 @@ struct turbo_sw_queue {
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>>     	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>> +	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
>>>> Should this be done through dev instead of assigning directly ?
>>> I am not sure I follow your suggestion. Do you mind clarifying?
>> bbdev_capabilites is a const defined in this function, do you really need to loop
>> over it to find information that is constant ?
> I still miss your point. Note that this constant is not always the same at build time (based on what SDK it can links to).
> What would suggest?

Operations that can be done at compile time, should be.  Useless there 
is a good reason.

You need to provide a good reason or make the change.

Tom

>
> Thanks
> Nic
>
>
>> Tom
>>
>>>> Tom
>>>>
>>>>> +	int num_op_type = 0;
>>>>> +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
>>>>> +		num_op_type++;
>>>>> +	op_cap = bbdev_capabilities;
>>>>> +	if (num_op_type > 0) {
>>>>> +		int num_queue_per_type = dev_info->max_num_queues /
>>>> num_op_type;
>>>>> +		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
>>>>> +			dev_info->num_queues[op_cap->type] =
>>>> num_queue_per_type;
>>>>> +	}
>>>>> +
>>>>>     	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>>>> dev_id);
>>>>>     }
>>>>>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 5/7] bbdev: add new operation for FFT processing
  2022-07-07 16:57                 ` Chautru, Nicolas
@ 2022-07-18 22:38                   ` Tom Rix
  0 siblings, 0 replies; 174+ messages in thread
From: Tom Rix @ 2022-07-18 22:38 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen


On 7/7/22 9:57 AM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>>
>> Nic,
>>
>> Not all my comments were addressed.
>>
>> The one I am most interested in is the default type / size and how it interacts
>> with fp16.
> My bad, I had replied to all that (and fixed some of them in the new version) but I must have NOT sent the latest draft of that email by mistake. Let me go through it again below and let me know if unclear.

This seems like a pattern.

In the future address all the issues raised in the review in the first 
response.

Dribbling out responses looses the cohesion of the review.  No reviewer 
has the time so chase down this-or-that point when the responses and 
changes are spread out over multiple reviews.

>
>> Please see the others below
>>
>> On 7/6/22 2:04 PM, Chautru, Nicolas wrote:
>>> Hi Tom,
>>>
>>>> -----Original Message-----
>>>> From: Tom Rix <trix@redhat.com>>
>>>>
>>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>>>> Extension of bbdev operation to support FFT based operations.
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
>>>>> ---
>>>>>     doc/guides/prog_guide/bbdev.rst | 130
>>>> +++++++++++++++++++++++++++++++++++
>>>>>     lib/bbdev/rte_bbdev.c           |  11 ++-
>>>>>     lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
>>>>>     lib/bbdev/rte_bbdev_op.h        | 149
>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>>     lib/bbdev/version.map           |   4 ++
>>>>>     5 files changed, 369 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/doc/guides/prog_guide/bbdev.rst
>>>>> b/doc/guides/prog_guide/bbdev.rst index 70fa01a..4a055b5 100644
>>>>> --- a/doc/guides/prog_guide/bbdev.rst
>>>>> +++ b/doc/guides/prog_guide/bbdev.rst
>>>>> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode`
>> above
>>>>>     showing the Turbo decoding of CBs using BBDEV interface in TB-mode
>>>>>     is also valid for LDPC decode.
>>>>>
>>>>> +BBDEV FFT Operation
>>>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> +
>>>>> +This operation allows to run a combination of DFT and/or IDFT
>>>>> +and/or time-
>>>> domain windowing.
>>>>> +These can be used in a modular fashion (using bypass modes) or as a
>>>>> +processing pipeline which can be used for FFT-based baseband signal
>>>> processing.
>>>>> +In more details it allows :
>>>>> +- to process the data first through an IDFT of adjustable size and
>>>>> +padding;
>>>>> +- to perform the windowing as a programmable cyclic shift offset of
>>>>> +the data followed by a pointwise multiplication by a time domain
>>>>> +window;
>>>>> +- to process the related data through a DFT of adjustable size and
>>>>> +depadding for each such cyclic shift output.
>>>>> +
>>>>> +A flexible number of Rx antennas are being processed in parallel
>>>>> +with the
>>>> same configuration.
>>>>> +The API allows more generally for flexibility in what the PMD may
>>>>> +support (cabability flags) and flexibility to adjust some of the
>>>>> +parameters of
>>>> the processing.
>>>>> +
>>>>> +The operation/capability flags that can be set for each FFT
>>>>> +operation are
>>>> given below.
>>>>> +
>>>>> +  **NOTE:** The actual operation flags that may be used with a
>>>>> + specific  BBDEV PMD are dependent on the driver capabilities as
>>>>> + reported via  ``rte_bbdev_info_get()``, and may be a subset of those
>> below.
>>>>> +
>>>>> ++--------------------------------------------------------------------+
>>>>> +|Description of FFT capability flags                                 |
>>>>>
>> ++===============================================================
>>>> =====
>>>>> +++
>>>>> +|RTE_BBDEV_FFT_WINDOWING                                             |
>>>>> +| Set to enable/support windowing in time domain                     |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
>>>>> +| Set to enable/support  the cyclic shift time offset adjustment     |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_DFT_BYPASS                                            |
>>>>> +| Set to bypass the DFT and use directly the IDFT as an option       |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
>>>>> +| Set to bypass the IDFT and use directly the DFT as an option       |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
>>>>> +| Set to bypass the time domain windowing  as an option              |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_POWER_MEAS
>>>> Other flags are not truncated, should be
>>>>
>>>> RTE_BBDEV_FFT_POWER_MEASUREMENT
>>>>
>>> The intention from DPDK recommendation is for these to be kept
>> shortnames, isn't it?
>>> Above we use many acronyms to keep it short (CS, etc...) Even in
>>> current BBDEV API we use many truncation to keep names short: OUT,
>> ENC/DEC, HQ, RM on top of acronyms.
>>> I believe this is still super explicit with that name?
>> Some of other identifier have longer names than this.
>>
>> If you wanted to keep things short, drop the last _<word>
>>
>> Generally the use of acronyms should be avoided because they add a layer of
>> jargon that makes the code less readable to all but writer.
> To be totally honest usage for acronym is ubiquitous in such L1 signal
> processing (and captured in 3GPP specs explicitly, everyone knows what HARQ or LDPC or FFT stands for, etc...).
> I believe this is currently striking the right balance in
> being explicit to developers familiar with related processing while not being unduly long names which create mess when trying to fit to 100 cols.
This is bike shedding. so dropping
>
>>
>>>>>                                                |
>>>>> +| Set to provide an optional power measument of the DFT output       |
>>>>> ++--------------------------------------------------------------------+
>>>> measurement
>>> OK Thanks
>>>
>>>>> +|RTE_BBDEV_FFT_FP16_INPUT                                            |
>>>>> +| Set if the input data shall use FP16 format instead of INT16       |
>>>>> ++--------------------------------------------------------------------+
>>>>> +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
>>>>> +| Set if the output data shall use FP16 format instead of INT16      |
>>>>> ++--------------------------------------------------------------------+
>>>>> +
>>>>> +The structure passed for each FFT operation is given below, with the
>>>>> +operation flags forming a bitmask in the ``op_flags`` field.
>>>>> +
>>>>> +.. code-block:: c
>>>>> +
>>>>> +    struct rte_bbdev_op_fft {
>>>>> +        struct rte_bbdev_op_data base_input;
>>>>> +        struct rte_bbdev_op_data base_output;
>>>>> +        struct rte_bbdev_op_data power_meas_output;
>>>> similar to above, meas -> measurement
>>> See above. Would that really help? I don’t believe there can be any
>> confusion.
>>
>> Naming is hard.
>>
>> How about dropping the _meas_ and go with power_output
> I agree that naming can be tricky. But in that case I believe this is the right balance as mentioned above.
>
>>>>> +        uint32_t op_flags;
>>>>> +        uint16_t input_sequence_size;
>>>> Could these be future proofed by increasing small int size's to uint32_t ?
>>> It is not possible to be that big for any signal processing relevant to that
>> operation.
>>>>> +        uint16_t input_leading_padding;
>>>>> +        uint16_t output_sequence_size;
>>>>> +        uint16_t output_leading_depadding;
>>>>> +        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>>>> +        uint16_t cs_bitmap;
>>>>> +        uint8_t num_antennas_log2;
>>>>> +        uint8_t idft_log2;
>>>>> +        uint8_t dft_log2;
>>>> is _log2 needed in variable name if it is documenation ?
>>> I believe it is a best practice when the variable name may be misleading, ie.
>> this is not the actual dft size as a natural number (2048 for instance) but there
>> is an implied mapping.
>>>>> +        int8_t cs_time_adjustment;
>>>>> +        int8_t idft_shift;
>>>>> +        int8_t dft_shift;
>>>>> +        uint16_t ncs_reciprocal;
>>>>> +        uint16_t power_shift;
>>>>> +        uint16_t fp16_exp_adjust;
>>>>> +    };
>>>>> +
>>>>> +The FFT parameters are set out in the table below.
>>>>> +
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|Parameter             |Description                                                   |
>>>>>
>> ++======================+========================================
>>>> =====
>>>>> ++=================+
>>>>> +|base_input            |input data                                                    |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|base_output           |output data                                                   |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|power_meas_output     |optional output data with power measurement
>>>> on DFT output     |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|op_flags              |bitmask of all active operation capabilities                  |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|input_sequence_size   |size of the input sequence in 32-bits points per
>>>> antenna      |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|input_leading_padding |number of points padded at the start of input
>>>> data            |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|output_sequence_size  |size of the output sequence per antenna and
>>>> cyclic shift      |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|output_depadding      |number of points depadded at the start of
>> output
>>>> data         |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>> output_leading_depadding
>>> OK Thanks
>>>
>>>>> +|window_index          |optional windowing profile index used for each
>> cyclic
>>>> shift   |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|cs_bitmap             |bitmap of the cyclic shift output requested (LSB for
>>>> index 0) |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|num_antennas_log2     |number of antennas as a log2 (10 maps to
>> 1024...)
>>>> |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|idft_log2             |iDFT size as a log2                                           |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|dft_log2              |DFT size as a log2                                            |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|cs_time_adjustment    |adjustment of time position of all the cyclic shift
>>>> output    |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|idft_shift            |shift down of signal level post iDFT                          |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|dft_shift             |shift down of signal level post DFT                           |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|ncs_reciprocal        |inverse of max number of CS normalized to 15b (ie.
>>>> 231 for 12)|
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|power_shift           |shift down of level of power measurement when
>>>> enabled         |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +|fp16_exp_adjust       |value added to FP16 exponent at conversion from
>>>> INT16         |
>>>>> ++----------------------+--------------------------------------------------------------+
>>>>> +
>>>>> +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is
>>>>> +the incoming data for the processing. Its size may not fit into an
>>>>> +actual mbuf, but the structure is used to pass iova address.
>>>>> +The mbuf output ``output`` is mandatory and is output of the FFT
>>>> processing chain.
>>>>> +Each point is a complex number of 32bits : either as 2 INT16 or as 2
>>>>> +FP16 based when the option supported.
>>>>> +The data layout is based on contiguous concatenation of output data
>>>>> +first by cyclic shift then by antenna.
>>>>>
>>>>>     Sample code
>>>>>     -----------
>>>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>>>> 555bda9..28b105d 100644
>>>>> --- a/lib/bbdev/rte_bbdev.c
>>>>> +++ b/lib/bbdev/rte_bbdev.c
>>>>> @@ -24,7 +24,7 @@
>>>>>     #define DEV_NAME "BBDEV"
>>>>>
>>>>>     /* Number of supported operation types */ -#define
>>>>> BBDEV_OP_TYPE_COUNT 5
>>>>> +#define BBDEV_OP_TYPE_COUNT 6
>>>>>     /* Number of supported device status */
>>>>>     #define BBDEV_DEV_STATUS_COUNT 9
>>>>>
>>>>> @@ -854,6 +854,9 @@ struct rte_bbdev *
>>>>>     	case RTE_BBDEV_OP_LDPC_ENC:
>>>>>     		result = sizeof(struct rte_bbdev_enc_op);
>>>>>     		break;
>>>>> +	case RTE_BBDEV_OP_FFT:
>>>>> +		result = sizeof(struct rte_bbdev_fft_op);
>>>>> +		break;
>>>>>     	default:
>>>>>     		break;
>>>>>     	}
>>>>> @@ -877,6 +880,10 @@ struct rte_bbdev *
>>>>>     		struct rte_bbdev_enc_op *op = element;
>>>>>     		memset(op, 0, mempool->elt_size);
>>>>>     		op->mempool = mempool;
>>>>> +	} else if (type == RTE_BBDEV_OP_FFT) {
>>>>> +		struct rte_bbdev_fft_op *op = element;
>>>>> +		memset(op, 0, mempool->elt_size);
>>>>> +		op->mempool = mempool;
>>>>>     	}
>>>>>     }
>>>>>
>>>>> @@ -1126,6 +1133,8 @@ struct rte_mempool *
>>>>>     		"RTE_BBDEV_OP_TURBO_DEC",
>>>>>     		"RTE_BBDEV_OP_TURBO_ENC",
>>>>>     		"RTE_BBDEV_OP_LDPC_DEC",
>>>>> +		"RTE_BBDEV_OP_LDPC_ENC",
>>>> Why ldpc_enc line, this is already in codebase ?
>>>>> +		"RTE_BBDEV_OP_FFT",
>>> Thanks, there this is a rebase issue in previous commit
>>>
>>>
>>>>>     	};
>>>>>
>>>>>     	if (op_type < BBDEV_OP_TYPE_COUNT)
>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>> ac941d6..ed528b8 100644
>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>> @@ -401,6 +401,12 @@ typedef uint16_t
>>>> (*rte_bbdev_enqueue_dec_ops_t)(
>>>>>     		struct rte_bbdev_dec_op **ops,
>>>>>     		uint16_t num);
>>>>>
>>>>> +/** @internal Enqueue fft operations for processing on queue of a
>>>>> +device. */ typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
>>>>> +		struct rte_bbdev_queue_data *q_data,
>>>>> +		struct rte_bbdev_fft_op **ops,
>>>>> +		uint16_t num);
>>>>> +
>>>>>     /** @internal Dequeue encode operations from a queue of a device. */
>>>>>     typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
>>>>>     		struct rte_bbdev_queue_data *q_data, @@ -411,6 +417,11
>>>> @@ typedef
>>>>> uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
>>>>>     		struct rte_bbdev_queue_data *q_data,
>>>>>     		struct rte_bbdev_dec_op **ops, uint16_t num);
>>>>>
>>>>> +/** @internal Dequeue fft operations from a queue of a device. */
>>>>> +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
>>>>> +		struct rte_bbdev_queue_data *q_data,
>>>>> +		struct rte_bbdev_fft_op **ops, uint16_t num);
>>>>> +
>>>>>     #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device
>> name
>>>>> */
>>>>>
>>>>>     /**
>>>>> @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
>>>>>     	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
>>>>>     	/** Dequeue decode function */
>>>>>     	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
>>>>> +	/** Enqueue FFT function */
>>>>> +	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
>>>>> +	/** Dequeue FFT function */
>>>>> +	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
>>>>>     	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by
>>>> PMD */
>>>>>     	struct rte_bbdev_data *data;  /**< Pointer to device data */
>>>>>     	enum rte_bbdev_state state;  /**< If device is currently used or
>>>>> not */ @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
>>>>>     	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
>>>>>     }
>>>>>
>>>>> +/**
>>>>> + * Enqueue a burst of fft operations to a queue of the device.
>>>>> + * This functions only enqueues as many operations as currently
>>>>> +possible and
>>>>> + * does not block until @p num_ops entries in the queue are available.
>>>>> + * This function does not provide any error notification to avoid the
>>>>> + * corresponding overhead.
>>>>> + *
>>>>> + * @param dev_id
>>>>> + *   The identifier of the device.
>>>>> + * @param queue_id
>>>>> + *   The index of the queue.
>>>>> + * @param ops
>>>>> + *   Pointer array containing operations to be enqueued Must have at
>> least
>>>>> + *   @p num_ops entries
>>>>> + * @param num_ops
>>>>> + *   The maximum number of operations to enqueue.
>>>>> + *
>>>>> + * @return
>>>>> + *   The number of operations actually enqueued (this is the number of
>>>> processed
>>>>> + *   entries in the @p ops array).
>>>>> + */
>>>>> +__rte_experimental
>>>>> +static inline uint16_t
>>>>> +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
>>>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>>>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
>>>> Who checks the input is valid ?
>> Who checks the input is valid ?
>>
> This is not specific to that commit but to any operation. This is there for years and see the comment above
>   * This function does not provide any error notification to avoid the
>   * corresponding overhead.

No input checking may be ok when an api is experimental, so fine for 
years gone by.

But for a production library, not checking input leads to crashes in 
production.

crashes are a security problem, so not checking inputs is a security 
problem, cwe-20

https://cwe.mitre.org/data/definitions/20.html

Tom

>
>>>>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
>>>>> queues[queue_id];
>>>>> +	return dev->enqueue_fft_ops(q_data, ops, num_ops); }
>>>>>
>>>>>     /**
>>>>>      * Dequeue a burst of processed encode operations from a queue of the
>>>> device.
>>>>> @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
>>>>>     	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
>>>>>     }
>>>>>
>>>>> +/**
>>>>> + * Dequeue a burst of fft operations from a queue of the device.
>>>>> + * This functions returns only the current contents of the queue, and
>>>>> +does not
>>>>> + * block until @ num_ops is available.
>>>>> + * This function does not provide any error notification to avoid the
>>>>> + * corresponding overhead.
>>>>> + *
>>>>> + * @param dev_id
>>>>> + *   The identifier of the device.
>>>>> + * @param queue_id
>>>>> + *   The index of the queue.
>>>>> + * @param ops
>>>>> + *   Pointer array where operations will be dequeued to. Must have at
>> least
>>>>> + *   @p num_ops entries
>>>>> + * @param num_ops
>>>>> + *   The maximum number of operations to dequeue.
>>>>> + *
>>>>> + * @return
>>>>> + *   The number of operations actually dequeued (this is the number of
>>>> entries
>>>>> + *   copied into the @p ops array).
>>>>> + */
>>>>> +__rte_experimental
>>>>> +static inline uint16_t
>>>>> +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
>>>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>>>> +	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
>>>>> +	struct rte_bbdev_queue_data *q_data = &dev->data-
>>>>> queues[queue_id];
>>>>> +	return dev->dequeue_fft_ops(q_data, ops, num_ops); }
>>>>> +
>>>>>     /** Definitions of device event types */
>>>>>     enum rte_bbdev_event_type {
>>>>>     	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */ diff --git
>>>>> a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index
>>>>> cd82418..3e46f1d 100644
>>>>> --- a/lib/bbdev/rte_bbdev_op.h
>>>>> +++ b/lib/bbdev/rte_bbdev_op.h
>>>>> @@ -47,6 +47,8 @@
>>>>>     #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
>>>>>     /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
>>>>>     #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
>>>>> +/* 12 CS maximum */
>>>>> +#define RTE_BBDEV_MAX_CS_2 (6)
>>>>>
>>>>>     /** Flags for turbo decoder operation and capability structure */
>>>>>     enum rte_bbdev_op_td_flag_bitmasks { @@ -211,6 +213,26 @@ enum
>>>>> rte_bbdev_op_ldpcenc_flag_bitmasks {
>>>>>     	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
>>>>>     };
>>>>>
>>>>> +/** Flags for DFT operation and capability structure */ enum
>>>>> +rte_bbdev_op_fft_flag_bitmasks {
>>>>> +	/** Flexible windowing capability */
>>>>> +	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
>>>>> +	/** Flexible adjustment of Cyclic Shift time offset */
>>>>> +	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
>>>>> +	/** Set for bypass the DFT and get directly into iDFT input */
>>>>> +	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
>>>>> +	/** Set for bypass the IDFT and get directly the DFT output */
>>>>> +	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
>>>>> +	/** Set for bypass time domain windowing */
>>>>> +	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
>>>>> +	/** Set for optional power measurement on DFT output */
>>>>> +	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
>>>> Meas here too, change generally
>>>>> +	/** Set if the input data used FP16 format */
>>>>> +	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
>>>> What are the other data type(s) ?
>>>>
>>>> The default is not mentioned, or i missed it.
>> ?
> Default type is INT16 as captured in doc above
>
> +|RTE_BBDEV_FFT_FP16_INPUT                                            |
> +| Set if the input data shall use FP16 format instead of INT16       |
> ++--------------------------------------------------------------------+
> +|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
> +| Set if the output data shall use FP16 format instead of INT16      |
> ++--------------------------------------------------------------------+
>
>
>
>>>>> +	/**  Set if the output data uses FP16 format  */
>>>>> +	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7) };
>>>>> +
>>>>>     /** Flags for the Code Block/Transport block mode  */
>>>>>     enum rte_bbdev_op_cb_mode {
>>>>>     	/** One operation is one or fraction of one transport block  */ @@
>>>>> -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
>>>>>     	};
>>>>>     };
>>>>>
>>>>> +/** Operation structure for FFT processing.
>>>>> + *
>>>>> + * The operation processes the data for multiple antennas in a single
>>>>> +call
>>>>> + * (.i.e for all the REs belonging to a given SRS sequence for
>>>>> +instance)
>>>>> + *
>>>>> + * The output mbuf data structure is expected to be allocated by the
>>>>> + * application with enough room for the output data.
>>>>> + */
>>>>> +struct rte_bbdev_op_fft {
>>>>> +	/** Input data starting from first antenna */
>>>>> +	struct rte_bbdev_op_data base_input;
>>>>> +	/** Output data starting from first antenna and first cyclic shift */
>>>>> +	struct rte_bbdev_op_data base_output;
>>>>> +	/** Optional power measurement output data */
>>>>> +	struct rte_bbdev_op_data power_meas_output;
>>>>> +	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
>>>>> +	uint32_t op_flags;
>>>>> +	/** Input sequence size in 32-bits points */
>>>>> +	uint16_t input_sequence_size;
>>>> size is bytes*4 ? how does this work with fp16 ?
>> ?
> This is IQ data, hence a complex number using either int16 or fp6 would always be 32 bits.
>
>
>>>>> +	/** Padding at the start of the sequence */
>>>>> +	uint16_t input_leading_padding;
>>>>> +	/** Output sequence size in 32-bits points */
>>>>> +	uint16_t output_sequence_size;
>>>>> +	/** Depadding at the start of the DFT output */
>>>>> +	uint16_t output_leading_depadding;
>>>>> +	/** Window index being used for each cyclic shift output */
>>>>> +	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>>>> +	/** Bitmap of the cyclic shift output requested */
>>>>> +	uint16_t cs_bitmap;
>>>>> +	/** Number of antennas as a log2 – 8 to 128 */
>>>>> +	uint8_t num_antennas_log2;
>>>>> +	/** iDFT size as a log2 - 32 to 2048 */
>>>>> +	uint8_t idft_log2;
>>>>> +	/** DFT size as a log2 - 8 to 2048 */
>>>>> +	uint8_t dft_log2;
>>>>> +	/** Adjustment of position of the cyclic shifts - -31 to 31 */
>>>>> +	int8_t cs_time_adjustment;
>>>>> +	/** iDFT shift down */
>>>>> +	int8_t idft_shift;
>>>>> +	/** DFT shift down */
>>>>> +	int8_t dft_shift;
>>>>> +	/** NCS reciprocal factor  */
>>>>> +	uint16_t ncs_reciprocal;
>>>>> +	/** power measurement out shift down */
>>>>> +	uint16_t power_shift;
>>>>> +	/** Adjust the FP6 exponent for INT<->FP16 conversion */
>>>>> +	uint16_t fp16_exp_adjust;
>>>>> +};
>>>>> +
>>>>>     /** List of the capabilities for the Turbo Decoder */
>>>>>     struct rte_bbdev_op_cap_turbo_dec {
>>>>>     	/** Flags from rte_bbdev_op_td_flag_bitmasks */ @@ -741,6 +812,16
>>>>> @@ struct rte_bbdev_op_cap_ldpc_enc {
>>>>>     	uint16_t num_buffers_dst;
>>>>>     };
>>>>>
>>>>> +/** List of the capabilities for the FFT */ struct
>>>>> +rte_bbdev_op_cap_fft {
>>>>> +	/** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
>>>> you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?
>> ?
> Thanks, fixed in new commit
>
>>>>> +	uint32_t capability_flags;
>>>>> +	/** Num input code block buffers */
>>>>> +	uint16_t num_buffers_src;
>>>>> +	/** Num output code block buffers */
>>>>> +	uint16_t num_buffers_dst;
>>>>> +};
>>>>> +
>>>>>     /** Different operation types supported by the device */
>>>>>     enum rte_bbdev_op_type {
>>>>>     	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
>>>> @@
>>>>> -748,6 +829,7 @@ enum rte_bbdev_op_type {
>>>>>     	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
>>>>>     	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
>>>>>     	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
>>>>> +	RTE_BBDEV_OP_FFT,  /**< FFT */
>>>>>     	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type
>>>> number including padding */
>>>>>     };
>>>>>
>>>>> @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
>>>>>     	};
>>>>>     };
>>>>>
>>>>> +/** Structure specifying a single fft operation */ struct
>>>>> +rte_bbdev_fft_op {
>>>>> +	/** Status of operation that was performed */
>>>>> +	int status;
>>>>> +	/** Mempool which op instance is in */
>>>>> +	struct rte_mempool *mempool;
>>>>> +	/** Opaque pointer for user data */
>>>>> +	void *opaque_data;
>>>>> +	/** Contains turbo decoder specific parameters */
>>>>> +	struct rte_bbdev_op_fft fft;
>>>>> +};
>>>>> +
>>>>>     /** Operation capabilities supported by a device */
>>>>>     struct rte_bbdev_op_cap {
>>>>>     	enum rte_bbdev_op_type type;  /**< Type of operation */ @@ -799,6
>>>>> +893,7 @@ struct rte_bbdev_op_cap {
>>>>>     		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
>>>>>     		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
>>>>>     		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
>>>>> +		struct rte_bbdev_op_cap_fft fft;
>>>>>     	} cap;  /**< Operation-type specific capabilities */
>>>>>     };
>>>>>
>>>>> @@ -918,6 +1013,42 @@ struct rte_mempool *
>>>>>     }
>>>>>
>>>>>     /**
>>>>> + * Bulk allocate fft operations from a mempool with parameter defaults
>>>> reset.
>>>>> + *
>>>>> + * @param mempool
>>>>> + *   Operation mempool, created by rte_bbdev_op_pool_create().
>>>>> + * @param ops
>>>>> + *   Output array to place allocated operations
>>>>> + * @param num_ops
>>>>> + *   Number of operations to allocate
>>>>> + *
>>>>> + * @returns
>>>>> + *   - 0 on success
>>>>> + *   - EINVAL if invalid mempool is provided
>>>>> + */
>>>>> +__rte_experimental
>>>>> +static inline int
>>>>> +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
>>>>> +		struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>>>> +	struct rte_bbdev_op_pool_private *priv;
>>>>> +	int ret;
>>>>> +
>>>>> +	/* Check type */
>>>>> +	priv = (struct rte_bbdev_op_pool_private *)
>>>>> +			rte_mempool_get_priv(mempool);
>>>>> +	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
>>>>> +		return -EINVAL;
>>>>> +
>>>>> +	/* Get elements */
>>>>> +	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
>>>>> +	if (unlikely(ret < 0))
>>>>> +		return ret;
>>>> if-check is not needed, just
>>>>
>>>> return ret;
>>>>
>>>> and drop the next line
>> ?
> Fixed through a new commit in new version
>
>>>> Tom
>>>>
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +/**
>>>>>      * Free decode operation structures that were allocated by
>>>>>      * rte_bbdev_dec_op_alloc_bulk().
>>>>>      * All structures must belong to the same mempool.
>>>>> @@ -951,6 +1082,24 @@ struct rte_mempool *
>>>>>     		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
>>>> num_ops);
>>>>>     }
>>>>>
>>>>> +/**
>>>>> + * Free encode operation structures that were allocated by
>>>>> + * rte_bbdev_fft_op_alloc_bulk().
>>>>> + * All structures must belong to the same mempool.
>>>>> + *
>>>>> + * @param ops
>>>>> + *   Operation structures
>>>>> + * @param num_ops
>>>>> + *   Number of structures
>>>>> + */
>>>>> +__rte_experimental
>>>>> +static inline void
>>>>> +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned
>>>>> +int num_ops) {
>>>>> +	if (num_ops > 0)
>>>>> +		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
>>>> num_ops); }
>>>>> +
>>>>>     #ifdef __cplusplus
>>>>>     }
>>>>>     #endif
>>>>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
>>>>> 9ac3643..efae50b 100644
>>>>> --- a/lib/bbdev/version.map
>>>>> +++ b/lib/bbdev/version.map
>>>>> @@ -44,4 +44,8 @@ EXPERIMENTAL {
>>>>>     	global:
>>>>>
>>>>>     	rte_bbdev_device_status_str;
>>>>> +	rte_bbdev_enqueue_fft_ops;
>>>>> +	rte_bbdev_dequeue_fft_ops;
>>>>> +	rte_bbdev_fft_op_alloc_bulk;
>>>>> +	rte_bbdev_fft_op_free_bulk;
>>>>>     };


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-07-18 13:21                   ` Tom Rix
@ 2022-08-15 17:28                     ` Chautru, Nicolas
  0 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-15 17:28 UTC (permalink / raw)
  To: Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, mdr, Richardson, Bruce, david.marchand, stephen

Hi Tom, 

Back from time off, replying to that previous email.

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Monday, July 18, 2022 6:21 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose queue
> per operation
> 
> 
> On 7/7/22 10:19 AM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Sent: Thursday, July 7, 2022 6:21 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> >> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >> stephen@networkplumber.org
> >> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose
> >> queue per operation
> >>
> >>
> >> On 7/6/22 2:10 PM, Chautru, Nicolas wrote:
> >>> Hi Tom,
> >>>
> >>>> -----Original Message-----
> >>>> From: Tom Rix <trix@redhat.com>
> >>>> Sent: Wednesday, July 6, 2022 9:15 AM
> >>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >>>> thomas@monjalon.net; gakhil@marvell.com;
> hemant.agrawal@nxp.com
> >>>> Cc: maxime.coquelin@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >>>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >>>> stephen@networkplumber.org
> >>>> Subject: Re: [PATCH v4 4/7] drivers/baseband: update PMDs to expose
> >>>> queue per operation
> >>>>
> >>>>
> >>>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
> >>>>> Add support in existing bbdev PMDs for the explicit number of
> >>>>> queue and priority for each operation type configured on the device.
> >>>>>
> >>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>>>> ---
> >>>>>     drivers/baseband/acc100/rte_acc100_pmd.c           | 29
> +++++++++++++--
> >> ----
> >>>> ---
> >>>>>     drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
> >>>>>     drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
> >>>>>     drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
> >>>>>     drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11
> ++++++++
> >>>>>     5 files changed, 51 insertions(+), 12 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> index 17ba798..d568d0d 100644
> >>>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> @@ -966,6 +966,7 @@
> >>>>>     		struct rte_bbdev_driver_info *dev_info)
> >>>>>     {
> >>>>>     	struct acc100_device *d = dev->data->dev_private;
> >>>>> +	int i;
> >>>>>
> >>>>>     	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >>>>>     		{
> >>>>> @@ -1062,19 +1063,23 @@
> >>>>>     	fetch_acc100_config(dev);
> >>>>>     	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>> -	/* This isn't ideal because it reports the maximum number of queues
> >>>> but
> >>>>> -	 * does not provide info on how many can be uplink/downlink or
> >>>> different
> >>>>> -	 * priorities
> >>>>> -	 */
> >>>>> -	dev_info->max_num_queues =
> >>>>> -			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> >>>>> -			d->acc100_conf.q_dl_5g.num_qgroups +
> >>>>> -			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> >>>>> -			d->acc100_conf.q_ul_5g.num_qgroups +
> >>>>> -			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> >>>>> -			d->acc100_conf.q_dl_4g.num_qgroups +
> >>>>> -			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> >>>>> +	/* Expose number of queues */
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
> >>>>> +d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> >>>>>     			d->acc100_conf.q_ul_4g.num_qgroups;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d-
> >>>>> acc100_conf.q_dl_4g.num_aqs_per_groups *
> >>>>> +			d->acc100_conf.q_dl_4g.num_qgroups;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d-
> >>>>> acc100_conf.q_ul_5g.num_aqs_per_groups *
> >>>>> +			d->acc100_conf.q_ul_5g.num_qgroups;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d-
> >>>>> acc100_conf.q_dl_5g.num_aqs_per_groups *
> >>>>> +			d->acc100_conf.q_dl_5g.num_qgroups;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] =
> d-
> >>>>> acc100_conf.q_ul_4g.num_qgroups;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] =
> d-
> >>>>> acc100_conf.q_dl_4g.num_qgroups;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d-
> >>>>> acc100_conf.q_ul_5g.num_qgroups;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d-
> >>>>> acc100_conf.q_dl_5g.num_qgroups;
> >>>>> +	dev_info->max_num_queues = 0;
> >>>>> +	for (i = RTE_BBDEV_OP_TURBO_DEC; i <
> RTE_BBDEV_OP_LDPC_ENC;
> >>>> i++)
> >>>>
> >>>> should this be i <=  ?
> >>>>
> >>> Thanks
> >>>
> >>>>> +		dev_info->max_num_queues += dev_info-
> >num_queues[i];
> >>>>>     	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> >>>>>     	dev_info->hardware_accelerated = true;
> >>>>>     	dev_info->max_dl_queue_priority = diff --git
> >>>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> index 57b12af..b4982af 100644
> >>>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> @@ -379,6 +379,14 @@
> >>>>>     		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >>>>>     			dev_info->max_num_queues++;
> >>>>>     	}
> >>>>> +	/* Expose number of queue per operation type */
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
> dev_info-
> >>>>> max_num_queues / 2;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
> dev_info-
> >>>>> max_num_queues / 2;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >>>>>     }
> >>>>>
> >>>>>     /**
> >>>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> index 2a330c4..dc7f479 100644
> >>>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> @@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
> >>>>>     		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
> >>>>>     			dev_info->max_num_queues++;
> >>>>>     	}
> >>>>> +	/* Expose number of queue per operation type */
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] =
> dev_info-
> >>>>> max_num_queues / 2;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] =
> dev_info-
> >>>>> max_num_queues / 2;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] =
> 1;
> >>>>>     }
> >>>>>
> >>>>>     /**
> >>>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> index c1f88c6..e99ea9a 100644
> >>>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> @@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
> >>>>>     	dev_info->min_alignment = 64;
> >>>>>     	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] =
> >>>> LA12XX_MAX_QUEUES / 2;
> >>>>> +	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] =
> >>>> LA12XX_MAX_QUEUES / 2;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> >>>>> +	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> >>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data-
> >dev_id);
> >>>>>     }
> >>>>>
> >>>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> index dbc5524..647e706 100644
> >>>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> @@ -256,6 +256,17 @@ struct turbo_sw_queue {
> >>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>>>     	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>> +	const struct rte_bbdev_op_cap *op_cap =
> bbdev_capabilities;
> >>>> Should this be done through dev instead of assigning directly ?
> >>> I am not sure I follow your suggestion. Do you mind clarifying?
> >> bbdev_capabilites is a const defined in this function, do you really
> >> need to loop over it to find information that is constant ?
> > I still miss your point. Note that this constant is not always the same at
> build time (based on what SDK it can links to).
> > What would suggest?
> 
> Operations that can be done at compile time, should be.  Useless there is a
> good reason.
> 
> You need to provide a good reason or make the change.

I believe this is more graceful, scalable and maintainable this way. 
At build time we already define the list of capability then we just process that information without risk of disconnect between two pieces of code. 
The drawback is execution time but this function is not time sensititive (enumeration).
I could change it but the code would be poorer with risk of breaking stuff in the future (redundant information in the code hence bug prone).
Ie. defining the number of operations and queues using another serie of  #define. 
From my point of view that would be something that we would not do internally for the reasons above, but if you insist I will just change accordingly so that to move on. 
Not a big deal, let us know. 

Thanks
Nic

> 
> Tom
> 
> >
> > Thanks
> > Nic
> >
> >
> >> Tom
> >>
> >>>> Tom
> >>>>
> >>>>> +	int num_op_type = 0;
> >>>>> +	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
> >>>>> +		num_op_type++;
> >>>>> +	op_cap = bbdev_capabilities;
> >>>>> +	if (num_op_type > 0) {
> >>>>> +		int num_queue_per_type = dev_info-
> >max_num_queues /
> >>>> num_op_type;
> >>>>> +		for (; op_cap->type != RTE_BBDEV_OP_NONE;
> ++op_cap)
> >>>>> +			dev_info->num_queues[op_cap->type] =
> >>>> num_queue_per_type;
> >>>>> +	}
> >>>>> +
> >>>>>     	rte_bbdev_log_debug("got device info from %u\n", dev-
> >data-
> >>>>> dev_id);
> >>>>>     }
> >>>>>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v5 0/7]  bbdev changes for 22.11
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (6 preceding siblings ...)
  2022-07-06 23:28         ` [PATCH v5 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
@ 2022-08-15 17:54         ` Chautru, Nicolas
  7 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-15 17:54 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, Richardson, Bruce, david.marchand, stephen

Hi Hemant, 

Could you please provide a +1 for that serie please? This has been under review for a while but would like to get it merged soon if possible. I believe you had already reviewed and acked a previous version. 
Much appreciated, thanks, 

Nic

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Wednesday, July 6, 2022 4:28 PM
> To: dev@dpdk.org; thomas@monjalon.net; gakhil@marvell.com;
> hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; trix@redhat.com; mdr@ashroe.eu;
> Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; stephen@networkplumber.org; Chautru,
> Nicolas <nicolas.chautru@intel.com>
> Subject: [PATCH v5 0/7] bbdev changes for 22.11
> 
> v5: update base on review from Tom Rix. Number of typos reported and
> resolved, removed the commit related to rw_lock for now, added a commit
> for code clean up from review, resolved one rebase issue between 2
> commits, used size of array for some bound check implementation. Thanks.
> v4: update to the last 2 commits to include function to print the queue status
> and a fix to the rte_lock within the wrong structure
> v3: update to device status info to also use padded size for the related array.
> Adding also 2 additionals commits to allow the API struc to expose more
> information related to queues corner cases/warning as well as an optional
> rw lock.
> Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack
> early is possible to get this applied earlier and due to time off this summer.
> Thanks
> Nic
> 
> --
> 
> Hi,
> 
> Agregating together in a single serie a number of bbdev api changes
> previously submitted over the last few months and all targeted for 22.11 (4
> different series detailed below). Related deprecation notice being pushed in
> 22.07 in parallel.
> * bbdev: add device status info
> * bbdev: add new operation for FFT processing
> * bbdev: add device info on queue topology
> * bbdev: allow operation type enum for growth
> 
> v2: Update to the RTE_BBDEV_COUNT removal based on feedback from
> Thomas/Stephen : rejecting out of range op type and adjusting the new
> name for the padded maximum value used for fixed size arrays.
> 
> ---
> 
> Previous cover letters agregated below:
> 
> * bbdev: add device status info
> https://patches.dpdk.org/project/dpdk/list/?series=23367
> 
> The updated structure will allow PMDs to expose through info_get what be
> may the status of the underlying accelerator, notably in case an HW error
> event having happened.
> 
> * bbdev: add new operation for FFT processing
> https://patches.dpdk.org/project/dpdk/list/?series=22111
> 
> This contribution adds a new operation type to the existing ones already
> supported by the bbdev PMDs.
> This set of operation is FFT-based processing for 5GNR baseband processing
> acceleration. This operates in the same lookaside fashion as other existing
> bbdev operation with a dedicated set of capabilities and parameters (marked
> as experimental).
> 
> I plan to also include a new PMD supporting this operation (and most of the
> related capabilities) in the next couple of months (either in 22.06 or 22.09) as
> well as extending the related bbdev-test.
> 
> * bbdev: add device info on queue topology
> https://patches.dpdk.org/project/dpdk/list/?series=22076
> 
> Addressing an historical concern that the device info struct only imperfectly
> captured what queues are available on the device (number of operation and
> priority). This ended up being an iterative process for application to find each
> queue could be configured.
> 
> ie. the gap was captured as technical debt previously  in comments
> /* This isn't ideal because it reports the maximum number of queues but
>  * does not provide info on how many can be uplink/downlink or different
>  * priorities
>  */
> 
> This is now being exposed explictly based on the what the device actually
> supports using the existing info_get api
> 
> * bbdev: allow operation type enum for growth
> https://patches.dpdk.org/project/dpdk/list/?series=23509
> 
> This is related to the general intent to remove using MAX value for enums.
> There is consensus that we should avoid this for a while notably for future-
> proofed ABI concerns
> https://patches.dpdk.org/project/dpdk/patch/20200130142003.2645765-1-
> ferruh.yigit@intel.com/.
> But still there is arguably not yet an explicit best recommendation to handle
> this especially when we actualy need to expose array whose index is such an
> enum.
> As a specific example here I am refering to RTE_BBDEV_OP_TYPE_COUNT in
> enum rte_bbdev_op_type which is being extended for new operation type
> being support in bbdev (such as
> https://patches.dpdk.org/project/dpdk/patch/1646956157-245769-2-git-
> send-email-nicolas.chautru@intel.com/ adding new FFT operation)
> 
> There is also the intent to be able to expose information for each operation
> type through the bbdev api such as dynamically configured queues
> information per such operation type
> https://patches.dpdk.org/project/dpdk/patch/1646785355-168133-2-git-
> send-email-nicolas.chautru@intel.com/
> 
> Basically we are considering best way to accomodate for this, notably based
> on discussions with Ray Kinsella and Bruce Richardson, to handle such a case
> moving forward: specifically for the example with
> RTE_BBDEV_OP_TYPE_COUNT and also more generally.
> 
> One possible option is captured in that patchset and is basically based on the
> simple principle to allow for growth and prevent ABI breakage. Ie. the last
> value of the enum is set with a higher value than required so that to allow
> insertion of new enum outside of the major ABI versions.
> In that case the RTE_BBDEV_OP_TYPE_COUNT is still present and can be
> exposed and used while still allowing for addition thanks to the implicit
> padding-like room. As an alternate variant, instead of using that last enum
> value, that extended size could be exposed as an #define outside of the
> enum but would be fundamentally the same (public).
> 
> Another option would be to avoid array alltogether and use each time this a
> new dedicated API function (operation type enum being an input argument
> instead of an index to an array in an existing structure so that to get access
> to structure related to a given operation type enum) but that is arguably not
> well scalable within DPDK to use such a scheme for each enums and keep an
> uncluttered and clean API. In that very example that would be very odd
> indeed not to get this simply from info_get().
> 
> Some pros and cons, arguably the simple option in that patchset is a valid
> compromise option and a step in the right direction but we would like to
> know your view wrt best recommendation, or any other thought.
> 
> 
> 
> Nicolas Chautru (7):
>   bbdev: allow operation type enum for growth
>   bbdev: add device status info
>   bbdev: add device info on queue topology
>   drivers/baseband: update PMDs to expose queue per operation
>   bbdev: add new operation for FFT processing
>   bbdev: add queue related warning and status information
>   bbdev: remove unnecessary if-check
> 
>  app/test-bbdev/test_bbdev.c                        |   2 +-
>  app/test-bbdev/test_bbdev_perf.c                   |   6 +-
>  doc/guides/prog_guide/bbdev.rst                    | 130 +++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
>  drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
>  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
>  drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
>  drivers/baseband/null/bbdev_null.c                 |   1 +
>  drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
>  examples/bbdev_app/main.c                          |   2 +-
>  lib/bbdev/rte_bbdev.c                              |  57 +++++++-
>  lib/bbdev/rte_bbdev.h                              | 149 +++++++++++++++++++-
>  lib/bbdev/rte_bbdev_op.h                           | 156 ++++++++++++++++++++-
>  lib/bbdev/version.map                              |  11 ++
>  14 files changed, 555 insertions(+), 29 deletions(-)
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v5 1/7] bbdev: allow operation type enum for growth
  2022-07-06 23:28         ` [PATCH v5 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-08-25 13:54           ` Maxime Coquelin
  0 siblings, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-25 13:54 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen

Hi Nicolas,

On 7/7/22 01:28, Nicolas Chautru wrote:
> Updating the enum for rte_bbdev_op_type
> to allow to keep ABI compatible for enum insertion
> while adding padded maximum value for array need.
> Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
> RTE_BBDEV_OP_TYPE_PADDED_MAX.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   app/test-bbdev/test_bbdev.c      | 2 +-
>   app/test-bbdev/test_bbdev_perf.c | 4 ++--
>   examples/bbdev_app/main.c        | 2 +-
>   lib/bbdev/rte_bbdev.c            | 8 +++++---
>   lib/bbdev/rte_bbdev_op.h         | 2 +-
>   5 files changed, 10 insertions(+), 8 deletions(-)
> 

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v4 2/7] bbdev: add device status info
  2022-07-06 21:16             ` Chautru, Nicolas
  2022-07-07 13:37               ` Tom Rix
@ 2022-08-25 14:08               ` Maxime Coquelin
  1 sibling, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-25 14:08 UTC (permalink / raw)
  To: Chautru, Nicolas, Tom Rix, dev, thomas, gakhil, hemant.agrawal
  Cc: mdr, Richardson, Bruce, david.marchand, stephen



On 7/6/22 23:16, Chautru, Nicolas wrote:
>>> +};
>>> +
>>>    /** Device statistics. */
>>>    struct rte_bbdev_stats {
>>>    	uint64_t enqueued_count;  /**< Count of all operations enqueued */
>>> @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>>>    	/** Set if device supports per-queue interrupts */
>>>    	bool queue_intr_supported;
>>>    	/** Minimum alignment of buffers, in bytes */
>>> -	uint16_t min_alignment;
>>> -	/** HARQ memory available in kB */
>>> +	/** Device Status */
>>> +	enum rte_bbdev_device_status device_status;
>> New elements should be added to the end to improve backward compatibility.
> Same comment in different patch. I would like to know if there is a real recommendation from DPDK on this. I have heard opposite view as well.
> In that very case we are breaking the ABI in that new serie for 22.11 (sizes and offsets are changing).
> 

Since we are breaking ABI anyways, I don't find it unreasonable to take
the opportunity to improve packing the struct.

Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v5 2/7] bbdev: add device status info
  2022-07-06 23:28         ` [PATCH v5 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-08-25 14:18           ` Maxime Coquelin
  2022-08-25 18:30             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-25 14:18 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 7/7/22 01:28, Nicolas Chautru wrote:
> Added device status information, so that the PMD can
> expose information related to the underlying accelerator device status.
> Minor order change in structure to fit into padding hole.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>   drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>   drivers/baseband/null/bbdev_null.c                 |  1 +
>   drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>   lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
>   lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>   lib/bbdev/version.map                              |  6 ++++
>   9 files changed, 67 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index de7e4bc..17ba798 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -1060,6 +1060,7 @@
>   
>   	/* Read and save the populated config from ACC100 registers */
>   	fetch_acc100_config(dev);
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* This isn't ideal because it reports the maximum number of queues but
>   	 * does not provide info on how many can be uplink/downlink or different
> diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> index 82ae6ba..57b12af 100644
> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> @@ -369,6 +369,7 @@
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* Calculates number of queues assigned to device */
>   	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> index 21d3529..2a330c4 100644
> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	/* Calculates number of queues assigned to device */
>   	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
> index 4d1bd16..c1f88c6 100644
> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>   	dev_info->capabilities = bbdev_capabilities;
>   	dev_info->cpu_flag_reqs = NULL;
>   	dev_info->min_alignment = 64;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>   }
> diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
> index 248e129..94a1976 100644
> --- a/drivers/baseband/null/bbdev_null.c
> +++ b/drivers/baseband/null/bbdev_null.c
> @@ -82,6 +82,7 @@ struct bbdev_queue {
>   	 * here for code completeness.
>   	 */
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
>   }
> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> index af7bc41..dbc5524 100644
> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>   	dev_info->min_alignment = 64;
>   	dev_info->harq_buffer_size = 0;
>   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>   
>   	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
>   }
> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index 4da8047..38630a2 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -1133,3 +1133,25 @@ struct rte_mempool *
>   	rte_bbdev_log(ERR, "Invalid operation type");
>   	return NULL;
>   }
> +
> +const char *
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
> +{
> +	static const char * const dev_sta_string[] = {
> +		"RTE_BBDEV_DEV_NOSTATUS",
> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> +		"RTE_BBDEV_DEV_RESET",
> +		"RTE_BBDEV_DEV_CONFIGURED",
> +		"RTE_BBDEV_DEV_ACTIVE",
> +		"RTE_BBDEV_DEV_FATAL_ERR",
> +		"RTE_BBDEV_DEV_RESTART_REQ",
> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> +	};
> +
> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
> +		return dev_sta_string[status];
> +
> +	rte_bbdev_log(ERR, "Invalid device status");
> +	return NULL;
> +}
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index b88c881..9b1ffa4 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>   int
>   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>   
> +/**
> + * Flags indicate the status of the device
> + */
> +enum rte_bbdev_device_status {
> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
> +};

I don't have a strong opinion on this, but I think NOT_SUPPORTED should 
be a special value. If you want to keep 0 value for NOSTATUS, maybe you 
could do:

enum rte_bbdev_device_status {
	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not supported 
on the PMD */
	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported */
	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured 
state */
...


> +
>   /** Device statistics. */
>   struct rte_bbdev_stats {
>   	uint64_t enqueued_count;  /**< Count of all operations enqueued */
> @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>   	/** Set if device supports per-queue interrupts */
>   	bool queue_intr_supported;
>   	/** Minimum alignment of buffers, in bytes */
> -	uint16_t min_alignment;
> -	/** HARQ memory available in kB */
> +	/** Device Status */
> +	enum rte_bbdev_device_status device_status;
>   	uint32_t harq_buffer_size;
>   	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
>   	 *  for input/output data
>   	 */
> +	uint16_t min_alignment;
> +	/** HARQ memory available in kB */
>   	uint8_t data_endianness;
>   	/** Default queue configuration used if none is supplied  */
>   	struct rte_bbdev_queue_conf default_queue_conf;
> @@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
>   rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
>   		void *data);
>   
> +/**
> + * Converts device status from enum to string
> + *
> + * @param status
> + *   Device status as enum
> + *
> + * @returns
> + *   Operation type as string or NULL if op_type is invalid
> + *
> + */
> +__rte_experimental
> +const char*
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> +
>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
> index cce3f3c..9ac3643 100644
> --- a/lib/bbdev/version.map
> +++ b/lib/bbdev/version.map
> @@ -39,3 +39,9 @@ DPDK_22 {
>   
>   	local: *;
>   };
> +
> +EXPERIMENTAL {
> +	global:
> +

We now add the version the new API was introduced in as a comment:

         # added in 22.11
> +	rte_bbdev_device_status_str;
> +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v5 3/7] bbdev: add device info on queue topology
  2022-07-06 23:28         ` [PATCH v5 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-08-25 15:23           ` Maxime Coquelin
  0 siblings, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-25 15:23 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 7/7/22 01:28, Nicolas Chautru wrote:
> Adding more options in the API to expose the number
> of queues exposed and related priority.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   lib/bbdev/rte_bbdev.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index 9b1ffa4..ac941d6 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
>   
>   	/** Maximum number of queues supported by the device */
>   	unsigned int max_num_queues;
> +	/** Maximum number of queues supported per operation type */
> +	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
> +	/** Priority level supported per operation type */
> +	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
>   	/** Queue size limit (queue size must also be power of 2) */
>   	uint32_t queue_size_lim;
>   	/** Set if device off-loads operation to hardware  */

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 0/7] bbdev changes for 22.11
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
                         ` (2 preceding siblings ...)
  2022-07-06 23:28       ` [PATCH v5 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-08-25 18:24       ` Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
                           ` (6 more replies)
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                         ` (5 subsequent siblings)
  9 siblings, 7 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

v6: added one comment in commit 2/7 suggested by Maxime.
v5: update base on review from Tom Rix. Number of typos reported and resolved,
removed the commit related to rw_lock for now, added a commit for
code clean up from review, resolved one rebase issue between 2 commits, used size of array for some bound check implementation. Thanks. 
v4: update to the last 2 commits to include function to print the queue status and a fix to the rte_lock within the wrong structure
v3: update to device status info to also use padded size for the related array.
Adding also 2 additionals commits to allow the API struc to expose more information related to queues corner cases/warning as well as an optional rw lock.
Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack early is possible to get this applied earlier and due to time off this summer.
Thanks
Nic

Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: remove unnecessary if-check

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   6 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 +++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  57 +++++++-
 lib/bbdev/rte_bbdev.h                              | 149 +++++++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 156 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  12 ++
 14 files changed, 556 insertions(+), 29 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 1/7] bbdev: allow operation type enum for growth
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 2/7] bbdev: add device status info Nicolas Chautru
                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 8 +++++---
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..4da8047 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1125,7 +1127,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 2/7] bbdev: add device status info
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 3/7] bbdev: add device info on queue topology Nicolas Chautru
                           ` (4 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  7 +++++
 9 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..17ba798 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..57b12af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..2a330c4 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..c1f88c6 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..94a1976 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..dbc5524 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 4da8047..38630a2 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1133,3 +1133,25 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_NOT_SUPPORTED",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < sizeof(dev_sta_string) / sizeof(char *))
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..9b1ffa4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
+	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..f0a072e 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,10 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	# added in 22.11
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 3/7] bbdev: add device info on queue topology
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
                           ` (3 subsequent siblings)
  6 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9b1ffa4..ac941d6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
                           ` (2 preceding siblings ...)
  2022-08-25 18:24         ` [PATCH v6 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-26 11:53           ` Maxime Coquelin
  2022-08-25 18:24         ` [PATCH v6 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
                           ` (2 subsequent siblings)
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queue
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 17ba798..f967e3f 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i <= RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 57b12af..b4982af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 2a330c4..dc7f479 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index c1f88c6..e99ea9a 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index dbc5524..647e706 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 5/7] bbdev: add new operation for FFT processing
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
                           ` (3 preceding siblings ...)
  2022-08-25 18:24         ` [PATCH v6 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-26 12:07           ` Maxime Coquelin
  2022-08-25 18:24         ` [PATCH v6 6/7] bbdev: add queue related warning and status information Nicolas Chautru
  2022-08-25 18:24         ` [PATCH v6 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=Y, Size: 22119 bytes --]

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  10 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 368 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..150161b 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and depadding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (cabability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measurement of the DFT output     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++-------------------------+--------------------------------------------------------------+
+|Parameter                |Description                                                   |
++=========================+==============================================================+
+|base_input               |input data                                                    |
++-------------------------+--------------------------------------------------------------+
+|base_output              |output data                                                   |
++-------------------------+--------------------------------------------------------------+
+|power_meas_output        |optional output data with power measurement on DFT output     |
++-------------------------+--------------------------------------------------------------+
+|op_flags                 |bitmask of all active operation capabilities                  |
++-------------------------+--------------------------------------------------------------+
+|input_sequence_size      |size of the input sequence in 32-bits points per antenna      |
++-------------------------+--------------------------------------------------------------+
+|input_leading_padding    |number of points padded at the start of input data            |
++-------------------------+--------------------------------------------------------------+
+|output_sequence_size     |size of the output sequence per antenna and cyclic shift      |
++-------------------------+--------------------------------------------------------------+
+|output_leading_depadding |number of points depadded at the start of output data         |
++-------------------------+--------------------------------------------------------------+
+|window_index             |optional windowing profile index used for each cyclic shift   |
++-------------------------+--------------------------------------------------------------+
+|cs_bitmap                |bitmap of the cyclic shift output requested (LSB for index 0) |
++-------------------------+--------------------------------------------------------------+
+|num_antennas_log2        |number of antennas as a log2 (10 maps to 1024...)             |
++-------------------------+--------------------------------------------------------------+
+|idft_log2                |iDFT size as a log2                                           |
++-------------------------+--------------------------------------------------------------+
+|dft_log2                 |DFT size as a log2                                            |
++-------------------------+--------------------------------------------------------------+
+|cs_time_adjustment       |adjustment of time position of all the cyclic shift output    |
++-------------------------+--------------------------------------------------------------+
+|idft_shift               |shift down of signal level post iDFT                          |
++-------------------------+--------------------------------------------------------------+
+|dft_shift                |shift down of signal level post DFT                           |
++-------------------------+--------------------------------------------------------------+
+|ncs_reciprocal           |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++-------------------------+--------------------------------------------------------------+
+|power_shift              |shift down of level of power measurement when enabled         |
++-------------------------+--------------------------------------------------------------+
+|fp16_exp_adjust          |value added to FP16 exponent at conversion from INT16         |
++-------------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 38630a2..9d65ba8 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -852,6 +852,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -875,6 +878,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1125,6 +1132,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
 		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ac941d6..ed528b8 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..afa1a71 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index f0a072e..0cbeab3 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -45,4 +45,8 @@ EXPERIMENTAL {
 
 	# added in 22.11
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 6/7] bbdev: add queue related warning and status information
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
                           ` (4 preceding siblings ...)
  2022-08-25 18:24         ` [PATCH v6 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-26 19:51           ` Maxime Coquelin
  2022-08-25 18:24         ` [PATCH v6 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

This allows to expose more information with regards to any
queue related failure and warning which cannot be supported
in existing API.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c |  2 ++
 lib/bbdev/rte_bbdev.c            | 19 +++++++++++++++++++
 lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map            |  1 +
 4 files changed, 56 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 1abda2d..653b21f 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
 	stats->dequeued_count = q_stats->dequeued_count;
 	stats->enqueue_err_count = q_stats->enqueue_err_count;
 	stats->dequeue_err_count = q_stats->dequeue_err_count;
+	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
+	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
 	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
 	return 0;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 9d65ba8..bdd7c2f 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -721,6 +721,8 @@ struct rte_bbdev *
 		stats->dequeued_count += q_stats->dequeued_count;
 		stats->enqueue_err_count += q_stats->enqueue_err_count;
 		stats->dequeue_err_count += q_stats->dequeue_err_count;
+		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
 	}
 	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
@@ -1163,3 +1165,20 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid device status");
 	return NULL;
 }
+
+const char *
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
+{
+	static const char * const enq_sta_string[] = {
+		"RTE_BBDEV_ENQ_STATUS_NONE",
+		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
+		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
+		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
+	};
+
+	if (status < sizeof(enq_sta_string) / sizeof(char *))
+		return enq_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid enqueue status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ed528b8..b7ecf94 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
 /**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consumed all requested operations
+ * In case of multiple reasons the latter superdes a previous one
+ */
+enum rte_bbdev_enqueue_status {
+	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
+	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
+	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
+	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
+	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
+};
+
+/**
  * Flags indicate the status of the device
  */
 enum rte_bbdev_device_status {
@@ -246,6 +259,12 @@ struct rte_bbdev_stats {
 	uint64_t enqueue_err_count;
 	/** Total error count on operations dequeued */
 	uint64_t dequeue_err_count;
+	/** Total warning count on operations enqueued */
+	uint64_t enqueue_warn_count;
+	/** Total warning count on operations dequeued */
+	uint64_t dequeue_warn_count;
+	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
+	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
 	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
 	 *  the enqueue request to its internal queues.
 	 *  - For a HW device this is the cycles consumed in MMIO write
@@ -386,6 +405,7 @@ struct rte_bbdev_queue_data {
 	void *queue_private;  /**< Driver-specific per-queue data */
 	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
 	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
+	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
 	bool started;  /**< Queue state */
 };
 
@@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 const char*
 rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
 
+/**
+ * Converts queue status from enum to string
+ *
+ * @param status
+ *   Queue status as enum
+ *
+ * @returns
+ *  Queue status as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 0cbeab3..f5e2dd7 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -45,6 +45,7 @@ EXPERIMENTAL {
 
 	# added in 22.11
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_status_str;
 	rte_bbdev_enqueue_fft_ops;
 	rte_bbdev_dequeue_fft_ops;
 	rte_bbdev_fft_op_alloc_bulk;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v6 7/7] bbdev: remove unnecessary if-check
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
                           ` (5 preceding siblings ...)
  2022-08-25 18:24         ` [PATCH v6 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-08-25 18:24         ` Nicolas Chautru
  2022-08-26 19:52           ` Maxime Coquelin
  6 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-25 18:24 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, Nicolas Chautru

Code clean up due to if-check not required

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev_op.h | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index afa1a71..386eed8 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -970,10 +970,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1006,10 +1004,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1035,17 +1031,14 @@ struct rte_mempool *
 	int ret;
 
 	/* Check type */
-	priv = (struct rte_bbdev_op_pool_private *)
-			rte_mempool_get_priv(mempool);
+	priv = (struct rte_bbdev_op_pool_private *) rte_mempool_get_priv(mempool);
 	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
 		return -EINVAL;
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v5 2/7] bbdev: add device status info
  2022-08-25 14:18           ` Maxime Coquelin
@ 2022-08-25 18:30             ` Chautru, Nicolas
  2022-08-26 10:12               ` Maxime Coquelin
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-25 18:30 UTC (permalink / raw)
  To: Maxime Coquelin, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen

Thanks Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, August 25, 2022 7:19 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> 
> 
> 
> On 7/7/22 01:28, Nicolas Chautru wrote:
> > Added device status information, so that the PMD can expose
> > information related to the underlying accelerator device status.
> > Minor order change in structure to fit into padding hole.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
> >   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
> >   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
> >   drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
> >   drivers/baseband/null/bbdev_null.c                 |  1 +
> >   drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
> >   lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
> >   lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
> >   lib/bbdev/version.map                              |  6 ++++
> >   9 files changed, 67 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index de7e4bc..17ba798 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -1060,6 +1060,7 @@
> >
> >   	/* Read and save the populated config from ACC100 registers */
> >   	fetch_acc100_config(dev);
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* This isn't ideal because it reports the maximum number of queues
> but
> >   	 * does not provide info on how many can be uplink/downlink or
> > different diff --git
> > a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > index 82ae6ba..57b12af 100644
> > --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> > @@ -369,6 +369,7 @@
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* Calculates number of queues assigned to device */
> >   	dev_info->max_num_queues = 0;
> > diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > index 21d3529..2a330c4 100644
> > --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> > @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	/* Calculates number of queues assigned to device */
> >   	dev_info->max_num_queues = 0;
> > diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> > b/drivers/baseband/la12xx/bbdev_la12xx.c
> > index 4d1bd16..c1f88c6 100644
> > --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> > +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> > @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
> >   	dev_info->capabilities = bbdev_capabilities;
> >   	dev_info->cpu_flag_reqs = NULL;
> >   	dev_info->min_alignment = 64;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u", dev->data-
> >dev_id);
> >   }
> > diff --git a/drivers/baseband/null/bbdev_null.c
> > b/drivers/baseband/null/bbdev_null.c
> > index 248e129..94a1976 100644
> > --- a/drivers/baseband/null/bbdev_null.c
> > +++ b/drivers/baseband/null/bbdev_null.c
> > @@ -82,6 +82,7 @@ struct bbdev_queue {
> >   	 * here for code completeness.
> >   	 */
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u", dev->data-
> >dev_id);
> >   }
> > diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > index af7bc41..dbc5524 100644
> > --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> > @@ -254,6 +254,7 @@ struct turbo_sw_queue {
> >   	dev_info->min_alignment = 64;
> >   	dev_info->harq_buffer_size = 0;
> >   	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> > +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >
> >   	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >dev_id);
> >   }
> > diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> > 4da8047..38630a2 100644
> > --- a/lib/bbdev/rte_bbdev.c
> > +++ b/lib/bbdev/rte_bbdev.c
> > @@ -1133,3 +1133,25 @@ struct rte_mempool *
> >   	rte_bbdev_log(ERR, "Invalid operation type");
> >   	return NULL;
> >   }
> > +
> > +const char *
> > +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
> > +	static const char * const dev_sta_string[] = {
> > +		"RTE_BBDEV_DEV_NOSTATUS",
> > +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> > +		"RTE_BBDEV_DEV_RESET",
> > +		"RTE_BBDEV_DEV_CONFIGURED",
> > +		"RTE_BBDEV_DEV_ACTIVE",
> > +		"RTE_BBDEV_DEV_FATAL_ERR",
> > +		"RTE_BBDEV_DEV_RESTART_REQ",
> > +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> > +		"RTE_BBDEV_DEV_CORRECT_ERR",
> > +	};
> > +
> > +	if (status < sizeof(dev_sta_string) / sizeof(char *))
> > +		return dev_sta_string[status];
> > +
> > +	rte_bbdev_log(ERR, "Invalid device status");
> > +	return NULL;
> > +}
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > b88c881..9b1ffa4 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
> >   int
> >   rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >
> > +/**
> > + * Flags indicate the status of the device  */ enum
> > +rte_bbdev_device_status {
> > +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> > +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
> supported on the PMD */
> > +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> configured state */
> > +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
> ready to use */
> > +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
> being used */
> > +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> uncorrectable error */
> > +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application
> to restart */
> > +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires
> application to reconfigure queues */
> > +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
> error event happened */
> > +};
> 
> I don't have a strong opinion on this, but I think NOT_SUPPORTED should be
> a special value. If you want to keep 0 value for NOSTATUS, maybe you could
> do:
> 
> enum rte_bbdev_device_status {
> 	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not
> supported
> on the PMD */
> 	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported
> */
> 	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> configured
> state */
> ...

Thanks Maxime. My concern is that I am upstreaming in parallel in pf_bb_config in parallel hence would like to keep it unchanged if possible.
Given you don’t have a strong opinion is that okay to keep as is? Or I can force special value 1 for NOT_SUPPORTED so that this is explicitly defined. But really enum should always be used.


> 
> 
> > +
> >   /** Device statistics. */
> >   struct rte_bbdev_stats {
> >   	uint64_t enqueued_count;  /**< Count of all operations enqueued */
> > @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
> >   	/** Set if device supports per-queue interrupts */
> >   	bool queue_intr_supported;
> >   	/** Minimum alignment of buffers, in bytes */
> > -	uint16_t min_alignment;
> > -	/** HARQ memory available in kB */
> > +	/** Device Status */
> > +	enum rte_bbdev_device_status device_status;
> >   	uint32_t harq_buffer_size;
> >   	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
> supported
> >   	 *  for input/output data
> >   	 */
> > +	uint16_t min_alignment;
> > +	/** HARQ memory available in kB */
> >   	uint8_t data_endianness;
> >   	/** Default queue configuration used if none is supplied  */
> >   	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
> +844,20
> > @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
> >   rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int
> op,
> >   		void *data);
> >
> > +/**
> > + * Converts device status from enum to string
> > + *
> > + * @param status
> > + *   Device status as enum
> > + *
> > + * @returns
> > + *   Operation type as string or NULL if op_type is invalid
> > + *
> > + */
> > +__rte_experimental
> > +const char*
> > +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> > +
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> > cce3f3c..9ac3643 100644
> > --- a/lib/bbdev/version.map
> > +++ b/lib/bbdev/version.map
> > @@ -39,3 +39,9 @@ DPDK_22 {
> >
> >   	local: *;
> >   };
> > +
> > +EXPERIMENTAL {
> > +	global:
> > +
> 
> We now add the version the new API was introduced in as a comment:
> 
>          # added in 22.11

Thanks for this feedback, I will update this

> > +	rte_bbdev_device_status_str;
> > +};


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v5 2/7] bbdev: add device status info
  2022-08-25 18:30             ` Chautru, Nicolas
@ 2022-08-26 10:12               ` Maxime Coquelin
  2022-08-29 16:10                 ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-26 10:12 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen

Hi,

On 8/25/22 20:30, Chautru, Nicolas wrote:
> Thanks Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, August 25, 2022 7:19 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
>>
>>
>>
>> On 7/7/22 01:28, Nicolas Chautru wrote:
>>> Added device status information, so that the PMD can expose
>>> information related to the underlying accelerator device status.
>>> Minor order change in structure to fit into padding hole.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>>>    drivers/baseband/null/bbdev_null.c                 |  1 +
>>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>>>    lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
>>>    lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>>>    lib/bbdev/version.map                              |  6 ++++
>>>    9 files changed, 67 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index de7e4bc..17ba798 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -1060,6 +1060,7 @@
>>>
>>>    	/* Read and save the populated config from ACC100 registers */
>>>    	fetch_acc100_config(dev);
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* This isn't ideal because it reports the maximum number of queues
>> but
>>>    	 * does not provide info on how many can be uplink/downlink or
>>> different diff --git
>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> index 82ae6ba..57b12af 100644
>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>> @@ -369,6 +369,7 @@
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* Calculates number of queues assigned to device */
>>>    	dev_info->max_num_queues = 0;
>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> index 21d3529..2a330c4 100644
>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	/* Calculates number of queues assigned to device */
>>>    	dev_info->max_num_queues = 0;
>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> index 4d1bd16..c1f88c6 100644
>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>>>    	dev_info->capabilities = bbdev_capabilities;
>>>    	dev_info->cpu_flag_reqs = NULL;
>>>    	dev_info->min_alignment = 64;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u", dev->data-
>>> dev_id);
>>>    }
>>> diff --git a/drivers/baseband/null/bbdev_null.c
>>> b/drivers/baseband/null/bbdev_null.c
>>> index 248e129..94a1976 100644
>>> --- a/drivers/baseband/null/bbdev_null.c
>>> +++ b/drivers/baseband/null/bbdev_null.c
>>> @@ -82,6 +82,7 @@ struct bbdev_queue {
>>>    	 * here for code completeness.
>>>    	 */
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u", dev->data-
>>> dev_id);
>>>    }
>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> index af7bc41..dbc5524 100644
>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>>>    	dev_info->min_alignment = 64;
>>>    	dev_info->harq_buffer_size = 0;
>>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>
>>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>> dev_id);
>>>    }
>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>> 4da8047..38630a2 100644
>>> --- a/lib/bbdev/rte_bbdev.c
>>> +++ b/lib/bbdev/rte_bbdev.c
>>> @@ -1133,3 +1133,25 @@ struct rte_mempool *
>>>    	rte_bbdev_log(ERR, "Invalid operation type");
>>>    	return NULL;
>>>    }
>>> +
>>> +const char *
>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
>>> +	static const char * const dev_sta_string[] = {
>>> +		"RTE_BBDEV_DEV_NOSTATUS",
>>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
>>> +		"RTE_BBDEV_DEV_RESET",
>>> +		"RTE_BBDEV_DEV_CONFIGURED",
>>> +		"RTE_BBDEV_DEV_ACTIVE",
>>> +		"RTE_BBDEV_DEV_FATAL_ERR",
>>> +		"RTE_BBDEV_DEV_RESTART_REQ",
>>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
>>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
>>> +	};
>>> +
>>> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
>>> +		return dev_sta_string[status];
>>> +
>>> +	rte_bbdev_log(ERR, "Invalid device status");
>>> +	return NULL;
>>> +}
>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>> b88c881..9b1ffa4 100644
>>> --- a/lib/bbdev/rte_bbdev.h
>>> +++ b/lib/bbdev/rte_bbdev.h
>>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>>>    int
>>>    rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>>>
>>> +/**
>>> + * Flags indicate the status of the device  */ enum
>>> +rte_bbdev_device_status {
>>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
>>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
>> supported on the PMD */
>>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
>> configured state */
>>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
>> ready to use */
>>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
>> being used */
>>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
>> uncorrectable error */
>>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application
>> to restart */
>>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires
>> application to reconfigure queues */
>>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
>> error event happened */
>>> +};
>>
>> I don't have a strong opinion on this, but I think NOT_SUPPORTED should be
>> a special value. If you want to keep 0 value for NOSTATUS, maybe you could
>> do:
>>
>> enum rte_bbdev_device_status {
>> 	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not
>> supported
>> on the PMD */
>> 	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported
>> */
>> 	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
>> configured
>> state */
>> ...
> 
> Thanks Maxime. My concern is that I am upstreaming in parallel in pf_bb_config in parallel hence would like to keep it unchanged if possible.
> Given you don’t have a strong opinion is that okay to keep as is? Or I can force special value 1 for NOT_SUPPORTED so that this is explicitly defined. But really enum should always be used.

I don't understand. It should not have any impact on pf_bb_config, given
pf_bb_config does not use DPDK.

Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v6 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-08-25 18:24         ` [PATCH v6 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-08-26 11:53           ` Maxime Coquelin
  0 siblings, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-26 11:53 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 8/25/22 20:24, Nicolas Chautru wrote:
> Add support in existing bbdev PMDs for the explicit number of queue

queues

> and priority for each operation type configured on the device.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
>   drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
>   drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
>   drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
>   drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
>   5 files changed, 51 insertions(+), 12 deletions(-)
> 

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v6 5/7] bbdev: add new operation for FFT processing
  2022-08-25 18:24         ` [PATCH v6 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-08-26 12:07           ` Maxime Coquelin
  2022-08-29 18:18             ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-26 12:07 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 8/25/22 20:24, Nicolas Chautru wrote:
> Extension of bbdev operation to support FFT based operations.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> ---
>   doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
>   lib/bbdev/rte_bbdev.c           |  10 ++-
>   lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
>   lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
>   lib/bbdev/version.map           |   4 ++
>   5 files changed, 368 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
> index 70fa01a..150161b 100644
> --- a/doc/guides/prog_guide/bbdev.rst
> +++ b/doc/guides/prog_guide/bbdev.rst
> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
>   showing the Turbo decoding of CBs using BBDEV interface in TB-mode
>   is also valid for LDPC decode.
>   
> +BBDEV FFT Operation
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
> +These can be used in a modular fashion (using bypass modes) or as a processing pipeline
> +which can be used for FFT-based baseband signal processing.
> +In more details it allows :
> +- to process the data first through an IDFT of adjustable size and padding;
> +- to perform the windowing as a programmable cyclic shift offset of the data followed by a
> +pointwise multiplication by a time domain window;
> +- to process the related data through a DFT of adjustable size and depadding for each such cyclic

depadding?

> +shift output.
> +
> +A flexible number of Rx antennas are being processed in parallel with the same configuration.
> +The API allows more generally for flexibility in what the PMD may support (cabability flags) and

s/cabability/capability/

With above typos fixed:
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v6 6/7] bbdev: add queue related warning and status information
  2022-08-25 18:24         ` [PATCH v6 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-08-26 19:51           ` Maxime Coquelin
  0 siblings, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-26 19:51 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 8/25/22 20:24, Nicolas Chautru wrote:
> This allows to expose more information with regards to any
> queue related failure and warning which cannot be supported
> in existing API.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   app/test-bbdev/test_bbdev_perf.c |  2 ++
>   lib/bbdev/rte_bbdev.c            | 19 +++++++++++++++++++
>   lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
>   lib/bbdev/version.map            |  1 +
>   4 files changed, 56 insertions(+)
> 

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v6 7/7] bbdev: remove unnecessary if-check
  2022-08-25 18:24         ` [PATCH v6 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
@ 2022-08-26 19:52           ` Maxime Coquelin
  0 siblings, 0 replies; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-26 19:52 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, bruce.richardson, david.marchand, stephen



On 8/25/22 20:24, Nicolas Chautru wrote:
> Code clean up due to if-check not required
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   lib/bbdev/rte_bbdev_op.h | 15 ++++-----------
>   1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
> index afa1a71..386eed8 100644
> --- a/lib/bbdev/rte_bbdev_op.h
> +++ b/lib/bbdev/rte_bbdev_op.h
> @@ -970,10 +970,8 @@ struct rte_mempool *
>   
>   	/* Get elements */
>   	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> -	if (unlikely(ret < 0))
> -		return ret;
>   
> -	return 0;
> +	return ret;
>   }
>   
>   /**
> @@ -1006,10 +1004,8 @@ struct rte_mempool *
>   
>   	/* Get elements */
>   	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> -	if (unlikely(ret < 0))
> -		return ret;
>   
> -	return 0;
> +	return ret;
>   }
>   
>   /**
> @@ -1035,17 +1031,14 @@ struct rte_mempool *
>   	int ret;
>   
>   	/* Check type */
> -	priv = (struct rte_bbdev_op_pool_private *)
> -			rte_mempool_get_priv(mempool);
> +	priv = (struct rte_bbdev_op_pool_private *) rte_mempool_get_priv(mempool);
>   	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
>   		return -EINVAL;
>   
>   	/* Get elements */
>   	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
> -	if (unlikely(ret < 0))
> -		return ret;
>   
> -	return 0;
> +	return ret;
>   }
>   
>   /**

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v5 2/7] bbdev: add device status info
  2022-08-26 10:12               ` Maxime Coquelin
@ 2022-08-29 16:10                 ` Chautru, Nicolas
  2022-08-30  7:08                   ` Maxime Coquelin
  0 siblings, 1 reply; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-29 16:10 UTC (permalink / raw)
  To: Maxime Coquelin, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, August 26, 2022 3:13 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> 
> Hi,
> 
> On 8/25/22 20:30, Chautru, Nicolas wrote:
> > Thanks Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Thursday, August 25, 2022 7:19 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> >> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >> stephen@networkplumber.org
> >> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> >>
> >>
> >>
> >> On 7/7/22 01:28, Nicolas Chautru wrote:
> >>> Added device status information, so that the PMD can expose
> >>> information related to the underlying accelerator device status.
> >>> Minor order change in structure to fit into padding hole.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
> >>>    drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
> >>>    drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
> >>>    drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
> >>>    drivers/baseband/null/bbdev_null.c                 |  1 +
> >>>    drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
> >>>    lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
> >>>    lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
> >>>    lib/bbdev/version.map                              |  6 ++++
> >>>    9 files changed, 67 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index de7e4bc..17ba798 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -1060,6 +1060,7 @@
> >>>
> >>>    	/* Read and save the populated config from ACC100 registers */
> >>>    	fetch_acc100_config(dev);
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* This isn't ideal because it reports the maximum number of
> >>> queues
> >> but
> >>>    	 * does not provide info on how many can be uplink/downlink or
> >>> different diff --git
> >>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> index 82ae6ba..57b12af 100644
> >>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>> @@ -369,6 +369,7 @@
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* Calculates number of queues assigned to device */
> >>>    	dev_info->max_num_queues = 0;
> >>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> index 21d3529..2a330c4 100644
> >>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	/* Calculates number of queues assigned to device */
> >>>    	dev_info->max_num_queues = 0;
> >>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> index 4d1bd16..c1f88c6 100644
> >>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
> >>>    	dev_info->capabilities = bbdev_capabilities;
> >>>    	dev_info->cpu_flag_reqs = NULL;
> >>>    	dev_info->min_alignment = 64;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u", dev->data-
> >>> dev_id);
> >>>    }
> >>> diff --git a/drivers/baseband/null/bbdev_null.c
> >>> b/drivers/baseband/null/bbdev_null.c
> >>> index 248e129..94a1976 100644
> >>> --- a/drivers/baseband/null/bbdev_null.c
> >>> +++ b/drivers/baseband/null/bbdev_null.c
> >>> @@ -82,6 +82,7 @@ struct bbdev_queue {
> >>>    	 * here for code completeness.
> >>>    	 */
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u", dev->data-
> >>> dev_id);
> >>>    }
> >>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> index af7bc41..dbc5524 100644
> >>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
> >>>    	dev_info->min_alignment = 64;
> >>>    	dev_info->harq_buffer_size = 0;
> >>>    	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>
> >>>    	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >>> dev_id);
> >>>    }
> >>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> >>> 4da8047..38630a2 100644
> >>> --- a/lib/bbdev/rte_bbdev.c
> >>> +++ b/lib/bbdev/rte_bbdev.c
> >>> @@ -1133,3 +1133,25 @@ struct rte_mempool *
> >>>    	rte_bbdev_log(ERR, "Invalid operation type");
> >>>    	return NULL;
> >>>    }
> >>> +
> >>> +const char *
> >>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
> >>> +	static const char * const dev_sta_string[] = {
> >>> +		"RTE_BBDEV_DEV_NOSTATUS",
> >>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> >>> +		"RTE_BBDEV_DEV_RESET",
> >>> +		"RTE_BBDEV_DEV_CONFIGURED",
> >>> +		"RTE_BBDEV_DEV_ACTIVE",
> >>> +		"RTE_BBDEV_DEV_FATAL_ERR",
> >>> +		"RTE_BBDEV_DEV_RESTART_REQ",
> >>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> >>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> >>> +	};
> >>> +
> >>> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
> >>> +		return dev_sta_string[status];
> >>> +
> >>> +	rte_bbdev_log(ERR, "Invalid device status");
> >>> +	return NULL;
> >>> +}
> >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>> b88c881..9b1ffa4 100644
> >>> --- a/lib/bbdev/rte_bbdev.h
> >>> +++ b/lib/bbdev/rte_bbdev.h
> >>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
> >>>    int
> >>>    rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >>>
> >>> +/**
> >>> + * Flags indicate the status of the device  */ enum
> >>> +rte_bbdev_device_status {
> >>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> >>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
> >> supported on the PMD */
> >>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> >> configured state */
> >>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
> >> ready to use */
> >>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
> >> being used */
> >>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> >> uncorrectable error */
> >>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application
> >> to restart */
> >>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires
> >> application to reconfigure queues */
> >>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
> >> error event happened */
> >>> +};
> >>
> >> I don't have a strong opinion on this, but I think NOT_SUPPORTED
> >> should be a special value. If you want to keep 0 value for NOSTATUS,
> >> maybe you could
> >> do:
> >>
> >> enum rte_bbdev_device_status {
> >> 	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not
> >> supported
> >> on the PMD */
> >> 	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported
> >> */
> >> 	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> >> configured
> >> state */
> >> ...
> >
> > Thanks Maxime. My concern is that I am upstreaming in parallel in
> pf_bb_config in parallel hence would like to keep it unchanged if possible.
> > Given you don’t have a strong opinion is that okay to keep as is? Or I can
> force special value 1 for NOT_SUPPORTED so that this is explicitly defined. But
> really enum should always be used.
> 
> I don't understand. It should not have any impact on pf_bb_config, given
> pf_bb_config does not use DPDK.
> 
> Maxime

That device status is being shared from pf_bb_config to the bbdev PMD through PF2VF communications, hence they share that same enum. 


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 0/7] bbdev changes for 22.11
  2022-06-17 18:37     ` [PATCH v2 5/5] bbdev: add new operation for FFT processing Nicolas Chautru
                         ` (3 preceding siblings ...)
  2022-08-25 18:24       ` [PATCH v6 " Nicolas Chautru
@ 2022-08-29 18:07       ` Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
                           ` (8 more replies)
  2022-09-21 21:02       ` [PATCH v8 " Nic Chautru
                         ` (4 subsequent siblings)
  9 siblings, 9 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

v7: couple of typos in documentation spotted by Maxime. Thanks.
v6: added one comment in commit 2/7 suggested by Maxime.
v5: update base on review from Tom Rix. Number of typos reported and resolved,
removed the commit related to rw_lock for now, added a commit for
code clean up from review, resolved one rebase issue between 2 commits, used size of array for some bound check implementation. Thanks. 
v4: update to the last 2 commits to include function to print the queue status and a fix to the rte_lock within the wrong structure
v3: update to device status info to also use padded size for the related array.
Adding also 2 additionals commits to allow the API struc to expose more information related to queues corner cases/warning as well as an optional rw lock.
Hemant, Maxime, this is planned for DPDK 21.11 but would like review/ack early is possible to get this applied earlier and due to time off this summer.
Thanks
Nic

Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: remove unnecessary if-check

 app/test-bbdev/test_bbdev.c                        |   2 +-
 app/test-bbdev/test_bbdev_perf.c                   |   6 +-
 doc/guides/prog_guide/bbdev.rst                    | 130 +++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c           |  30 ++--
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |   9 ++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  10 +-
 drivers/baseband/null/bbdev_null.c                 |   1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
 examples/bbdev_app/main.c                          |   2 +-
 lib/bbdev/rte_bbdev.c                              |  57 +++++++-
 lib/bbdev/rte_bbdev.h                              | 149 +++++++++++++++++++-
 lib/bbdev/rte_bbdev_op.h                           | 156 ++++++++++++++++++++-
 lib/bbdev/version.map                              |  12 ++
 14 files changed, 556 insertions(+), 29 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 1/7] bbdev: allow operation type enum for growth
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 2/7] bbdev: add device status info Nicolas Chautru
                           ` (7 subsequent siblings)
  8 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Updating the enum for rte_bbdev_op_type
to allow to keep ABI compatible for enum insertion
while adding padded maximum value for array need.
Removing RTE_BBDEV_OP_TYPE_COUNT and instead exposing
RTE_BBDEV_OP_TYPE_PADDED_MAX.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 app/test-bbdev/test_bbdev.c      | 2 +-
 app/test-bbdev/test_bbdev_perf.c | 4 ++--
 examples/bbdev_app/main.c        | 2 +-
 lib/bbdev/rte_bbdev.c            | 8 +++++---
 lib/bbdev/rte_bbdev_op.h         | 2 +-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d73..1063f6e 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ struct bbdev_testsuite_params {
 	rte_mempool_free(mp);
 
 	TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-			RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+			RTE_BBDEV_OP_TYPE_PADDED_MAX, size, cache_size, 0)) == NULL,
 			"Failed test for rte_bbdev_op_pool_create: "
 			"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index fad3b1e..1abda2d 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2428,13 +2428,13 @@ typedef int (test_case_function)(struct active_device *ad,
 
 	/* Find capabilities */
 	const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-	for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+	do {
 		if (cap->type == test_vector.op_type) {
 			capabilities = cap;
 			break;
 		}
 		cap++;
-	}
+	} while (cap->type != RTE_BBDEV_OP_NONE);
 	TEST_ASSERT_NOT_NULL(capabilities,
 			"Couldn't find capabilities");
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8..ef0ba76 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ uint16_t bbdev_parse_number(const char *mask)
 	void *sigret;
 	struct app_config_params app_params = def_app_config;
 	struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+	struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
 	struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
 	struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7..4da8047 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ struct rte_mempool *
 		return NULL;
 	}
 
-	if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+	if (type >= BBDEV_OP_TYPE_COUNT) {
 		rte_bbdev_log(ERR,
 				"Invalid op type (%u), should be less than %u",
-				type, RTE_BBDEV_OP_TYPE_COUNT);
+				type, BBDEV_OP_TYPE_COUNT);
 		return NULL;
 	}
 
@@ -1125,7 +1127,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_LDPC_ENC",
 	};
 
-	if (op_type < RTE_BBDEV_OP_TYPE_COUNT)
+	if (op_type < BBDEV_OP_TYPE_COUNT)
 		return op_types[op_type];
 
 	rte_bbdev_log(ERR, "Invalid operation type");
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index 6d56133..cd82418 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -748,7 +748,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
-	RTE_BBDEV_OP_TYPE_COUNT,  /**< Count of different op types */
+	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
 /** Bit indexes of possible errors reported through status field */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 2/7] bbdev: add device status info
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-08-30  2:19           ` Zhang, Mingshan
                             ` (2 more replies)
  2022-08-29 18:07         ` [PATCH v7 3/7] bbdev: add device info on queue topology Nicolas Chautru
                           ` (6 subsequent siblings)
  8 siblings, 3 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
 drivers/baseband/null/bbdev_null.c                 |  1 +
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
 lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
 lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
 lib/bbdev/version.map                              |  7 +++++
 9 files changed, 68 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index de7e4bc..17ba798 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1060,6 +1060,7 @@
 
 	/* Read and save the populated config from ACC100 registers */
 	fetch_acc100_config(dev);
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* This isn't ideal because it reports the maximum number of queues but
 	 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 82ae6ba..57b12af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 21d3529..2a330c4 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	/* Calculates number of queues assigned to device */
 	dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index 4d1bd16..c1f88c6 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
 	dev_info->capabilities = bbdev_capabilities;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c b/drivers/baseband/null/bbdev_null.c
index 248e129..94a1976 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -82,6 +82,7 @@ struct bbdev_queue {
 	 * here for code completeness.
 	 */
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index af7bc41..dbc5524 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -254,6 +254,7 @@ struct turbo_sw_queue {
 	dev_info->min_alignment = 64;
 	dev_info->harq_buffer_size = 0;
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 4da8047..38630a2 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -1133,3 +1133,25 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid operation type");
 	return NULL;
 }
+
+const char *
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
+{
+	static const char * const dev_sta_string[] = {
+		"RTE_BBDEV_DEV_NOSTATUS",
+		"RTE_BBDEV_DEV_NOT_SUPPORTED",
+		"RTE_BBDEV_DEV_RESET",
+		"RTE_BBDEV_DEV_CONFIGURED",
+		"RTE_BBDEV_DEV_ACTIVE",
+		"RTE_BBDEV_DEV_FATAL_ERR",
+		"RTE_BBDEV_DEV_RESTART_REQ",
+		"RTE_BBDEV_DEV_RECONFIG_REQ",
+		"RTE_BBDEV_DEV_CORRECT_ERR",
+	};
+
+	if (status < sizeof(dev_sta_string) / sizeof(char *))
+		return dev_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid device status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index b88c881..9b1ffa4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the status of the device
+ */
+enum rte_bbdev_device_status {
+	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
+	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not supported on the PMD */
+	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-configured state */
+	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and ready to use */
+	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is being used */
+	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal uncorrectable error */
+	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application to restart */
+	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application to reconfigure queues */
+	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable error event happened */
+};
+
 /** Device statistics. */
 struct rte_bbdev_stats {
 	uint64_t enqueued_count;  /**< Count of all operations enqueued */
@@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
 	/** Set if device supports per-queue interrupts */
 	bool queue_intr_supported;
 	/** Minimum alignment of buffers, in bytes */
-	uint16_t min_alignment;
-	/** HARQ memory available in kB */
+	/** Device Status */
+	enum rte_bbdev_device_status device_status;
 	uint32_t harq_buffer_size;
 	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
 	 *  for input/output data
 	 */
+	uint16_t min_alignment;
+	/** HARQ memory available in kB */
 	uint8_t data_endianness;
 	/** Default queue configuration used if none is supplied  */
 	struct rte_bbdev_queue_conf default_queue_conf;
@@ -827,6 +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
 		void *data);
 
+/**
+ * Converts device status from enum to string
+ *
+ * @param status
+ *   Device status as enum
+ *
+ * @returns
+ *   Operation type as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index cce3f3c..f0a072e 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -39,3 +39,10 @@ DPDK_22 {
 
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	# added in 22.11
+	rte_bbdev_device_status_str;
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 3/7] bbdev: add device info on queue topology
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 1/7] bbdev: allow operation type enum for growth Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-08-29 18:07         ` [PATCH v7 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
                           ` (5 subsequent siblings)
  8 siblings, 0 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Adding more options in the API to expose the number
of queues exposed and related priority.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/bbdev/rte_bbdev.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 9b1ffa4..ac941d6 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
 	/** Maximum number of queues supported by the device */
 	unsigned int max_num_queues;
+	/** Maximum number of queues supported per operation type */
+	unsigned int num_queues[RTE_BBDEV_OP_TYPE_PADDED_MAX];
+	/** Priority level supported per operation type */
+	unsigned int queue_priority[RTE_BBDEV_OP_TYPE_PADDED_MAX];
 	/** Queue size limit (queue size must also be power of 2) */
 	uint32_t queue_size_lim;
 	/** Set if device off-loads operation to hardware  */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (2 preceding siblings ...)
  2022-08-29 18:07         ` [PATCH v7 3/7] bbdev: add device info on queue topology Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-08-30  4:44           ` Hemant Agrawal
  2022-09-21 19:00           ` [EXT] " Akhil Goyal
  2022-08-29 18:07         ` [PATCH v7 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
                           ` (4 subsequent siblings)
  8 siblings, 2 replies; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Add support in existing bbdev PMDs for the explicit number of queues
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c           | 29 +++++++++++++---------
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 ++++++
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  8 ++++++
 drivers/baseband/la12xx/bbdev_la12xx.c             |  7 ++++++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 11 ++++++++
 5 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 17ba798..f967e3f 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -966,6 +966,7 @@
 		struct rte_bbdev_driver_info *dev_info)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	int i;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
@@ -1062,19 +1063,23 @@
 	fetch_acc100_config(dev);
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-	/* This isn't ideal because it reports the maximum number of queues but
-	 * does not provide info on how many can be uplink/downlink or different
-	 * priorities
-	 */
-	dev_info->max_num_queues =
-			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_5g.num_qgroups +
-			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-			d->acc100_conf.q_ul_5g.num_qgroups +
-			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-			d->acc100_conf.q_dl_4g.num_qgroups +
-			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+	/* Expose number of queues */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
 			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+	dev_info->max_num_queues = 0;
+	for (i = RTE_BBDEV_OP_TURBO_DEC; i <= RTE_BBDEV_OP_LDPC_ENC; i++)
+		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
 	dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 57b12af..b4982af 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 2a330c4..dc7f479 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ struct __rte_cache_aligned fpga_queue {
 		if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
 			dev_info->max_num_queues++;
 	}
+	/* Expose number of queue per operation type */
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = dev_info->max_num_queues / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = 0;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c b/drivers/baseband/la12xx/bbdev_la12xx.c
index c1f88c6..e99ea9a 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -102,6 +102,13 @@ struct bbdev_la12xx_params {
 	dev_info->min_alignment = 64;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = LA12XX_MAX_QUEUES / 2;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 	rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index dbc5524..647e706 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -256,6 +256,17 @@ struct turbo_sw_queue {
 	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
+	const struct rte_bbdev_op_cap *op_cap = bbdev_capabilities;
+	int num_op_type = 0;
+	for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+		num_op_type++;
+	op_cap = bbdev_capabilities;
+	if (num_op_type > 0) {
+		int num_queue_per_type = dev_info->max_num_queues / num_op_type;
+		for (; op_cap->type != RTE_BBDEV_OP_NONE; ++op_cap)
+			dev_info->num_queues[op_cap->type] = num_queue_per_type;
+	}
+
 	rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 5/7] bbdev: add new operation for FFT processing
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (3 preceding siblings ...)
  2022-08-29 18:07         ` [PATCH v7 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-09-21 19:14           ` [EXT] " Akhil Goyal
  2022-08-29 18:07         ` [PATCH v7 6/7] bbdev: add queue related warning and status information Nicolas Chautru
                           ` (3 subsequent siblings)
  8 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 doc/guides/prog_guide/bbdev.rst | 130 +++++++++++++++++++++++++++++++++++
 lib/bbdev/rte_bbdev.c           |  10 ++-
 lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
 lib/bbdev/rte_bbdev_op.h        | 149 ++++++++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map           |   4 ++
 5 files changed, 368 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01a..5dcc7b5 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This operation allows to run a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+which can be used for FFT-based baseband signal processing.
+In more details it allows :
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+pointwise multiplication by a time domain window;
+- to process the related data through a DFT of adjustable size and de-padding for each such cyclic
+shift output.
+
+A flexible number of Rx antennas are being processed in parallel with the same configuration.
+The API allows more generally for flexibility in what the PMD may support (capability flags) and
+flexibility to adjust some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++--------------------------------------------------------------------+
+|Description of FFT capability flags                                 |
++====================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                             |
+| Set to enable/support windowing in time domain                     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                         |
+| Set to enable/support  the cyclic shift time offset adjustment     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                            |
+| Set to bypass the DFT and use directly the IDFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                           |
+| Set to bypass the IDFT and use directly the DFT as an option       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                      |
+| Set to bypass the time domain windowing  as an option              |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                            |
+| Set to provide an optional power measurement of the DFT output     |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                            |
+| Set if the input data shall use FP16 format instead of INT16       |
++--------------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                           |
+| Set if the output data shall use FP16 format instead of INT16      |
++--------------------------------------------------------------------+
+
+The structure passed for each FFT operation is given below,
+with the operation flags forming a bitmask in the ``op_flags`` field.
+
+.. code-block:: c
+
+    struct rte_bbdev_op_fft {
+        struct rte_bbdev_op_data base_input;
+        struct rte_bbdev_op_data base_output;
+        struct rte_bbdev_op_data power_meas_output;
+        uint32_t op_flags;
+        uint16_t input_sequence_size;
+        uint16_t input_leading_padding;
+        uint16_t output_sequence_size;
+        uint16_t output_leading_depadding;
+        uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+        uint16_t cs_bitmap;
+        uint8_t num_antennas_log2;
+        uint8_t idft_log2;
+        uint8_t dft_log2;
+        int8_t cs_time_adjustment;
+        int8_t idft_shift;
+        int8_t dft_shift;
+        uint16_t ncs_reciprocal;
+        uint16_t power_shift;
+        uint16_t fp16_exp_adjust;
+    };
+
+The FFT parameters are set out in the table below.
+
++-------------------------+--------------------------------------------------------------+
+|Parameter                |Description                                                   |
++=========================+==============================================================+
+|base_input               |input data                                                    |
++-------------------------+--------------------------------------------------------------+
+|base_output              |output data                                                   |
++-------------------------+--------------------------------------------------------------+
+|power_meas_output        |optional output data with power measurement on DFT output     |
++-------------------------+--------------------------------------------------------------+
+|op_flags                 |bitmask of all active operation capabilities                  |
++-------------------------+--------------------------------------------------------------+
+|input_sequence_size      |size of the input sequence in 32-bits points per antenna      |
++-------------------------+--------------------------------------------------------------+
+|input_leading_padding    |number of points padded at the start of input data            |
++-------------------------+--------------------------------------------------------------+
+|output_sequence_size     |size of the output sequence per antenna and cyclic shift      |
++-------------------------+--------------------------------------------------------------+
+|output_leading_depadding |number of points de-padded at the start of output data        |
++-------------------------+--------------------------------------------------------------+
+|window_index             |optional windowing profile index used for each cyclic shift   |
++-------------------------+--------------------------------------------------------------+
+|cs_bitmap                |bitmap of the cyclic shift output requested (LSB for index 0) |
++-------------------------+--------------------------------------------------------------+
+|num_antennas_log2        |number of antennas as a log2 (10 maps to 1024...)             |
++-------------------------+--------------------------------------------------------------+
+|idft_log2                |iDFT size as a log2                                           |
++-------------------------+--------------------------------------------------------------+
+|dft_log2                 |DFT size as a log2                                            |
++-------------------------+--------------------------------------------------------------+
+|cs_time_adjustment       |adjustment of time position of all the cyclic shift output    |
++-------------------------+--------------------------------------------------------------+
+|idft_shift               |shift down of signal level post iDFT                          |
++-------------------------+--------------------------------------------------------------+
+|dft_shift                |shift down of signal level post DFT                           |
++-------------------------+--------------------------------------------------------------+
+|ncs_reciprocal           |inverse of max number of CS normalized to 15b (ie. 231 for 12)|
++-------------------------+--------------------------------------------------------------+
+|power_shift              |shift down of level of power measurement when enabled         |
++-------------------------+--------------------------------------------------------------+
+|fp16_exp_adjust          |value added to FP16 exponent at conversion from INT16         |
++-------------------------+--------------------------------------------------------------+
+
+The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is the
+incoming data for the processing. Its size may not fit into an actual mbuf, but the
+structure is used to pass iova address.
+The mbuf output ``output`` is mandatory and is output of the FFT processing chain.
+Each point is a complex number of 32bits : either as 2 INT16 or as 2 FP16 based when the option
+supported.
+The data layout is based on contiguous concatenation of output data first by cyclic shift then
+by antenna.
 
 Sample code
 -----------
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 38630a2..9d65ba8 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -24,7 +24,7 @@
 #define DEV_NAME "BBDEV"
 
 /* Number of supported operation types */
-#define BBDEV_OP_TYPE_COUNT 5
+#define BBDEV_OP_TYPE_COUNT 6
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -852,6 +852,9 @@ struct rte_bbdev *
 	case RTE_BBDEV_OP_LDPC_ENC:
 		result = sizeof(struct rte_bbdev_enc_op);
 		break;
+	case RTE_BBDEV_OP_FFT:
+		result = sizeof(struct rte_bbdev_fft_op);
+		break;
 	default:
 		break;
 	}
@@ -875,6 +878,10 @@ struct rte_bbdev *
 		struct rte_bbdev_enc_op *op = element;
 		memset(op, 0, mempool->elt_size);
 		op->mempool = mempool;
+	} else if (type == RTE_BBDEV_OP_FFT) {
+		struct rte_bbdev_fft_op *op = element;
+		memset(op, 0, mempool->elt_size);
+		op->mempool = mempool;
 	}
 }
 
@@ -1125,6 +1132,7 @@ struct rte_mempool *
 		"RTE_BBDEV_OP_TURBO_ENC",
 		"RTE_BBDEV_OP_LDPC_DEC",
 		"RTE_BBDEV_OP_LDPC_ENC",
+		"RTE_BBDEV_OP_FFT",
 	};
 
 	if (op_type < BBDEV_OP_TYPE_COUNT)
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ac941d6..ed528b8 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -401,6 +401,12 @@ typedef uint16_t (*rte_bbdev_enqueue_dec_ops_t)(
 		struct rte_bbdev_dec_op **ops,
 		uint16_t num);
 
+/** @internal Enqueue fft operations for processing on queue of a device. */
+typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops,
+		uint16_t num);
+
 /** @internal Dequeue encode operations from a queue of a device. */
 typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
@@ -411,6 +417,11 @@ typedef uint16_t (*rte_bbdev_dequeue_dec_ops_t)(
 		struct rte_bbdev_queue_data *q_data,
 		struct rte_bbdev_dec_op **ops, uint16_t num);
 
+/** @internal Dequeue fft operations from a queue of a device. */
+typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)(
+		struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_fft_op **ops, uint16_t num);
+
 #define RTE_BBDEV_NAME_MAX_LEN  64  /**< Max length of device name */
 
 /**
@@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev {
 	rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops;
 	/** Dequeue decode function */
 	rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops;
+	/** Enqueue FFT function */
+	rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops;
+	/** Dequeue FFT function */
+	rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops;
 	const struct rte_bbdev_ops *dev_ops;  /**< Functions exported by PMD */
 	struct rte_bbdev_data *data;  /**< Pointer to device data */
 	enum rte_bbdev_state state;  /**< If device is currently used or not */
@@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Enqueue a burst of fft operations to a queue of the device.
+ * This functions only enqueues as many operations as currently possible and
+ * does not block until @p num_ops entries in the queue are available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array containing operations to be enqueued Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to enqueue.
+ *
+ * @return
+ *   The number of operations actually enqueued (this is the number of processed
+ *   entries in the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->enqueue_fft_ops(q_data, ops, num_ops);
+}
 
 /**
  * Dequeue a burst of processed encode operations from a queue of the device.
@@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
 	return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
 }
 
+/**
+ * Dequeue a burst of fft operations from a queue of the device.
+ * This functions returns only the current contents of the queue, and does not
+ * block until @ num_ops is available.
+ * This function does not provide any error notification to avoid the
+ * corresponding overhead.
+ *
+ * @param dev_id
+ *   The identifier of the device.
+ * @param queue_id
+ *   The index of the queue.
+ * @param ops
+ *   Pointer array where operations will be dequeued to. Must have at least
+ *   @p num_ops entries
+ * @param num_ops
+ *   The maximum number of operations to dequeue.
+ *
+ * @return
+ *   The number of operations actually dequeued (this is the number of entries
+ *   copied into the @p ops array).
+ */
+__rte_experimental
+static inline uint16_t
+rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
+	struct rte_bbdev_queue_data *q_data = &dev->data->queues[queue_id];
+	return dev->dequeue_fft_ops(q_data, ops, num_ops);
+}
+
 /** Definitions of device event types */
 enum rte_bbdev_event_type {
 	RTE_BBDEV_EVENT_UNKNOWN,  /**< unknown event type */
diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index cd82418..afa1a71 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -47,6 +47,8 @@
 #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64)
 /* LDPC:  Maximum number of Code Blocks in Transport Block.*/
 #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256)
+/* 12 CS maximum */
+#define RTE_BBDEV_MAX_CS_2 (6)
 
 /** Flags for turbo decoder operation and capability structure */
 enum rte_bbdev_op_td_flag_bitmasks {
@@ -211,6 +213,26 @@ enum rte_bbdev_op_ldpcenc_flag_bitmasks {
 	RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7)
 };
 
+/** Flags for DFT operation and capability structure */
+enum rte_bbdev_op_fft_flag_bitmasks {
+	/** Flexible windowing capability */
+	RTE_BBDEV_FFT_WINDOWING = (1ULL << 0),
+	/** Flexible adjustment of Cyclic Shift time offset */
+	RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1),
+	/** Set for bypass the DFT and get directly into iDFT input */
+	RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2),
+	/** Set for bypass the IDFT and get directly the DFT output */
+	RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3),
+	/** Set for bypass time domain windowing */
+	RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4),
+	/** Set for optional power measurement on DFT output */
+	RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5),
+	/** Set if the input data used FP16 format */
+	RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
+	/**  Set if the output data uses FP16 format  */
+	RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7)
+};
+
 /** Flags for the Code Block/Transport block mode  */
 enum rte_bbdev_op_cb_mode {
 	/** One operation is one or fraction of one transport block  */
@@ -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
 	};
 };
 
+/** Operation structure for FFT processing.
+ *
+ * The operation processes the data for multiple antennas in a single call
+ * (.i.e for all the REs belonging to a given SRS sequence for instance)
+ *
+ * The output mbuf data structure is expected to be allocated by the
+ * application with enough room for the output data.
+ */
+struct rte_bbdev_op_fft {
+	/** Input data starting from first antenna */
+	struct rte_bbdev_op_data base_input;
+	/** Output data starting from first antenna and first cyclic shift */
+	struct rte_bbdev_op_data base_output;
+	/** Optional power measurement output data */
+	struct rte_bbdev_op_data power_meas_output;
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t op_flags;
+	/** Input sequence size in 32-bits points */
+	uint16_t input_sequence_size;
+	/** Padding at the start of the sequence */
+	uint16_t input_leading_padding;
+	/** Output sequence size in 32-bits points */
+	uint16_t output_sequence_size;
+	/** Depadding at the start of the DFT output */
+	uint16_t output_leading_depadding;
+	/** Window index being used for each cyclic shift output */
+	uint8_t window_index[RTE_BBDEV_MAX_CS_2];
+	/** Bitmap of the cyclic shift output requested */
+	uint16_t cs_bitmap;
+	/** Number of antennas as a log2 – 8 to 128 */
+	uint8_t num_antennas_log2;
+	/** iDFT size as a log2 - 32 to 2048 */
+	uint8_t idft_log2;
+	/** DFT size as a log2 - 8 to 2048 */
+	uint8_t dft_log2;
+	/** Adjustment of position of the cyclic shifts - -31 to 31 */
+	int8_t cs_time_adjustment;
+	/** iDFT shift down */
+	int8_t idft_shift;
+	/** DFT shift down */
+	int8_t dft_shift;
+	/** NCS reciprocal factor  */
+	uint16_t ncs_reciprocal;
+	/** power measurement out shift down */
+	uint16_t power_shift;
+	/** Adjust the FP6 exponent for INT<->FP16 conversion */
+	uint16_t fp16_exp_adjust;
+};
+
 /** List of the capabilities for the Turbo Decoder */
 struct rte_bbdev_op_cap_turbo_dec {
 	/** Flags from rte_bbdev_op_td_flag_bitmasks */
@@ -741,6 +812,16 @@ struct rte_bbdev_op_cap_ldpc_enc {
 	uint16_t num_buffers_dst;
 };
 
+/** List of the capabilities for the FFT */
+struct rte_bbdev_op_cap_fft {
+	/** Flags from rte_bbdev_op_fft_flag_bitmasks */
+	uint32_t capability_flags;
+	/** Num input code block buffers */
+	uint16_t num_buffers_src;
+	/** Num output code block buffers */
+	uint16_t num_buffers_dst;
+};
+
 /** Different operation types supported by the device */
 enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_NONE,  /**< Dummy operation that does nothing */
@@ -748,6 +829,7 @@ enum rte_bbdev_op_type {
 	RTE_BBDEV_OP_TURBO_ENC,  /**< Turbo encode */
 	RTE_BBDEV_OP_LDPC_DEC,  /**< LDPC decode */
 	RTE_BBDEV_OP_LDPC_ENC,  /**< LDPC encode */
+	RTE_BBDEV_OP_FFT,  /**< FFT */
 	RTE_BBDEV_OP_TYPE_PADDED_MAX = 8,  /**< Maximum op type number including padding */
 };
 
@@ -791,6 +873,18 @@ struct rte_bbdev_dec_op {
 	};
 };
 
+/** Structure specifying a single fft operation */
+struct rte_bbdev_fft_op {
+	/** Status of operation that was performed */
+	int status;
+	/** Mempool which op instance is in */
+	struct rte_mempool *mempool;
+	/** Opaque pointer for user data */
+	void *opaque_data;
+	/** Contains turbo decoder specific parameters */
+	struct rte_bbdev_op_fft fft;
+};
+
 /** Operation capabilities supported by a device */
 struct rte_bbdev_op_cap {
 	enum rte_bbdev_op_type type;  /**< Type of operation */
@@ -799,6 +893,7 @@ struct rte_bbdev_op_cap {
 		struct rte_bbdev_op_cap_turbo_enc turbo_enc;
 		struct rte_bbdev_op_cap_ldpc_dec ldpc_dec;
 		struct rte_bbdev_op_cap_ldpc_enc ldpc_enc;
+		struct rte_bbdev_op_cap_fft fft;
 	} cap;  /**< Operation-type specific capabilities */
 };
 
@@ -918,6 +1013,42 @@ struct rte_mempool *
 }
 
 /**
+ * Bulk allocate fft operations from a mempool with parameter defaults reset.
+ *
+ * @param mempool
+ *   Operation mempool, created by rte_bbdev_op_pool_create().
+ * @param ops
+ *   Output array to place allocated operations
+ * @param num_ops
+ *   Number of operations to allocate
+ *
+ * @returns
+ *   - 0 on success
+ *   - EINVAL if invalid mempool is provided
+ */
+__rte_experimental
+static inline int
+rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
+		struct rte_bbdev_fft_op **ops, uint16_t num_ops)
+{
+	struct rte_bbdev_op_pool_private *priv;
+	int ret;
+
+	/* Check type */
+	priv = (struct rte_bbdev_op_pool_private *)
+			rte_mempool_get_priv(mempool);
+	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
+		return -EINVAL;
+
+	/* Get elements */
+	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
+	if (unlikely(ret < 0))
+		return ret;
+
+	return 0;
+}
+
+/**
  * Free decode operation structures that were allocated by
  * rte_bbdev_dec_op_alloc_bulk().
  * All structures must belong to the same mempool.
@@ -951,6 +1082,24 @@ struct rte_mempool *
 		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
 }
 
+/**
+ * Free encode operation structures that were allocated by
+ * rte_bbdev_fft_op_alloc_bulk().
+ * All structures must belong to the same mempool.
+ *
+ * @param ops
+ *   Operation structures
+ * @param num_ops
+ *   Number of structures
+ */
+__rte_experimental
+static inline void
+rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned int num_ops)
+{
+	if (num_ops > 0)
+		rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, num_ops);
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index f0a072e..0cbeab3 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -45,4 +45,8 @@ EXPERIMENTAL {
 
 	# added in 22.11
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_fft_ops;
+	rte_bbdev_dequeue_fft_ops;
+	rte_bbdev_fft_op_alloc_bulk;
+	rte_bbdev_fft_op_free_bulk;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 6/7] bbdev: add queue related warning and status information
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (4 preceding siblings ...)
  2022-08-29 18:07         ` [PATCH v7 5/7] bbdev: add new operation for FFT processing Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-09-21 19:21           ` [EXT] " Akhil Goyal
  2022-08-29 18:07         ` [PATCH v7 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
                           ` (2 subsequent siblings)
  8 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

This allows to expose more information with regards to any
queue related failure and warning which cannot be supported
in existing API.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 app/test-bbdev/test_bbdev_perf.c |  2 ++
 lib/bbdev/rte_bbdev.c            | 19 +++++++++++++++++++
 lib/bbdev/rte_bbdev.h            | 34 ++++++++++++++++++++++++++++++++++
 lib/bbdev/version.map            |  1 +
 4 files changed, 56 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 1abda2d..653b21f 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4360,6 +4360,8 @@ typedef int (test_case_function)(struct active_device *ad,
 	stats->dequeued_count = q_stats->dequeued_count;
 	stats->enqueue_err_count = q_stats->enqueue_err_count;
 	stats->dequeue_err_count = q_stats->dequeue_err_count;
+	stats->enqueue_warning_count = q_stats->enqueue_warning_count;
+	stats->dequeue_warning_count = q_stats->dequeue_warning_count;
 	stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
 	return 0;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 9d65ba8..bdd7c2f 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -721,6 +721,8 @@ struct rte_bbdev *
 		stats->dequeued_count += q_stats->dequeued_count;
 		stats->enqueue_err_count += q_stats->enqueue_err_count;
 		stats->dequeue_err_count += q_stats->dequeue_err_count;
+		stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+		stats->dequeue_warn_count += q_stats->dequeue_warn_count;
 	}
 	rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
@@ -1163,3 +1165,20 @@ struct rte_mempool *
 	rte_bbdev_log(ERR, "Invalid device status");
 	return NULL;
 }
+
+const char *
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
+{
+	static const char * const enq_sta_string[] = {
+		"RTE_BBDEV_ENQ_STATUS_NONE",
+		"RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
+		"RTE_BBDEV_ENQ_STATUS_RING_FULL",
+		"RTE_BBDEV_ENQ_STATUS_INVALID_OP",
+	};
+
+	if (status < sizeof(enq_sta_string) / sizeof(char *))
+		return enq_sta_string[status];
+
+	rte_bbdev_log(ERR, "Invalid enqueue status");
+	return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index ed528b8..b7ecf94 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -224,6 +224,19 @@ struct rte_bbdev_queue_conf {
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
 /**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consumed all requested operations
+ * In case of multiple reasons the latter superdes a previous one
+ */
+enum rte_bbdev_enqueue_status {
+	RTE_BBDEV_ENQ_STATUS_NONE,             /**< Nothing to report */
+	RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,       /**< Not enough room in queue */
+	RTE_BBDEV_ENQ_STATUS_RING_FULL,        /**< Not enough room in ring */
+	RTE_BBDEV_ENQ_STATUS_INVALID_OP,       /**< Operation was rejected as invalid */
+	RTE_BBDEV_ENQ_STATUS_PADDED_MAX = 6,   /**< Maximum enq status number including padding */
+};
+
+/**
  * Flags indicate the status of the device
  */
 enum rte_bbdev_device_status {
@@ -246,6 +259,12 @@ struct rte_bbdev_stats {
 	uint64_t enqueue_err_count;
 	/** Total error count on operations dequeued */
 	uint64_t dequeue_err_count;
+	/** Total warning count on operations enqueued */
+	uint64_t enqueue_warn_count;
+	/** Total warning count on operations dequeued */
+	uint64_t dequeue_warn_count;
+	/** Total enqueue status count based on rte_bbdev_enqueue_status enum */
+	uint64_t enqueue_status_count[RTE_BBDEV_ENQ_STATUS_PADDED_MAX];
 	/** CPU cycles consumed by the (HW/SW) accelerator device to offload
 	 *  the enqueue request to its internal queues.
 	 *  - For a HW device this is the cycles consumed in MMIO write
@@ -386,6 +405,7 @@ struct rte_bbdev_queue_data {
 	void *queue_private;  /**< Driver-specific per-queue data */
 	struct rte_bbdev_queue_conf conf;  /**< Current configuration */
 	struct rte_bbdev_stats queue_stats;  /**< Queue statistics */
+	enum rte_bbdev_enqueue_status enqueue_status; /**< Enqueue status when op is rejected */
 	bool started;  /**< Queue state */
 };
 
@@ -938,6 +958,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
 const char*
 rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
 
+/**
+ * Converts queue status from enum to string
+ *
+ * @param status
+ *   Queue status as enum
+ *
+ * @returns
+ *  Queue status as string or NULL if op_type is invalid
+ *
+ */
+__rte_experimental
+const char*
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map
index 0cbeab3..f5e2dd7 100644
--- a/lib/bbdev/version.map
+++ b/lib/bbdev/version.map
@@ -45,6 +45,7 @@ EXPERIMENTAL {
 
 	# added in 22.11
 	rte_bbdev_device_status_str;
+	rte_bbdev_enqueue_status_str;
 	rte_bbdev_enqueue_fft_ops;
 	rte_bbdev_dequeue_fft_ops;
 	rte_bbdev_fft_op_alloc_bulk;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* [PATCH v7 7/7] bbdev: remove unnecessary if-check
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (5 preceding siblings ...)
  2022-08-29 18:07         ` [PATCH v7 6/7] bbdev: add queue related warning and status information Nicolas Chautru
@ 2022-08-29 18:07         ` Nicolas Chautru
  2022-09-21 19:25           ` [EXT] " Akhil Goyal
  2022-08-30  4:45         ` [PATCH v7 0/7] bbdev changes for 22.11 Hemant Agrawal
  2022-09-06 16:47         ` Chautru, Nicolas
  8 siblings, 1 reply; 174+ messages in thread
From: Nicolas Chautru @ 2022-08-29 18:07 UTC (permalink / raw)
  To: dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang, Nicolas Chautru

Code clean up due to if-check not required

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/bbdev/rte_bbdev_op.h | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h
index afa1a71..386eed8 100644
--- a/lib/bbdev/rte_bbdev_op.h
+++ b/lib/bbdev/rte_bbdev_op.h
@@ -970,10 +970,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1006,10 +1004,8 @@ struct rte_mempool *
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
@@ -1035,17 +1031,14 @@ struct rte_mempool *
 	int ret;
 
 	/* Check type */
-	priv = (struct rte_bbdev_op_pool_private *)
-			rte_mempool_get_priv(mempool);
+	priv = (struct rte_bbdev_op_pool_private *) rte_mempool_get_priv(mempool);
 	if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
 		return -EINVAL;
 
 	/* Get elements */
 	ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
-	if (unlikely(ret < 0))
-		return ret;
 
-	return 0;
+	return ret;
 }
 
 /**
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v6 5/7] bbdev: add new operation for FFT processing
  2022-08-26 12:07           ` Maxime Coquelin
@ 2022-08-29 18:18             ` Chautru, Nicolas
  0 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-29 18:18 UTC (permalink / raw)
  To: Maxime Coquelin, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, August 26, 2022 5:08 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v6 5/7] bbdev: add new operation for FFT processing
> 
> 
> 
> On 8/25/22 20:24, Nicolas Chautru wrote:
> > Extension of bbdev operation to support FFT based operations.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> > ---
> >   doc/guides/prog_guide/bbdev.rst | 130
> +++++++++++++++++++++++++++++++++++
> >   lib/bbdev/rte_bbdev.c           |  10 ++-
> >   lib/bbdev/rte_bbdev.h           |  76 ++++++++++++++++++++
> >   lib/bbdev/rte_bbdev_op.h        | 149
> ++++++++++++++++++++++++++++++++++++++++
> >   lib/bbdev/version.map           |   4 ++
> >   5 files changed, 368 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/prog_guide/bbdev.rst
> > b/doc/guides/prog_guide/bbdev.rst index 70fa01a..150161b 100644
> > --- a/doc/guides/prog_guide/bbdev.rst
> > +++ b/doc/guides/prog_guide/bbdev.rst
> > @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
> >   showing the Turbo decoding of CBs using BBDEV interface in TB-mode
> >   is also valid for LDPC decode.
> >
> > +BBDEV FFT Operation
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +This operation allows to run a combination of DFT and/or IDFT and/or time-
> domain windowing.
> > +These can be used in a modular fashion (using bypass modes) or as a
> > +processing pipeline which can be used for FFT-based baseband signal
> processing.
> > +In more details it allows :
> > +- to process the data first through an IDFT of adjustable size and
> > +padding;
> > +- to perform the windowing as a programmable cyclic shift offset of
> > +the data followed by a pointwise multiplication by a time domain
> > +window;
> > +- to process the related data through a DFT of adjustable size and
> > +depadding for each such cyclic
> 
> depadding?

This is the opposite processing of padding bits to the buffer (take out such bits either leading or trailing). 
From googling around, this is more often written as "de-padding" so will correct.

> 
> > +shift output.
> > +
> > +A flexible number of Rx antennas are being processed in parallel with the
> same configuration.
> > +The API allows more generally for flexibility in what the PMD may
> > +support (cabability flags) and
> 
> s/cabability/capability/
> 

Thanks! I will fix now

> With above typos fixed:
> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> Thanks,
> Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v7 2/7] bbdev: add device status info
  2022-08-29 18:07         ` [PATCH v7 2/7] bbdev: add device status info Nicolas Chautru
@ 2022-08-30  2:19           ` Zhang, Mingshan
  2022-08-30  4:43           ` Hemant Agrawal
  2022-09-21 18:54           ` [EXT] " Akhil Goyal
  2 siblings, 0 replies; 174+ messages in thread
From: Zhang, Mingshan @ 2022-08-30  2:19 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, Richardson, Bruce, david.marchand, stephen

Acked-by: Mingshan Zhang <mingshan.zhang@intel.com>

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Tuesday, August 30, 2022 2:07 AM
> To: dev@dpdk.org; thomas@monjalon.net; gakhil@marvell.com;
> hemant.agrawal@nxp.com
> Cc: maxime.coquelin@redhat.com; trix@redhat.com; mdr@ashroe.eu;
> Richardson, Bruce <bruce.richardson@intel.com>;
> david.marchand@redhat.com; stephen@networkplumber.org; Zhang,
> Mingshan <mingshan.zhang@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [PATCH v7 2/7] bbdev: add device status info
> 
> Added device status information, so that the PMD can expose information
> related to the underlying accelerator device status.
> Minor order change in structure to fit into padding hole.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>  drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>  drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>  drivers/baseband/null/bbdev_null.c                 |  1 +
>  drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>  lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
>  lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>  lib/bbdev/version.map                              |  7 +++++
>  9 files changed, 68 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> index de7e4bc..17ba798 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -1060,6 +1060,7 @@
> 
>  	/* Read and save the populated config from ACC100 registers */
>  	fetch_acc100_config(dev);
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	/* This isn't ideal because it reports the maximum number of queues
> but
>  	 * does not provide info on how many can be uplink/downlink or
> different diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> index 82ae6ba..57b12af 100644
> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> @@ -369,6 +369,7 @@
>  	dev_info->capabilities = bbdev_capabilities;
>  	dev_info->cpu_flag_reqs = NULL;
>  	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	/* Calculates number of queues assigned to device */
>  	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> index 21d3529..2a330c4 100644
> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>  	dev_info->capabilities = bbdev_capabilities;
>  	dev_info->cpu_flag_reqs = NULL;
>  	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	/* Calculates number of queues assigned to device */
>  	dev_info->max_num_queues = 0;
> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
> b/drivers/baseband/la12xx/bbdev_la12xx.c
> index 4d1bd16..c1f88c6 100644
> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>  	dev_info->capabilities = bbdev_capabilities;
>  	dev_info->cpu_flag_reqs = NULL;
>  	dev_info->min_alignment = 64;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	rte_bbdev_log_debug("got device info from %u", dev->data-
> >dev_id);  } diff --git a/drivers/baseband/null/bbdev_null.c
> b/drivers/baseband/null/bbdev_null.c
> index 248e129..94a1976 100644
> --- a/drivers/baseband/null/bbdev_null.c
> +++ b/drivers/baseband/null/bbdev_null.c
> @@ -82,6 +82,7 @@ struct bbdev_queue {
>  	 * here for code completeness.
>  	 */
>  	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	rte_bbdev_log_debug("got device info from %u", dev->data-
> >dev_id);  } diff --git
> a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> index af7bc41..dbc5524 100644
> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>  	dev_info->min_alignment = 64;
>  	dev_info->harq_buffer_size = 0;
>  	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
> 
>  	rte_bbdev_log_debug("got device info from %u\n", dev->data-
> >dev_id);  } diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
> index 4da8047..38630a2 100644
> --- a/lib/bbdev/rte_bbdev.c
> +++ b/lib/bbdev/rte_bbdev.c
> @@ -1133,3 +1133,25 @@ struct rte_mempool *
>  	rte_bbdev_log(ERR, "Invalid operation type");
>  	return NULL;
>  }
> +
> +const char *
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
> +	static const char * const dev_sta_string[] = {
> +		"RTE_BBDEV_DEV_NOSTATUS",
> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> +		"RTE_BBDEV_DEV_RESET",
> +		"RTE_BBDEV_DEV_CONFIGURED",
> +		"RTE_BBDEV_DEV_ACTIVE",
> +		"RTE_BBDEV_DEV_FATAL_ERR",
> +		"RTE_BBDEV_DEV_RESTART_REQ",
> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> +	};
> +
> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
> +		return dev_sta_string[status];
> +
> +	rte_bbdev_log(ERR, "Invalid device status");
> +	return NULL;
> +}
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> b88c881..9b1ffa4 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {  int
> rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> 
> +/**
> + * Flags indicate the status of the device  */ enum
> +rte_bbdev_device_status {
> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
> supported on the PMD */
> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> configured state */
> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
> ready to use */
> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
> being used */
> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> uncorrectable error */
> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application
> to restart */
> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires application
> to reconfigure queues */
> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
> error event happened */
> +};
> +
>  /** Device statistics. */
>  struct rte_bbdev_stats {
>  	uint64_t enqueued_count;  /**< Count of all operations enqueued
> */ @@ -285,12 +300,14 @@ struct rte_bbdev_driver_info {
>  	/** Set if device supports per-queue interrupts */
>  	bool queue_intr_supported;
>  	/** Minimum alignment of buffers, in bytes */
> -	uint16_t min_alignment;
> -	/** HARQ memory available in kB */
> +	/** Device Status */
> +	enum rte_bbdev_device_status device_status;
>  	uint32_t harq_buffer_size;
>  	/** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN)
> supported
>  	 *  for input/output data
>  	 */
> +	uint16_t min_alignment;
> +	/** HARQ memory available in kB */
>  	uint8_t data_endianness;
>  	/** Default queue configuration used if none is supplied  */
>  	struct rte_bbdev_queue_conf default_queue_conf; @@ -827,6
> +844,20 @@ typedef void (*rte_bbdev_cb_fn)(uint16_t dev_id,
> rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int
> op,
>  		void *data);
> 
> +/**
> + * Converts device status from enum to string
> + *
> + * @param status
> + *   Device status as enum
> + *
> + * @returns
> + *   Operation type as string or NULL if op_type is invalid
> + *
> + */
> +__rte_experimental
> +const char*
> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index
> cce3f3c..f0a072e 100644
> --- a/lib/bbdev/version.map
> +++ b/lib/bbdev/version.map
> @@ -39,3 +39,10 @@ DPDK_22 {
> 
>  	local: *;
>  };
> +
> +EXPERIMENTAL {
> +	global:
> +
> +	# added in 22.11
> +	rte_bbdev_device_status_str;
> +};
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v7 2/7] bbdev: add device status info
  2022-08-29 18:07         ` [PATCH v7 2/7] bbdev: add device status info Nicolas Chautru
  2022-08-30  2:19           ` Zhang, Mingshan
@ 2022-08-30  4:43           ` Hemant Agrawal
  2022-09-21 18:54           ` [EXT] " Akhil Goyal
  2 siblings, 0 replies; 174+ messages in thread
From: Hemant Agrawal @ 2022-08-30  4:43 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang


On 8/29/2022 11:37 PM, Nicolas Chautru wrote:
> Added device status information, so that the PMD can
> expose information related to the underlying accelerator device status.
> Minor order change in structure to fit into padding hole.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v7 4/7] drivers/baseband: update PMDs to expose queue per operation
  2022-08-29 18:07         ` [PATCH v7 4/7] drivers/baseband: update PMDs to expose queue per operation Nicolas Chautru
@ 2022-08-30  4:44           ` Hemant Agrawal
  2022-09-21 19:00           ` [EXT] " Akhil Goyal
  1 sibling, 0 replies; 174+ messages in thread
From: Hemant Agrawal @ 2022-08-30  4:44 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

On 8/29/2022 11:37 PM, Nicolas Chautru wrote:
> Add support in existing bbdev PMDs for the explicit number of queues
> and priority for each operation type configured on the device.
>
>   

^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v7 0/7] bbdev changes for 22.11
  2022-08-29 18:07       ` [PATCH v7 0/7] bbdev changes for 22.11 Nicolas Chautru
                           ` (6 preceding siblings ...)
  2022-08-29 18:07         ` [PATCH v7 7/7] bbdev: remove unnecessary if-check Nicolas Chautru
@ 2022-08-30  4:45         ` Hemant Agrawal
  2022-09-06 16:47         ` Chautru, Nicolas
  8 siblings, 0 replies; 174+ messages in thread
From: Hemant Agrawal @ 2022-08-30  4:45 UTC (permalink / raw)
  To: Nicolas Chautru, dev, thomas, gakhil, hemant.agrawal
  Cc: maxime.coquelin, trix, mdr, bruce.richardson, david.marchand,
	stephen, mingshan.zhang

Series-

Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>


^ permalink raw reply	[flat|nested] 174+ messages in thread

* Re: [PATCH v5 2/7] bbdev: add device status info
  2022-08-29 16:10                 ` Chautru, Nicolas
@ 2022-08-30  7:08                   ` Maxime Coquelin
  2022-08-30 19:38                     ` Chautru, Nicolas
  0 siblings, 1 reply; 174+ messages in thread
From: Maxime Coquelin @ 2022-08-30  7:08 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen



On 8/29/22 18:10, Chautru, Nicolas wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Friday, August 26, 2022 3:13 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>> stephen@networkplumber.org
>> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
>>
>> Hi,
>>
>> On 8/25/22 20:30, Chautru, Nicolas wrote:
>>> Thanks Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Thursday, August 25, 2022 7:19 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>>>> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
>>>> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
>>>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
>>>> stephen@networkplumber.org
>>>> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
>>>>
>>>>
>>>>
>>>> On 7/7/22 01:28, Nicolas Chautru wrote:
>>>>> Added device status information, so that the PMD can expose
>>>>> information related to the underlying accelerator device status.
>>>>> Minor order change in structure to fit into padding hole.
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>     drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
>>>>>     drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
>>>>>     drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
>>>>>     drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
>>>>>     drivers/baseband/null/bbdev_null.c                 |  1 +
>>>>>     drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
>>>>>     lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
>>>>>     lib/bbdev/rte_bbdev.h                              | 35 ++++++++++++++++++++--
>>>>>     lib/bbdev/version.map                              |  6 ++++
>>>>>     9 files changed, 67 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> index de7e4bc..17ba798 100644
>>>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>>>> @@ -1060,6 +1060,7 @@
>>>>>
>>>>>     	/* Read and save the populated config from ACC100 registers */
>>>>>     	fetch_acc100_config(dev);
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* This isn't ideal because it reports the maximum number of
>>>>> queues
>>>> but
>>>>>     	 * does not provide info on how many can be uplink/downlink or
>>>>> different diff --git
>>>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> index 82ae6ba..57b12af 100644
>>>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
>>>>> @@ -369,6 +369,7 @@
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* Calculates number of queues assigned to device */
>>>>>     	dev_info->max_num_queues = 0;
>>>>> diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> index 21d3529..2a330c4 100644
>>>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
>>>>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	/* Calculates number of queues assigned to device */
>>>>>     	dev_info->max_num_queues = 0;
>>>>> diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> index 4d1bd16..c1f88c6 100644
>>>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
>>>>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
>>>>>     	dev_info->capabilities = bbdev_capabilities;
>>>>>     	dev_info->cpu_flag_reqs = NULL;
>>>>>     	dev_info->min_alignment = 64;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data-
>>>>> dev_id);
>>>>>     }
>>>>> diff --git a/drivers/baseband/null/bbdev_null.c
>>>>> b/drivers/baseband/null/bbdev_null.c
>>>>> index 248e129..94a1976 100644
>>>>> --- a/drivers/baseband/null/bbdev_null.c
>>>>> +++ b/drivers/baseband/null/bbdev_null.c
>>>>> @@ -82,6 +82,7 @@ struct bbdev_queue {
>>>>>     	 * here for code completeness.
>>>>>     	 */
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data-
>>>>> dev_id);
>>>>>     }
>>>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> index af7bc41..dbc5524 100644
>>>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
>>>>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
>>>>>     	dev_info->min_alignment = 64;
>>>>>     	dev_info->harq_buffer_size = 0;
>>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
>>>>> +	dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>>>>>
>>>>>     	rte_bbdev_log_debug("got device info from %u\n", dev->data-
>>>>> dev_id);
>>>>>     }
>>>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
>>>>> 4da8047..38630a2 100644
>>>>> --- a/lib/bbdev/rte_bbdev.c
>>>>> +++ b/lib/bbdev/rte_bbdev.c
>>>>> @@ -1133,3 +1133,25 @@ struct rte_mempool *
>>>>>     	rte_bbdev_log(ERR, "Invalid operation type");
>>>>>     	return NULL;
>>>>>     }
>>>>> +
>>>>> +const char *
>>>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status) {
>>>>> +	static const char * const dev_sta_string[] = {
>>>>> +		"RTE_BBDEV_DEV_NOSTATUS",
>>>>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
>>>>> +		"RTE_BBDEV_DEV_RESET",
>>>>> +		"RTE_BBDEV_DEV_CONFIGURED",
>>>>> +		"RTE_BBDEV_DEV_ACTIVE",
>>>>> +		"RTE_BBDEV_DEV_FATAL_ERR",
>>>>> +		"RTE_BBDEV_DEV_RESTART_REQ",
>>>>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
>>>>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
>>>>> +	};
>>>>> +
>>>>> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
>>>>> +		return dev_sta_string[status];
>>>>> +
>>>>> +	rte_bbdev_log(ERR, "Invalid device status");
>>>>> +	return NULL;
>>>>> +}
>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>> b88c881..9b1ffa4 100644
>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
>>>>>     int
>>>>>     rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
>>>>>
>>>>> +/**
>>>>> + * Flags indicate the status of the device  */ enum
>>>>> +rte_bbdev_device_status {
>>>>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being reported */
>>>>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is not
>>>> supported on the PMD */
>>>>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
>>>> configured state */
>>>>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is configured and
>>>> ready to use */
>>>>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured and VF is
>>>> being used */
>>>>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
>>>> uncorrectable error */
>>>>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires application
>>>> to restart */
>>>>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires
>>>> application to reconfigure queues */
>>>>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a correctable
>>>> error event happened */
>>>>> +};
>>>>
>>>> I don't have a strong opinion on this, but I think NOT_SUPPORTED
>>>> should be a special value. If you want to keep 0 value for NOSTATUS,
>>>> maybe you could
>>>> do:
>>>>
>>>> enum rte_bbdev_device_status {
>>>> 	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not
>>>> supported
>>>> on the PMD */
>>>> 	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported
>>>> */
>>>> 	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
>>>> configured
>>>> state */
>>>> ...
>>>
>>> Thanks Maxime. My concern is that I am upstreaming in parallel in
>> pf_bb_config in parallel hence would like to keep it unchanged if possible.
>>> Given you don’t have a strong opinion is that okay to keep as is? Or I can
>> force special value 1 for NOT_SUPPORTED so that this is explicitly defined. But
>> really enum should always be used.
>>
>> I don't understand. It should not have any impact on pf_bb_config, given
>> pf_bb_config does not use DPDK.
>>
>> Maxime
> 
> That device status is being shared from pf_bb_config to the bbdev PMD through PF2VF communications, hence they share that same enum.
> 

Ok, but generic DPDK ABI should not be dependent on a vendor internal
implementation IMHO.

Maxime


^ permalink raw reply	[flat|nested] 174+ messages in thread

* RE: [PATCH v5 2/7] bbdev: add device status info
  2022-08-30  7:08                   ` Maxime Coquelin
@ 2022-08-30 19:38                     ` Chautru, Nicolas
  0 siblings, 0 replies; 174+ messages in thread
From: Chautru, Nicolas @ 2022-08-30 19:38 UTC (permalink / raw)
  To: Maxime Coquelin, dev, thomas, gakhil, hemant.agrawal
  Cc: trix, mdr, Richardson, Bruce, david.marchand, stephen

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, August 30, 2022 12:09 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org
> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> 
> 
> 
> On 8/29/22 18:10, Chautru, Nicolas wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Friday, August 26, 2022 3:13 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> thomas@monjalon.net; gakhil@marvell.com; hemant.agrawal@nxp.com
> >> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >> stephen@networkplumber.org
> >> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> >>
> >> Hi,
> >>
> >> On 8/25/22 20:30, Chautru, Nicolas wrote:
> >>> Thanks Maxime,
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Sent: Thursday, August 25, 2022 7:19 AM
> >>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >>>> thomas@monjalon.net; gakhil@marvell.com;
> hemant.agrawal@nxp.com
> >>>> Cc: trix@redhat.com; mdr@ashroe.eu; Richardson, Bruce
> >>>> <bruce.richardson@intel.com>; david.marchand@redhat.com;
> >>>> stephen@networkplumber.org
> >>>> Subject: Re: [PATCH v5 2/7] bbdev: add device status info
> >>>>
> >>>>
> >>>>
> >>>> On 7/7/22 01:28, Nicolas Chautru wrote:
> >>>>> Added device status information, so that the PMD can expose
> >>>>> information related to the underlying accelerator device status.
> >>>>> Minor order change in structure to fit into padding hole.
> >>>>>
> >>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>>>> ---
> >>>>>     drivers/baseband/acc100/rte_acc100_pmd.c           |  1 +
> >>>>>     drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
> >>>>>     drivers/baseband/fpga_lte_fec/fpga_lte_fec.c       |  1 +
> >>>>>     drivers/baseband/la12xx/bbdev_la12xx.c             |  1 +
> >>>>>     drivers/baseband/null/bbdev_null.c                 |  1 +
> >>>>>     drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  1 +
> >>>>>     lib/bbdev/rte_bbdev.c                              | 22 ++++++++++++++
> >>>>>     lib/bbdev/rte_bbdev.h                              | 35
> ++++++++++++++++++++--
> >>>>>     lib/bbdev/version.map                              |  6 ++++
> >>>>>     9 files changed, 67 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> index de7e4bc..17ba798 100644
> >>>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>>>> @@ -1060,6 +1060,7 @@
> >>>>>
> >>>>>     	/* Read and save the populated config from ACC100
> registers */
> >>>>>     	fetch_acc100_config(dev);
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	/* This isn't ideal because it reports the maximum number of
> >>>>> queues
> >>>> but
> >>>>>     	 * does not provide info on how many can be
> uplink/downlink
> >>>>> or different diff --git
> >>>>> a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> index 82ae6ba..57b12af 100644
> >>>>> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> >>>>> @@ -369,6 +369,7 @@
> >>>>>     	dev_info->capabilities = bbdev_capabilities;
> >>>>>     	dev_info->cpu_flag_reqs = NULL;
> >>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	/* Calculates number of queues assigned to device */
> >>>>>     	dev_info->max_num_queues = 0; diff --git
> >>>>> a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> index 21d3529..2a330c4 100644
> >>>>> --- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> +++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
> >>>>> @@ -645,6 +645,7 @@ struct __rte_cache_aligned fpga_queue {
> >>>>>     	dev_info->capabilities = bbdev_capabilities;
> >>>>>     	dev_info->cpu_flag_reqs = NULL;
> >>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	/* Calculates number of queues assigned to device */
> >>>>>     	dev_info->max_num_queues = 0; diff --git
> >>>>> a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> index 4d1bd16..c1f88c6 100644
> >>>>> --- a/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> +++ b/drivers/baseband/la12xx/bbdev_la12xx.c
> >>>>> @@ -100,6 +100,7 @@ struct bbdev_la12xx_params {
> >>>>>     	dev_info->capabilities = bbdev_capabilities;
> >>>>>     	dev_info->cpu_flag_reqs = NULL;
> >>>>>     	dev_info->min_alignment = 64;
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data-
> >>>>> dev_id);
> >>>>>     }
> >>>>> diff --git a/drivers/baseband/null/bbdev_null.c
> >>>>> b/drivers/baseband/null/bbdev_null.c
> >>>>> index 248e129..94a1976 100644
> >>>>> --- a/drivers/baseband/null/bbdev_null.c
> >>>>> +++ b/drivers/baseband/null/bbdev_null.c
> >>>>> @@ -82,6 +82,7 @@ struct bbdev_queue {
> >>>>>     	 * here for code completeness.
> >>>>>     	 */
> >>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	rte_bbdev_log_debug("got device info from %u", dev->data-
> >>>>> dev_id);
> >>>>>     }
> >>>>> diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> index af7bc41..dbc5524 100644
> >>>>> --- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> +++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
> >>>>> @@ -254,6 +254,7 @@ struct turbo_sw_queue {
> >>>>>     	dev_info->min_alignment = 64;
> >>>>>     	dev_info->harq_buffer_size = 0;
> >>>>>     	dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> >>>>> +	dev_info->device_status =
> RTE_BBDEV_DEV_NOT_SUPPORTED;
> >>>>>
> >>>>>     	rte_bbdev_log_debug("got device info from %u\n", dev-
> >data-
> >>>>> dev_id);
> >>>>>     }
> >>>>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index
> >>>>> 4da8047..38630a2 100644
> >>>>> --- a/lib/bbdev/rte_bbdev.c
> >>>>> +++ b/lib/bbdev/rte_bbdev.c
> >>>>> @@ -1133,3 +1133,25 @@ struct rte_mempool *
> >>>>>     	rte_bbdev_log(ERR, "Invalid operation type");
> >>>>>     	return NULL;
> >>>>>     }
> >>>>> +
> >>>>> +const char *
> >>>>> +rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
> {
> >>>>> +	static const char * const dev_sta_string[] = {
> >>>>> +		"RTE_BBDEV_DEV_NOSTATUS",
> >>>>> +		"RTE_BBDEV_DEV_NOT_SUPPORTED",
> >>>>> +		"RTE_BBDEV_DEV_RESET",
> >>>>> +		"RTE_BBDEV_DEV_CONFIGURED",
> >>>>> +		"RTE_BBDEV_DEV_ACTIVE",
> >>>>> +		"RTE_BBDEV_DEV_FATAL_ERR",
> >>>>> +		"RTE_BBDEV_DEV_RESTART_REQ",
> >>>>> +		"RTE_BBDEV_DEV_RECONFIG_REQ",
> >>>>> +		"RTE_BBDEV_DEV_CORRECT_ERR",
> >>>>> +	};
> >>>>> +
> >>>>> +	if (status < sizeof(dev_sta_string) / sizeof(char *))
> >>>>> +		return dev_sta_string[status];
> >>>>> +
> >>>>> +	rte_bbdev_log(ERR, "Invalid device status");
> >>>>> +	return NULL;
> >>>>> +}
> >>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>>>> b88c881..9b1ffa4 100644
> >>>>> --- a/lib/bbdev/rte_bbdev.h
> >>>>> +++ b/lib/bbdev/rte_bbdev.h
> >>>>> @@ -223,6 +223,21 @@ struct rte_bbdev_queue_conf {
> >>>>>     int
> >>>>>     rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
> >>>>>
> >>>>> +/**
> >>>>> + * Flags indicate the status of the device  */ enum
> >>>>> +rte_bbdev_device_status {
> >>>>> +	RTE_BBDEV_DEV_NOSTATUS,        /**< Nothing being
> reported */
> >>>>> +	RTE_BBDEV_DEV_NOT_SUPPORTED,   /**< Device status is
> not
> >>>> supported on the PMD */
> >>>>> +	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> >>>> configured state */
> >>>>> +	RTE_BBDEV_DEV_CONFIGURED,      /**< Device is
> configured and
> >>>> ready to use */
> >>>>> +	RTE_BBDEV_DEV_ACTIVE,          /**< Device is configured
> and VF is
> >>>> being used */
> >>>>> +	RTE_BBDEV_DEV_FATAL_ERR,       /**< Device has hit a fatal
> >>>> uncorrectable error */
> >>>>> +	RTE_BBDEV_DEV_RESTART_REQ,     /**< Device requires
> application
> >>>> to restart */
> >>>>> +	RTE_BBDEV_DEV_RECONFIG_REQ,    /**< Device requires
> >>>> application to reconfigure queues */
> >>>>> +	RTE_BBDEV_DEV_CORRECT_ERR,     /**< Warning of a
> correctable
> >>>> error event happened */
> >>>>> +};
> >>>>
> >>>> I don't have a strong opinion on this, but I think NOT_SUPPORTED
> >>>> should be a special value. If you want to keep 0 value for
> >>>> NOSTATUS, maybe you could
> >>>> do:
> >>>>
> >>>> enum rte_bbdev_device_status {
> >>>> 	RTE_BBDEV_DEV_NOT_SUPPORTED = -1,   /**< Device status is not
> >>>> supported
> >>>> on the PMD */
> >>>> 	RTE_BBDEV_DEV_NOSTATUS = 0,        /**< Nothing being reported
> >>>> */
> >>>> 	RTE_BBDEV_DEV_RESET,           /**< Device in reset and un-
> >>>> configured
> >>>> state */
> >>>> ...
> >>>
> >>> Thanks Maxime. My concern is that I am upstreaming in parallel in
> >> pf_bb_config in parallel hence would like to keep it unchanged if possible.
> >>> Given you don’t have a strong opinion is that okay to keep as is? Or
> >>> I can
> >> force special value 1 for NOT_SUPPORTED so that this is explicitly
> >> defined. But really enum should always be used.
> >>
> >> I don't understand. It should not have any impact on pf_bb_config,
> >> given pf_bb_config does not use DPDK.
> >>
> >> Maxime
> >
> > That device status is being shared from pf_bb_config to the bbdev PMD
> through PF2VF communications, hence they share that same enum.
> >
> 
> Ok, but generic DPDK ABI should not be dependent on a vendor internal
> implementation IMHO.
> 

I agree. This is the opposite direction, pf_bb_config is reusing the same API enumeration for now the assumption is that enum is being applied. 
In case we change here (for a fairly cosmetic reason) that would cause to change as well on the other ingredient, which is doable but has overhead.
If you really believe that there is a strong reason to do such a change let me know now, we would need to impact pf_bb_config release to have similar change and match that API change which is possible but not ideal. 

Thanks
Nic




^ permalink raw reply	[flat|