* Re: [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD
2020-08-03 17:55 13% ` [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD McDaniel, Timothy
2020-08-04 7:38 4% ` Jerin Jacob
@ 2020-08-04 13:46 4% ` Van Haaren, Harry
1 sibling, 0 replies; 200+ results
From: Van Haaren, Harry @ 2020-08-04 13:46 UTC (permalink / raw)
To: McDaniel, Timothy; +Cc: mattias.ronnblom, dev, Eads, Gage, jerinj
> -----Original Message-----
> From: McDaniel, Timothy <timothy.mcdaniel@intel.com>
> Sent: Monday, August 3, 2020 6:56 PM
> To: jerinj@marvell.com
> Cc: mattias.ronnblom@ericsson.com; dev@dpdk.org; Eads, Gage
> <gage.eads@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>;
> McDaniel, Timothy <timothy.mcdaniel@intel.com>
> Subject: [PATCH] doc: eventdev ABI change to support DLB PMD
>
> From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
>
> The ABI changes associated with this notification will better support
> devices that:
> 1. Have limits on the number of queues that may be linked to a port
> 2. Have ports that are limited to exactly one linked queue
> 3. Are not able to transparently transfer the event flow_id field
>
> Signed-off-by: McDaniel Timothy
> <timothy.mcdaniel@intel.com>
Nitpick: git warns about an extra blank line added at the end of the file.
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
<snip>
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] doc: announce change in IPv6 item struct
2020-08-03 19:51 9% [dpdk-dev] [PATCH] doc: announce change in IPv6 item struct Dekel Peled
@ 2020-08-04 13:17 0% ` Dekel Peled
0 siblings, 0 replies; 200+ results
From: Dekel Peled @ 2020-08-04 13:17 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, stephen, arybchenko, ajit.khaparde, maxime.coquelin,
olivier.matz, david.marchand, ferruh.yigit, Asaf Penso
Kind reminder to all maintainers, please review and ack/comment.
> -----Original Message-----
> From: Dekel Peled <dekelp@mellanox.com>
> Sent: Monday, August 3, 2020 10:51 PM
> To: dev@dpdk.org
> Cc: jerinjacobk@gmail.com; stephen@networkplumber.org;
> arybchenko@solarflare.com; ajit.khaparde@broadcom.com;
> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> david.marchand@redhat.com; ferruh.yigit@intel.com
> Subject: [PATCH] doc: announce change in IPv6 item struct
>
> Struct rte_flow_item_ipv6 will be modified to include additional values,
> indicating existence or absence of IPv6 extension headers following the IPv6
> header, as proposed in RFC https://mails.dpdk.org/archives/dev/2020-
> August/177257.html.
> Because of ABI break this change is proposed for 20.11.
>
> Signed-off-by: Dekel Peled <dekelp@mellanox.com>
> ---
> doc/guides/rel_notes/deprecation.rst | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst
> b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7..5201142 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -110,6 +110,11 @@ Deprecation Notices
> break the ABI checks, that is why change is planned for 20.11.
> The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
>
> +* ethdev: The ``struct rte_flow_item_ipv6`` struct will be modified to
> +include
> + additional values, indicating existence or absence of IPv6 extension
> +headers
> + following the IPv6 header, as proposed in RFC
> + https://mails.dpdk.org/archives/dev/2020-August/177257.html.
> +
> * traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly
> made
> ABI stable in the v19.11 release. The TM maintainer and other contributors
> have
> agreed to keep the TM APIs as experimental in expectation of additional
> spec
> --
> 1.8.3.1
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] doc: add reserve fields to eventdev public structures
2020-08-04 10:41 4% ` Bruce Richardson
@ 2020-08-04 11:37 0% ` Jerin Jacob
0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2020-08-04 11:37 UTC (permalink / raw)
To: Bruce Richardson
Cc: Pavan Nikhilesh, Jerin Jacob, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, dpdk-dev, Thomas Monjalon,
David Marchand
On Tue, Aug 4, 2020 at 4:12 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Aug 03, 2020 at 12:59:03PM +0530, pbhagavatula@marvell.com wrote:
> > From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >
> > Add 64 byte padding at the end of event device public structure to allow
> > future extensions.
> >
> > Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> > Acked-by: Jerin Jacob <jerinj@marvell.com>
> > ---
> > v2 Changes:
> > - Modify commit title.
> > - Add patch reference to doc.
> >
> > doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index ea4cfa7a4..ec5db68e9 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -151,3 +151,14 @@ Deprecation Notices
> > Python 2 support will be completely removed in 20.11.
> > In 20.08, explicit deprecation warnings will be displayed when running
> > scripts with Python 2.
> > +
> > +* eventdev: A 64 byte padding is added at the end of the following structures
> > + in event device library to support future extensions:
> > + ``rte_event_crypto_adapter_conf``, ``rte_event_eth_rx_adapter_conf``,
> > + ``rte_event_eth_rx_adapter_queue_conf``, ``rte_event_eth_tx_adapter_conf``,
> > + ``rte_event_timer_adapter_conf``, ``rte_event_timer_adapter_info``,
> > + ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_queue_conf``,
> > + ``rte_event_port_conf``, ``rte_event_timer_adapter``,
> > + ``rte_event_timer_adapter_data``.
> > + Reference:
> > + http://patches.dpdk.org/project/dpdk/list/?series=10728&archive=both&state=*
> > --
>
> I don't like this idea of adding lots of padding to the ends of these
> structures. For some structures, such as the public arrays for devices it
> may be necessary, but for all the conf structures passed as parameters to
> functions I think we can do better. Since these structures are passed by
> the user to various functions, function versioning can be used to ensure
> that the correct function in eventdev is always called. From there to the
> individual PMDs, we can implement ABI compatibility by either:
> 1. including the length of the struct as a parameter to the driver. (This is
> a bit similar to my proposal for rawdev [1])
> 2. including the ABI version as a parameter to the driver.
But will the above solution work if the application depends on the struct
size? A change in the size of s1 will change the offset of s3
(app_specific_struct_s3), right? That is, a DPDK version update should not
change the offset of s3. For example:

struct app_struct {
	struct dpdk_public_struct_s1 s1;
	struct dpdk_public_struct_s2 s2;
	struct app_specific_struct_s3 s3;
};
>
> Regards
> /Bruce
>
> [1] http://inbox.dpdk.org/dev/?q=enhance+rawdev+APIs
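The offset concern in the example above can be shown with a short sketch
(the struct names and field layouts here are hypothetical stand-ins, not the
real DPDK definitions): growing an embedded public struct by 64 reserved
bytes shifts every application member that follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" and "after" versions of a public conf struct:
 * v2 grows by 64 bytes of reserved padding at the end. */
struct dpdk_conf_v1 { uint32_t nb_queues; uint32_t nb_ports; };
struct dpdk_conf_v2 { uint32_t nb_queues; uint32_t nb_ports; uint8_t rsvd[64]; };

/* Application structs embedding the conf by value, as in the email above. */
struct app_struct_v1 { struct dpdk_conf_v1 s1; uint32_t s3; };
struct app_struct_v2 { struct dpdk_conf_v2 s1; uint32_t s3; };

/* The offset of the app-specific member s3 shifts when s1 grows, which
 * is why growing a public struct breaks apps that embed it by value. */
size_t s3_offset_v1(void) { return offsetof(struct app_struct_v1, s3); }
size_t s3_offset_v2(void) { return offsetof(struct app_struct_v2, s3); }
```

This is exactly the case function versioning alone cannot fix: the layout
change is visible in the application's own binary, so the application must
be rebuilt regardless of which library symbol it calls.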
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] doc: add reserve fields to eventdev public structures
@ 2020-08-04 10:41 4% ` Bruce Richardson
2020-08-04 11:37 0% ` Jerin Jacob
0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-08-04 10:41 UTC (permalink / raw)
To: pbhagavatula
Cc: jerinj, Ray Kinsella, Neil Horman, John McNamara,
Marko Kovacevic, dev, thomas, david.marchand
On Mon, Aug 03, 2020 at 12:59:03PM +0530, pbhagavatula@marvell.com wrote:
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Add 64 byte padding at the end of event device public structure to allow
> future extensions.
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---
> v2 Changes:
> - Modify commit title.
> - Add patch reference to doc.
>
> doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index ea4cfa7a4..ec5db68e9 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -151,3 +151,14 @@ Deprecation Notices
> Python 2 support will be completely removed in 20.11.
> In 20.08, explicit deprecation warnings will be displayed when running
> scripts with Python 2.
> +
> +* eventdev: A 64 byte padding is added at the end of the following structures
> + in event device library to support future extensions:
> + ``rte_event_crypto_adapter_conf``, ``rte_event_eth_rx_adapter_conf``,
> + ``rte_event_eth_rx_adapter_queue_conf``, ``rte_event_eth_tx_adapter_conf``,
> + ``rte_event_timer_adapter_conf``, ``rte_event_timer_adapter_info``,
> + ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_queue_conf``,
> + ``rte_event_port_conf``, ``rte_event_timer_adapter``,
> + ``rte_event_timer_adapter_data``.
> + Reference:
> + http://patches.dpdk.org/project/dpdk/list/?series=10728&archive=both&state=*
> --
I don't like this idea of adding lots of padding to the ends of these
structures. For some structures, such as the public arrays for devices it
may be necessary, but for all the conf structures passed as parameters to
functions I think we can do better. Since these structures are passed by
the user to various functions, function versioning can be used to ensure
that the correct function in eventdev is always called. From there to the
individual PMDs, we can implement ABI compatibility by either:
1. including the length of the struct as a parameter to the driver. (This is
a bit similar to my proposal for rawdev [1])
2. including the ABI version as a parameter to the driver.
Regards
/Bruce
[1] http://inbox.dpdk.org/dev/?q=enhance+rawdev+APIs
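Option 1 above (passing the struct length down to the driver, rawdev-style)
might look like the following sketch. The struct layouts and the
`drv_port_setup` handler are hypothetical illustrations, not actual eventdev
or PMD code: the driver zero-fills the new fields an older caller does not
provide, and truncates if the caller is newer than the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical old and new layouts of a conf struct passed to a driver. */
struct port_conf_v1 { uint32_t new_event_threshold; };
struct port_conf_v2 { uint32_t new_event_threshold; uint32_t event_port_cfg; };

/* Driver-side handler: takes the size the caller compiled against, so new
 * fields absent from an older caller's struct default to zero. */
void
drv_port_setup(const void *conf, size_t conf_size, struct port_conf_v2 *out)
{
	memset(out, 0, sizeof(*out));     /* fields the caller lacks -> 0 */
	if (conf_size > sizeof(*out))
		conf_size = sizeof(*out); /* caller newer than this driver */
	memcpy(out, conf, conf_size);
}
```

A versioned library entry point would pass `sizeof(struct port_conf_v1)` on
behalf of old binaries and `sizeof(struct port_conf_v2)` for new ones, so a
single driver function serves both ABIs.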
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD
2020-08-03 17:55 13% ` [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD McDaniel, Timothy
@ 2020-08-04 7:38 4% ` Jerin Jacob
2020-08-04 13:46 4% ` Van Haaren, Harry
1 sibling, 0 replies; 200+ results
From: Jerin Jacob @ 2020-08-04 7:38 UTC (permalink / raw)
To: McDaniel, Timothy
Cc: Jerin Jacob, Mattias Rönnblom, dpdk-dev, Gage Eads,
Van Haaren, Harry
On Mon, Aug 3, 2020 at 11:28 PM McDaniel, Timothy
<timothy.mcdaniel@intel.com> wrote:
>
> From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
There is still "," in the name.
>
> The ABI changes associated with this notification will better support
> devices that:
> 1. Have limits on the number of queues that may be linked to a port
> 2. Have ports that are limited to exactly one linked queue
> 3. Are not able to transparently transfer the event flow_id field
>
> Signed-off-by: McDaniel Timothy
> <timothy.mcdaniel@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
> ---
> doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 99c9806..bfe6661 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -148,3 +148,14 @@ Deprecation Notices
> Python 2 support will be completely removed in 20.11.
> In 20.08, explicit deprecation warnings will be displayed when running
> scripts with Python 2.
> +
> +* eventdev: ABI changes to support DLB PMD and future extensions:
> + ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_port_conf`` will
> + be modified to support DLB PMD and future extensions in the eventdev library.
> + Patches containing justification, documentation, and proposed modifications
> + can be found at:
> +
> + - https://patches.dpdk.org/patch/71457/
> + - https://patches.dpdk.org/patch/71456/
> +
> +
> --
> 1.7.10
>
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH] doc: announce change in IPv6 item struct
@ 2020-08-03 19:51 9% Dekel Peled
2020-08-04 13:17 0% ` Dekel Peled
0 siblings, 1 reply; 200+ results
From: Dekel Peled @ 2020-08-03 19:51 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, stephen, arybchenko, ajit.khaparde, maxime.coquelin,
olivier.matz, david.marchand, ferruh.yigit
Struct rte_flow_item_ipv6 will be modified to include additional
values, indicating existence or absence of IPv6 extension headers
following the IPv6 header, as proposed in RFC
https://mails.dpdk.org/archives/dev/2020-August/177257.html.
Because of ABI break this change is proposed for 20.11.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
doc/guides/rel_notes/deprecation.rst | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ea4cfa7..5201142 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -110,6 +110,11 @@ Deprecation Notices
break the ABI checks, that is why change is planned for 20.11.
The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
+* ethdev: The ``struct rte_flow_item_ipv6`` struct will be modified to include
+ additional values, indicating existence or absence of IPv6 extension headers
+ following the IPv6 header, as proposed in RFC
+ https://mails.dpdk.org/archives/dev/2020-August/177257.html.
+
* traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
ABI stable in the v19.11 release. The TM maintainer and other contributors have
agreed to keep the TM APIs as experimental in expectation of additional spec
--
1.8.3.1
^ permalink raw reply [relevance 9%]
* [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD
2020-08-03 6:09 4% ` Jerin Jacob
@ 2020-08-03 17:55 13% ` McDaniel, Timothy
2020-08-04 7:38 4% ` Jerin Jacob
2020-08-04 13:46 4% ` Van Haaren, Harry
0 siblings, 2 replies; 200+ results
From: McDaniel, Timothy @ 2020-08-03 17:55 UTC (permalink / raw)
To: jerinj
Cc: mattias.ronnblom, dev, gage.eads, harry.van.haaren, McDaniel, Timothy
From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
The ABI changes associated with this notification will better support
devices that:
1. Have limits on the number of queues that may be linked to a port
2. Have ports that are limited to exactly one linked queue
3. Are not able to transparently transfer the event flow_id field
Signed-off-by: McDaniel Timothy
<timothy.mcdaniel@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 99c9806..bfe6661 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -148,3 +148,14 @@ Deprecation Notices
Python 2 support will be completely removed in 20.11.
In 20.08, explicit deprecation warnings will be displayed when running
scripts with Python 2.
+
+* eventdev: ABI changes to support DLB PMD and future extensions:
+ ``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_port_conf`` will
+ be modified to support DLB PMD and future extensions in the eventdev library.
+ Patches containing justification, documentation, and proposed modifications
+ can be found at:
+
+ - https://patches.dpdk.org/patch/71457/
+ - https://patches.dpdk.org/patch/71456/
+
+
--
1.7.10
^ permalink raw reply [relevance 13%]
* [dpdk-dev] [RFC v3] ethdev: add extensions attributes to IPv6 item
2020-08-03 17:01 3% ` [dpdk-dev] [RFC v2] ethdev: add extensions attributes " Dekel Peled
@ 2020-08-03 17:11 3% ` Dekel Peled
0 siblings, 0 replies; 200+ results
From: Dekel Peled @ 2020-08-03 17:11 UTC (permalink / raw)
To: ferruh.yigit, arybchenko, orika, john.mcnamara, marko.kovacevic
Cc: asafp, matan, elibr, dev
Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.
Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.
For example, fragmented IPv6 packets contain a dedicated extension header,
as detailed in RFC [1], which is not yet supported in rte_flow.
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.
Proposed update:
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.
This update changes ABI, and is proposed for the 20.11 LTS version.
[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html
---
v3: Fix checkpatch comments.
v2: Update from fragment attribute to all extensions attributes.
---
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
lib/librte_ethdev/rte_flow.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..246918e 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -792,11 +792,33 @@ struct rte_flow_item_ipv4 {
*
* Matches an IPv6 header.
*
+ * Dedicated flags indicate existence of specific extension headers.
+ *
* Note: IPv6 options are handled by dedicated pattern items, see
* RTE_FLOW_ITEM_TYPE_IPV6_EXT.
*/
struct rte_flow_item_ipv6 {
struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
+ uint64_t hop_ext_exist:1;
+ /**< Hop-by-Hop Options extension header exists. */
+ uint64_t rout_ext_exist:1;
+ /**< Routing extension header exists. */
+ uint64_t frag_ext_exist:1;
+ /**< Fragment extension header exists. */
+ uint64_t auth_ext_exist:1;
+ /**< Authentication extension header exists. */
+ uint64_t esp_ext_exist:1;
+ /**< Encapsulation Security Payload extension header exists. */
+ uint64_t dest_ext_exist:1;
+ /**< Destination Options extension header exists. */
+ uint64_t mobil_ext_exist:1;
+ /**< Mobility extension header exists. */
+ uint64_t hip_ext_exist:1;
+ /**< Host Identity Protocol extension header exists. */
+ uint64_t shim6_ext_exist:1;
+ /**< Shim6 Protocol extension header exists. */
+ uint64_t reserved:55;
+ /**< Reserved for future extension headers, must be zero. */
};
/** Default mask for RTE_FLOW_ITEM_TYPE_IPV6. */
--
1.8.3.1
^ permalink raw reply [relevance 3%]
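The spec/mask semantics the proposal relies on can be sketched as follows.
This is a simplified stand-in, not the real rte_flow code: the struct below
keeps only the proposed fragment bit (the real item also embeds
``struct rte_ipv6_hdr`` and the other extension bits), and the match helper
mimics how a PMD would apply the mask. To match only non-fragmented packets,
the mask selects ``frag_ext_exist`` and the spec leaves it zero.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the proposed rte_flow_item_ipv6 additions. */
struct ipv6_item {
	uint64_t frag_ext_exist:1; /**< Fragment extension header exists. */
	uint64_t reserved:63;      /**< Must be zero. */
};

/* rte_flow matching is "packet AND mask == spec AND mask": only bits set
 * in the mask participate in the comparison. */
int
ipv6_item_match(const struct ipv6_item *pkt, const struct ipv6_item *spec,
		const struct ipv6_item *mask)
{
	return (pkt->frag_ext_exist & mask->frag_ext_exist) ==
	       (spec->frag_ext_exist & mask->frag_ext_exist);
}
```

With ``spec.frag_ext_exist = 1`` instead, the same mask would match only
fragmented packets — the symmetric case the cover letter describes.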
* [dpdk-dev] [RFC v2] ethdev: add extensions attributes to IPv6 item
@ 2020-08-03 17:01 3% ` Dekel Peled
2020-08-03 17:11 3% ` [dpdk-dev] [RFC v3] " Dekel Peled
1 sibling, 1 reply; 200+ results
From: Dekel Peled @ 2020-08-03 17:01 UTC (permalink / raw)
To: ferruh.yigit, arybchenko, orika, john.mcnamara, marko.kovacevic
Cc: asafp, matan, elibr, dev
Using the current implementation of DPDK, an application cannot match on
IPv6 packets, based on the existing extension headers, in a simple way.
Field 'Next Header' in IPv6 header indicates type of the first extension
header only. Following extension headers can't be identified by
inspecting the IPv6 header.
As a result, the existence or absence of specific extension headers
can't be used for packet matching.
For example, fragmented IPv6 packets contain a dedicated extension header,
as detailed in RFC [1], which is not yet supported in rte_flow.
Non-fragmented packets don't contain the fragment extension header.
For an application to match on non-fragmented IPv6 packets, the current
implementation doesn't provide a suitable solution.
Matching on the Next Header field is not sufficient, since additional
extension headers might be present in the same packet.
To match on fragmented IPv6 packets, the same difficulty exists.
Proposed update:
A set of additional values will be added to IPv6 header struct.
These values will indicate the existence of every defined extension
header type, providing simple means for identification of existing
extensions in the packet header.
Continuing the above example, fragmented packets can be identified using
the specific value indicating existence of fragment extension header.
This update changes ABI, and is proposed for the 20.11 LTS version.
[1] http://mails.dpdk.org/archives/dev/2020-March/160255.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
---
lib/librte_ethdev/rte_flow.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index da8bfa5..8d2073d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -792,11 +792,33 @@ struct rte_flow_item_ipv4 {
*
* Matches an IPv6 header.
*
+ * Dedicated flags indicate existence of specific extension headers.
+ *
* Note: IPv6 options are handled by dedicated pattern items, see
* RTE_FLOW_ITEM_TYPE_IPV6_EXT.
*/
struct rte_flow_item_ipv6 {
struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
+ uint64_t hop_ext_exist:1;
+ /**< Hop-by-Hop Options extension header exists. */
+ uint64_t rout_ext_exist:1;
+ /**< Routing extension header exists. */
+ uint64_t frag_ext_exist:1;
+ /**< Fragment extension header exists. */
+ uint64_t auth_ext_exist:1;
+ /**< Authentication extension header exists. */
+ uint64_t esp_ext_exist:1;
+ /**< Encapsulation Security Payload extension header exists. */
+ uint64_t dest_ext_exist:1;
+ /**< Destination Options extension header exists. */
+ uint64_t mobil_ext_exist:1;
+ /**< Mobility extension header exists. */
+ uint64_t hip_ext_exist:1;
+ /**< Host Identity Protocol extension header exists. */
+ uint64_t shim6_ext_exist:1;
+ /**< Shim6 Protocol extension header exists. */
+ uint64_t reserved:55;
+ /**< Reserved for future extension headers, must be zero. */
};
/** Default mask for RTE_FLOW_ITEM_TYPE_IPV6. */
--
1.8.3.1
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] The mbuf API needs some cleaning up
2020-07-31 15:24 0% ` Olivier Matz
@ 2020-08-03 8:42 0% ` Morten Brørup
0 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2020-08-03 8:42 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev, Thomas Monjalon
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Olivier Matz
> Sent: Friday, July 31, 2020 5:25 PM
>
> Hi Morten,
>
> Thanks for the feedback.
>
> On Mon, Jul 13, 2020 at 11:57:38AM +0200, Morten Brørup wrote:
>
> > The MBUF library exposes some macros and constants without the RTE_
> prefix. I
> > propose cleaning up these, so better names get into the coming LTS
> release.
>
> Yes, Thomas talked about it some time ago and he even drafted a patch
> to
> fix it. We can target 20.11 for the changes, but I think we'll have to
> keep a compat API until 21.11.
>
Great, then I will back off. No need for multiple patches fixing the same things. :-)
And I agree with all your feedback... although I consider the mbuf port_id so central to DPDK that I suggested RTE_PORT_INVALID over RTE_MBUF_PORT_INVALID. I don't feel strongly about it, though; whatever you and Thomas prefer is probably fine.
> > The worst is:
> > #define MBUF_INVALID_PORT UINT16_MAX
> >
> > I say it's the worst because when we were looking for the official
> "invalid"
> > port value for our application, we didn't find this one. (Probably
> because its
> > documentation is wrong.)
> >
> > MBUF_INVALID_PORT is defined in rte_mbuf_core.h without any
> description, and
> > in rte_mbuf.h, where it is injected between the rte_pktmbuf_reset()
> function
> > and its description, so the API documentation shows the function's
> description
> > for the constant, and no description for the function.
>
> The one in rte_mbuf_core.h should be kept, with a documentation.
>
> > I propose keeping it at a sensible location in rte_mbuf_core.h only,
> adding a description, and renaming it to:
> > #define RTE_PORT_INVALID UINT16_MAX
>
> I suggest RTE_MBUF_PORT_INVALID
>
> > For backwards compatibility, we could add:
> > /* this old name is deprecated */
> > #define MBUF_INVALID_PORT RTE_PORT_INVALID
> >
> > I also wonder why there are no compiler warnings about the double
> definition?
>
> If the value is the same, the compiler won't complain.
>
> > There are also the data buffer location constants:
> > #define EXT_ATTACHED_MBUF (1ULL << 61)
> > and
> > #define IND_ATTACHED_MBUF (1ULL << 62)
> >
> >
> > There are already macros (with good names) for reading these, so
> > simply adding the RTE_ prefix to these two constants suffices.
>
> Some applications use it, we also need a compat here.
>
> > And all the packet offload flags, such as:
> > #define PKT_RX_VLAN (1ULL << 0)
> >
> >
> > They are supposed to be used by applications, so I guess we should
> > keep them unchanged for ABI stability reasons.
>
> I propose RTE_MBUF_F_<name> for the mbuf flags.
>
> > And the local macro:
> > #define MBUF_RAW_ALLOC_CHECK(m) do { \
> >
> > This might as well be an internal inline function:
> > /* internal */
> > static inline void
> > __rte_mbuf_raw_alloc_check(const struct rte_mbuf *m)
> >
>
> agree, I don't think a macro is mandatory here
>
>
> Thanks,
> Olivier
>
>
> > Or we could keep it a macro and move it next to
> > __rte_mbuf_sanity_check(), keeping it clear that it is only relevant
> when
> > RTE_LIBRTE_MBUF_DEBUG is set. But rename it to lower case, similar to
> the
> > __rte_mbuf_sanity_check() macro.
> >
> >
> > Med venlig hilsen / kind regards
> > - Morten Brørup
> >
^ permalink raw reply [relevance 0%]
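The macro-to-inline-function conversion discussed in this thread could look
like the sketch below. The mbuf struct here is a heavily simplified stand-in
(the real ``struct rte_mbuf`` has many more fields, and the real
``MBUF_RAW_ALLOC_CHECK()`` only performs its checks when
``RTE_LIBRTE_MBUF_DEBUG`` is set); the point is the shape of the change,
not the exact checks.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for struct rte_mbuf. */
struct mbuf {
	uint16_t refcnt;
	uint8_t nb_segs;
	struct mbuf *next;
};

/* Inline replacement for a raw-alloc-check macro: same sanity checks, but
 * type-checked, debugger-friendly, and unit-testable. Returns 0 when the
 * mbuf is in the expected freshly-allocated state, -1 otherwise. */
static inline int
mbuf_raw_alloc_check(const struct mbuf *m)
{
	if (m->refcnt != 1 || m->next != NULL || m->nb_segs != 1)
		return -1;
	return 0;
}
```

Unlike the macro, the inline function gives the compiler a real prototype to
check callers against, which is part of why the thread leans toward it.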
* Re: [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures
2020-07-31 19:31 5% ` McDaniel, Timothy
@ 2020-08-03 6:09 4% ` Jerin Jacob
2020-08-03 17:55 13% ` [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD McDaniel, Timothy
0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2020-08-03 6:09 UTC (permalink / raw)
To: McDaniel, Timothy
Cc: Jerin Jacob, Mattias Rönnblom, dpdk-dev, Gage Eads,
Van Haaren, Harry
On Sat, Aug 1, 2020 at 1:04 AM McDaniel, Timothy
<timothy.mcdaniel@intel.com> wrote:
>
> From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
The patch should have some description. Also, please change the subject to:
"doc: eventdev ABI change to support DLB PMD"
>
> Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
We don't use "," in the Signed-off-by.
Please change to ``Signed-off-by: McDaniel Timothy
<timothy.mcdaniel@intel.com>``
> ---
> doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 99c9806..b9682a7 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -148,3 +148,14 @@ Deprecation Notices
> Python 2 support will be completely removed in 20.11.
> In 20.08, explicit deprecation warnings will be displayed when running
> scripts with Python 2.
> +
> +* eventdev: ABI change to support DLB PMD and future extensions
> + The following structures and will be modified to support to DLB PMD and future
> + extension in the eventdev library.
> + - ``rte_event_dev_info``
> + - ``rte_event_dev_config``
> + - ``rte_event_port_conf``
> + Patches containing justification, documentation, and proposed modifications
> + can be found at:
> + - https://patches.dpdk.org/patch/71457/
> + - https://patches.dpdk.org/patch/71456/
The HTML rendering of the above text is not proper.
Please run "make doc-guides-html" to check generated HTML output.
You could use the below text as an example for sphinx syntax.
* eventdev: ABI change to support DLB PMD and future extensions:
``rte_event_dev_info``, ``rte_event_dev_config``, ``rte_event_port_conf`` will
be modified to support to DLB PMD and future extension in the
eventdev library.
Patches containing justification, documentation, and proposed modifications
can be found at:
- https://patches.dpdk.org/patch/71457/
- https://patches.dpdk.org/patch/71456/
With the above changes:
Acked-by: Jerin Jacob <jerinj@marvell.com>
> --
> 1.7.10
>
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures
2020-07-31 18:51 5% ` McDaniel, Timothy
@ 2020-07-31 19:31 5% ` McDaniel, Timothy
2020-08-03 6:09 4% ` Jerin Jacob
2 siblings, 1 reply; 200+ results
From: McDaniel, Timothy @ 2020-07-31 19:31 UTC (permalink / raw)
To: jerinj
Cc: mattias.ronnblom, dev, gage.eads, harry.van.haaren, McDaniel, Timothy
From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 99c9806..b9682a7 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -148,3 +148,14 @@ Deprecation Notices
Python 2 support will be completely removed in 20.11.
In 20.08, explicit deprecation warnings will be displayed when running
scripts with Python 2.
+
+* eventdev: ABI change to support DLB PMD and future extensions
+ The following structures and will be modified to support to DLB PMD and future
+ extension in the eventdev library.
+ - ``rte_event_dev_info``
+ - ``rte_event_dev_config``
+ - ``rte_event_port_conf``
+ Patches containing justification, documentation, and proposed modifications
+ can be found at:
+ - https://patches.dpdk.org/patch/71457/
+ - https://patches.dpdk.org/patch/71456/
--
1.7.10
^ permalink raw reply [relevance 5%]
* Re: [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures
2020-07-31 18:51 5% ` McDaniel, Timothy
@ 2020-07-31 19:03 0% ` Jerin Jacob
0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2020-07-31 19:03 UTC (permalink / raw)
To: McDaniel, Timothy
Cc: Jerin Jacob, Mattias Rönnblom, dpdk-dev, Gage Eads,
Van Haaren, Harry
On Sat, Aug 1, 2020 at 12:24 AM McDaniel, Timothy
<timothy.mcdaniel@intel.com> wrote:
>
> From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
>
> Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
> ---
> doc/guides/rel_notes/deprecation.rst | 37 +++++++++-------------------------
> 1 file changed, 10 insertions(+), 27 deletions(-)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index ecb1bc4..4809643 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -149,30 +149,13 @@ Deprecation Notices
> In 20.08, explicit deprecation warnings will be displayed when running
> scripts with Python 2.
>
> -* eventdev: Three public data structures will be updated in 20.11;
> - ``rte_event_dev_info``, ``rte_event_dev_config``, and
> - ``rte_event_port_conf``.
> - Two new members will be added to the ``rte_event_dev_info`` struct.
> - The first, max_event_port_links, will be a uint8_t, and represents the
> - maximum number of queues that can be linked to a single event port by
> - this device. The second, max_single_link_event_port_queue_pairs, will be a
> - uint8_t, and represents the maximum number of event ports and queues that
> - are optimized for (and only capable of) single-link configurations
> - supported by this device. These ports and queues are not accounted for in
> - max_event_ports or max_event_queues.
> - One new member will be added to the ``rte_event_dev_config`` struct. The
> - nb_single_link_event_port_queues member will be a uint8_t, and will
> - represent the number of event ports and queues that will be singly-linked
> - to each other. These are a subset of the overall event ports and queues.
> - This value cannot exceed nb_event_ports or nb_event_queues. If the
> - device has ports and queues that are optimized for single-link usage, this
> - field is a hint for how many to allocate; otherwise, regular event ports and
> - queues can be used.
> - Finally, the ``rte_event_port_conf`` struct will be
> - modified as follows. The uint8_t implicit_release_disabled field
> - will be replaced by a uint32_t event_port_cfg field. The new field will
> - initially have two bits assigned. RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL
> - will have the same meaning as implicit_release_disabled. The second bit,
> - RTE_EVENT_PORT_CFG_SINGLE_LINK will be set if the event port links only
> - to a single event queue.
Remove this section. It is a duplicate of the below section. One
deprecation notice is enough.
> -
> +* eventdev: ABI change to support DLB PMD and future extensions
> + The following structures will be modified to support the DLB PMD and future
> + extensions in the eventdev library.
> + - ``rte_event_dev_info``
> + - ``rte_event_dev_config``
> + - ``rte_event_port_conf``
> + Patches containing justification, documentation, and proposed modifications
> + can be found at
> + - https://patches.dpdk.org/patch/71457/
> + - https://patches.dpdk.org/patch/71456/
> --
> 1.7.10
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures
@ 2020-07-31 18:51 5% ` McDaniel, Timothy
2020-07-31 19:03 0% ` Jerin Jacob
2020-07-31 19:31 5% ` McDaniel, Timothy
2 siblings, 1 reply; 200+ results
From: McDaniel, Timothy @ 2020-07-31 18:51 UTC (permalink / raw)
To: jerinj
Cc: mattias.ronnblom, dev, gage.eads, harry.van.haaren, McDaniel, Timothy
From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 37 +++++++++-------------------------
1 file changed, 10 insertions(+), 27 deletions(-)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ecb1bc4..4809643 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -149,30 +149,13 @@ Deprecation Notices
In 20.08, explicit deprecation warnings will be displayed when running
scripts with Python 2.
-* eventdev: Three public data structures will be updated in 20.11;
- ``rte_event_dev_info``, ``rte_event_dev_config``, and
- ``rte_event_port_conf``.
- Two new members will be added to the ``rte_event_dev_info`` struct.
- The first, max_event_port_links, will be a uint8_t, and represents the
- maximum number of queues that can be linked to a single event port by
- this device. The second, max_single_link_event_port_queue_pairs, will be a
- uint8_t, and represents the maximum number of event ports and queues that
- are optimized for (and only capable of) single-link configurations
- supported by this device. These ports and queues are not accounted for in
- max_event_ports or max_event_queues.
- One new member will be added to the ``rte_event_dev_config`` struct. The
- nb_single_link_event_port_queues member will be a uint8_t, and will
- represent the number of event ports and queues that will be singly-linked
- to each other. These are a subset of the overall event ports and queues.
- This value cannot exceed nb_event_ports or nb_event_queues. If the
- device has ports and queues that are optimized for single-link usage, this
- field is a hint for how many to allocate; otherwise, regular event ports and
- queues can be used.
- Finally, the ``rte_event_port_conf`` struct will be
- modified as follows. The uint8_t implicit_release_disabled field
- will be replaced by a uint32_t event_port_cfg field. The new field will
- initially have two bits assigned. RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL
- will have the same meaning as implicit_release_disabled. The second bit,
- RTE_EVENT_PORT_CFG_SINGLE_LINK will be set if the event port links only
- to a single event queue.
-
+* eventdev: ABI change to support DLB PMD and future extensions
+ The following structures will be modified to support the DLB PMD and future
+ extensions in the eventdev library.
+ - ``rte_event_dev_info``
+ - ``rte_event_dev_config``
+ - ``rte_event_port_conf``
+ Patches containing justification, documentation, and proposed modifications
+ can be found at
+ - https://patches.dpdk.org/patch/71457/
+ - https://patches.dpdk.org/patch/71456/
--
1.7.10
^ permalink raw reply [relevance 5%]
* Re: [dpdk-dev] [PATCH v3] cmdline: increase maximum line length
2020-07-31 12:55 0% ` Olivier Matz
2020-07-31 13:00 0% ` David Marchand
@ 2020-07-31 15:46 0% ` Stephen Hemminger
1 sibling, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-31 15:46 UTC (permalink / raw)
To: Olivier Matz
Cc: David Marchand, Wisam Jaddo, dev, Raslan, Thomas Monjalon,
Iremonger, Bernard, dpdk stable
On Fri, 31 Jul 2020 14:55:16 +0200
Olivier Matz <olivier.matz@6wind.com> wrote:
> Hi,
>
> Resurrecting this old thread.
>
> On Sat, Feb 22, 2020 at 04:28:15PM +0100, David Marchand wrote:
> > This patch is flagged as an ABI breakage:
> > https://travis-ci.com/ovsrobot/dpdk/jobs/289313318#L2273
> >
>
> In case we want this fix for 20.11, should we do a deprecation notice
> in 20.08?
>
>
> Olivier
>
>
> >
> > On Thu, Feb 20, 2020 at 3:53 PM Wisam Jaddo <wisamm@mellanox.com> wrote:
> > >
> > > This increase is due to the use of cmdline in DPDK applications
> > > for config commands, such as the rte_flow rule creation testpmd does.
> > >
> > > The current buffer size is not enough for
> > > many cases of rte_flow command validation/creation.
> > >
> > > An rte_flow rule can now have outer items, inner items, modify
> > > actions, metadata actions, a duplicate action, a fate action and
> > > more in one single flow, thus 512 chars will not be enough
> > > to validate such rte_flow rules.
> > >
> > > Such a change shouldn't affect memory usage, since cmdline
> > > reads repeatedly into the same buffer.
> >
> > I don't get your point here.
The cmdline is an awkward user API. Thomas wanted to replace it but
it seems to have gotten nowhere.
Agree that having something dynamic would be best, something
based on getline() or editline (readline).
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] The mbuf API needs some cleaning up
2020-07-13 9:57 3% [dpdk-dev] The mbuf API needs some cleaning up Morten Brørup
@ 2020-07-31 15:24 0% ` Olivier Matz
2020-08-03 8:42 0% ` Morten Brørup
0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-31 15:24 UTC (permalink / raw)
To: Morten Brørup; +Cc: dev
Hi Morten,
Thanks for the feedback.
On Mon, Jul 13, 2020 at 11:57:38AM +0200, Morten Brørup wrote:
> The MBUF library exposes some macros and constants without the RTE_ prefix. I
> propose cleaning these up, so the better names get into the coming LTS release.
Yes, Thomas talked about it some time ago and he even drafted a patch to
fix it. We can target 20.11 for the changes, but I think we'll have to
keep a compat API until 21.11.
> The worst is:
> #define MBUF_INVALID_PORT UINT16_MAX
>
> I say it's the worst because when we were looking for the official "invalid"
> port value for our application, we didn't find this one. (Probably because its
> documentation is wrong.)
>
> MBUF_INVALID_PORT is defined in rte_mbuf_core.h without any description, and
> in rte_mbuf.h, where it is injected between the rte_pktmbuf_reset() function
> and its description, so the API documentation shows the function's description
> for the constant, and no description for the function.
The one in rte_mbuf_core.h should be kept, with a documentation.
> I propose keeping it at a sensible location in rte_mbuf_core.h only, adding a description, and renaming it to:
> #define RTE_PORT_INVALID UINT16_MAX
I suggest RTE_MBUF_PORT_INVALID
> For backwards compatibility, we could add:
> /* this old name is deprecated */
> #define MBUF_INVALID_PORT RTE_PORT_INVALID
>
> I also wonder why there are no compiler warnings about the double definition?
If the value is the same, the compiler won't complain.
> There are also the data buffer location constants:
> #define EXT_ATTACHED_MBUF (1ULL << 61)
> and
> #define IND_ATTACHED_MBUF (1ULL << 62)
>
>
> There are already macros (with good names) for reading these, so
> simply adding the RTE_ prefix to these two constants suffices.
Some applications use them; we also need a compat here.
> And all the packet offload flags, such as:
> #define PKT_RX_VLAN (1ULL << 0)
>
>
> They are supposed to be used by applications, so I guess we should
> keep them unchanged for ABI stability reasons.
I propose RTE_MBUF_F_<name> for the mbuf flags.
> And the local macro:
> #define MBUF_RAW_ALLOC_CHECK(m) do { \
>
> This might as well be an internal inline function:
> /* internal */
> static inline void
> __rte_mbuf_raw_alloc_check(const struct rte_mbuf *m)
>
agree, I don't think a macro is mandatory here
Thanks,
Olivier
> Or we could keep it a macro and move it next to
> __rte_mbuf_sanity_check(), keeping it clear that it is only relevant when
> RTE_LIBRTE_MBUF_DEBUG is set. But rename it to lower case, similar to the
> __rte_mbuf_sanity_check() macro.
>
>
> Med venlig hilsen / kind regards
> - Morten Brørup
>
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v3] cmdline: increase maximum line length
2020-07-31 12:55 0% ` Olivier Matz
@ 2020-07-31 13:00 0% ` David Marchand
2020-07-31 15:46 0% ` Stephen Hemminger
1 sibling, 0 replies; 200+ results
From: David Marchand @ 2020-07-31 13:00 UTC (permalink / raw)
To: Olivier Matz
Cc: Wisam Jaddo, dev, Raslan, Thomas Monjalon, Iremonger, Bernard,
dpdk stable
On Fri, Jul 31, 2020 at 2:55 PM Olivier Matz <olivier.matz@6wind.com> wrote:
> Resurrecting this old thread.
>
> On Sat, Feb 22, 2020 at 04:28:15PM +0100, David Marchand wrote:
> > This patch is flagged as an ABI breakage:
> > https://travis-ci.com/ovsrobot/dpdk/jobs/289313318#L2273
> >
>
> In case we want this fix for 20.11, should we do a deprecation notice
> in 20.08?
If there is something to change, that would be removing this max size
rather than extending it.
Let's not go the "XX bytes ought to be enough for anybody" way.
--
David Marchand
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v3] cmdline: increase maximum line length
@ 2020-07-31 12:55 0% ` Olivier Matz
2020-07-31 13:00 0% ` David Marchand
2020-07-31 15:46 0% ` Stephen Hemminger
0 siblings, 2 replies; 200+ results
From: Olivier Matz @ 2020-07-31 12:55 UTC (permalink / raw)
To: David Marchand
Cc: Wisam Jaddo, dev, Raslan, Thomas Monjalon, Iremonger, Bernard,
dpdk stable
Hi,
Resurrecting this old thread.
On Sat, Feb 22, 2020 at 04:28:15PM +0100, David Marchand wrote:
> This patch is flagged as an ABI breakage:
> https://travis-ci.com/ovsrobot/dpdk/jobs/289313318#L2273
>
In case we want this fix for 20.11, should we do a deprecation notice
in 20.08?
Olivier
>
> On Thu, Feb 20, 2020 at 3:53 PM Wisam Jaddo <wisamm@mellanox.com> wrote:
> >
> > This increase is due to the use of cmdline in DPDK applications
> > for config commands, such as the rte_flow rule creation testpmd does.
> >
> > The current buffer size is not enough for
> > many cases of rte_flow command validation/creation.
> >
> > An rte_flow rule can now have outer items, inner items, modify
> > actions, metadata actions, a duplicate action, a fate action and
> > more in one single flow, thus 512 chars will not be enough
> > to validate such rte_flow rules.
> >
> > Such a change shouldn't affect memory usage, since cmdline
> > reads repeatedly into the same buffer.
>
> I don't get your point here.
>
>
> > Cc: stable@dpdk.org
>
> This is not a fix.
>
>
> --
> David Marchand
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH 08/27] event/dlb: add definitions shared with LKM or shared code
2020-07-30 19:49 1% ` [dpdk-dev] [PATCH 03/27] event/dlb: add shared code version 10.7.9 McDaniel, Timothy
@ 2020-07-30 19:49 1% ` McDaniel, Timothy
1 sibling, 0 replies; 200+ results
From: McDaniel, Timothy @ 2020-07-30 19:49 UTC (permalink / raw)
To: jerinj
Cc: mattias.ronnblom, dev, gage.eads, harry.van.haaren, McDaniel, Timothy
From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
---
drivers/event/dlb/dlb_user.h | 1083 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 1083 insertions(+)
create mode 100644 drivers/event/dlb/dlb_user.h
diff --git a/drivers/event/dlb/dlb_user.h b/drivers/event/dlb/dlb_user.h
new file mode 100644
index 0000000..73b601b
--- /dev/null
+++ b/drivers/event/dlb/dlb_user.h
@@ -0,0 +1,1083 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_USER_H
+#define __DLB_USER_H
+
+#define DLB_MAX_NAME_LEN 64
+
+#include <linux/types.h>
+
+enum dlb_error {
+ DLB_ST_SUCCESS = 0,
+ DLB_ST_NAME_EXISTS,
+ DLB_ST_DOMAIN_UNAVAILABLE,
+ DLB_ST_LDB_PORTS_UNAVAILABLE,
+ DLB_ST_DIR_PORTS_UNAVAILABLE,
+ DLB_ST_LDB_QUEUES_UNAVAILABLE,
+ DLB_ST_LDB_CREDITS_UNAVAILABLE,
+ DLB_ST_DIR_CREDITS_UNAVAILABLE,
+ DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE,
+ DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE,
+ DLB_ST_SEQUENCE_NUMBERS_UNAVAILABLE,
+ DLB_ST_INVALID_DOMAIN_ID,
+ DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION,
+ DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE,
+ DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_INVALID_LDB_CREDIT_POOL_ID,
+ DLB_ST_INVALID_DIR_CREDIT_POOL_ID,
+ DLB_ST_INVALID_POP_COUNT_VIRT_ADDR,
+ DLB_ST_INVALID_LDB_QUEUE_ID,
+ DLB_ST_INVALID_CQ_DEPTH,
+ DLB_ST_INVALID_CQ_VIRT_ADDR,
+ DLB_ST_INVALID_PORT_ID,
+ DLB_ST_INVALID_QID,
+ DLB_ST_INVALID_PRIORITY,
+ DLB_ST_NO_QID_SLOTS_AVAILABLE,
+ DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_INVALID_DIR_QUEUE_ID,
+ DLB_ST_DIR_QUEUES_UNAVAILABLE,
+ DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK,
+ DLB_ST_INVALID_LDB_CREDIT_QUANTUM,
+ DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK,
+ DLB_ST_INVALID_DIR_CREDIT_QUANTUM,
+ DLB_ST_DOMAIN_NOT_CONFIGURED,
+ DLB_ST_PID_ALREADY_ATTACHED,
+ DLB_ST_PID_NOT_ATTACHED,
+ DLB_ST_INTERNAL_ERROR,
+ DLB_ST_DOMAIN_IN_USE,
+ DLB_ST_IOMMU_MAPPING_ERROR,
+ DLB_ST_FAIL_TO_PIN_MEMORY_PAGE,
+ DLB_ST_UNABLE_TO_PIN_POPCOUNT_PAGES,
+ DLB_ST_UNABLE_TO_PIN_CQ_PAGES,
+ DLB_ST_DISCONTIGUOUS_CQ_MEMORY,
+ DLB_ST_DISCONTIGUOUS_POP_COUNT_MEMORY,
+ DLB_ST_DOMAIN_STARTED,
+ DLB_ST_LARGE_POOL_NOT_SPECIFIED,
+ DLB_ST_SMALL_POOL_NOT_SPECIFIED,
+ DLB_ST_NEITHER_POOL_SPECIFIED,
+ DLB_ST_DOMAIN_NOT_STARTED,
+ DLB_ST_INVALID_MEASUREMENT_DURATION,
+ DLB_ST_INVALID_PERF_METRIC_GROUP_ID,
+ DLB_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES,
+ DLB_ST_DOMAIN_RESET_FAILED,
+ DLB_ST_MBOX_ERROR,
+ DLB_ST_INVALID_HIST_LIST_DEPTH,
+ DLB_ST_NO_MEMORY,
+};
+
+static const char dlb_error_strings[][128] = {
+ "DLB_ST_SUCCESS",
+ "DLB_ST_NAME_EXISTS",
+ "DLB_ST_DOMAIN_UNAVAILABLE",
+ "DLB_ST_LDB_PORTS_UNAVAILABLE",
+ "DLB_ST_DIR_PORTS_UNAVAILABLE",
+ "DLB_ST_LDB_QUEUES_UNAVAILABLE",
+ "DLB_ST_LDB_CREDITS_UNAVAILABLE",
+ "DLB_ST_DIR_CREDITS_UNAVAILABLE",
+ "DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE",
+ "DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE",
+ "DLB_ST_SEQUENCE_NUMBERS_UNAVAILABLE",
+ "DLB_ST_INVALID_DOMAIN_ID",
+ "DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION",
+ "DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE",
+ "DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_INVALID_LDB_CREDIT_POOL_ID",
+ "DLB_ST_INVALID_DIR_CREDIT_POOL_ID",
+ "DLB_ST_INVALID_POP_COUNT_VIRT_ADDR",
+ "DLB_ST_INVALID_LDB_QUEUE_ID",
+ "DLB_ST_INVALID_CQ_DEPTH",
+ "DLB_ST_INVALID_CQ_VIRT_ADDR",
+ "DLB_ST_INVALID_PORT_ID",
+ "DLB_ST_INVALID_QID",
+ "DLB_ST_INVALID_PRIORITY",
+ "DLB_ST_NO_QID_SLOTS_AVAILABLE",
+ "DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_INVALID_DIR_QUEUE_ID",
+ "DLB_ST_DIR_QUEUES_UNAVAILABLE",
+ "DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK",
+ "DLB_ST_INVALID_LDB_CREDIT_QUANTUM",
+ "DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK",
+ "DLB_ST_INVALID_DIR_CREDIT_QUANTUM",
+ "DLB_ST_DOMAIN_NOT_CONFIGURED",
+ "DLB_ST_PID_ALREADY_ATTACHED",
+ "DLB_ST_PID_NOT_ATTACHED",
+ "DLB_ST_INTERNAL_ERROR",
+ "DLB_ST_DOMAIN_IN_USE",
+ "DLB_ST_IOMMU_MAPPING_ERROR",
+ "DLB_ST_FAIL_TO_PIN_MEMORY_PAGE",
+ "DLB_ST_UNABLE_TO_PIN_POPCOUNT_PAGES",
+ "DLB_ST_UNABLE_TO_PIN_CQ_PAGES",
+ "DLB_ST_DISCONTIGUOUS_CQ_MEMORY",
+ "DLB_ST_DISCONTIGUOUS_POP_COUNT_MEMORY",
+ "DLB_ST_DOMAIN_STARTED",
+ "DLB_ST_LARGE_POOL_NOT_SPECIFIED",
+ "DLB_ST_SMALL_POOL_NOT_SPECIFIED",
+ "DLB_ST_NEITHER_POOL_SPECIFIED",
+ "DLB_ST_DOMAIN_NOT_STARTED",
+ "DLB_ST_INVALID_MEASUREMENT_DURATION",
+ "DLB_ST_INVALID_PERF_METRIC_GROUP_ID",
+ "DLB_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES",
+ "DLB_ST_DOMAIN_RESET_FAILED",
+ "DLB_ST_MBOX_ERROR",
+ "DLB_ST_INVALID_HIST_LIST_DEPTH",
+ "DLB_ST_NO_MEMORY",
+};
+
+struct dlb_cmd_response {
+ __u32 status; /* Interpret using enum dlb_error */
+ __u32 id;
+};
+
+/******************************/
+/* 'dlb' device file commands */
+/******************************/
+
+#define DLB_DEVICE_VERSION(x) (((x) >> 8) & 0xFF)
+#define DLB_DEVICE_REVISION(x) ((x) & 0xFF)
+
+enum dlb_revisions {
+ DLB_REV_A0 = 0,
+ DLB_REV_A1 = 1,
+ DLB_REV_A2 = 2,
+ DLB_REV_A3 = 3,
+ DLB_REV_B0 = 4,
+};
+
+/*
+ * DLB_CMD_GET_DEVICE_VERSION: Query the DLB device version.
+ *
+ * This ioctl interface is the same in all driver versions and is always
+ * the first ioctl.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id[7:0]: Device revision.
+ * response.id[15:8]: Device version.
+ */
+
+struct dlb_get_device_version_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+#define DLB_VERSION_MAJOR_NUMBER 10
+#define DLB_VERSION_MINOR_NUMBER 7
+#define DLB_VERSION_REVISION_NUMBER 9
+#define DLB_VERSION (DLB_VERSION_MAJOR_NUMBER << 24 | \
+ DLB_VERSION_MINOR_NUMBER << 16 | \
+ DLB_VERSION_REVISION_NUMBER)
+
+#define DLB_VERSION_GET_MAJOR_NUMBER(x) (((x) >> 24) & 0xFF)
+#define DLB_VERSION_GET_MINOR_NUMBER(x) (((x) >> 16) & 0xFF)
+#define DLB_VERSION_GET_REVISION_NUMBER(x) ((x) & 0xFFFF)
+
+static inline __u8 dlb_version_incompatible(__u32 version)
+{
+ __u8 inc;
+
+ inc = DLB_VERSION_GET_MAJOR_NUMBER(version) != DLB_VERSION_MAJOR_NUMBER;
+ inc |= (int)DLB_VERSION_GET_MINOR_NUMBER(version) <
+ DLB_VERSION_MINOR_NUMBER;
+
+ return inc;
+}
+
+/*
+ * DLB_CMD_GET_DRIVER_VERSION: Query the DLB driver version. The major number
+ * is changed when there is an ABI-breaking change, the minor number is
+ * changed if the API is changed in a backwards-compatible way, and the
+ * revision number is changed for fixes that don't affect the API.
+ *
+ * If the kernel driver's API version major number and the header's
+ * DLB_VERSION_MAJOR_NUMBER differ, the two are incompatible, or if the
+ * major numbers match but the kernel driver's minor number is less than
+ * the header file's, they are incompatible. The DLB_VERSION_INCOMPATIBLE
+ * macro should be used to check for compatibility.
+ *
+ * This ioctl interface is the same in all driver versions. Applications
+ * should check the driver version before performing any other ioctl
+ * operations.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Driver API version. Use the DLB_VERSION_GET_MAJOR_NUMBER,
+ * DLB_VERSION_GET_MINOR_NUMBER, and
+ * DLB_VERSION_GET_REVISION_NUMBER macros to interpret the field.
+ */
+
+struct dlb_get_driver_version_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+/*
+ * DLB_CMD_CREATE_SCHED_DOMAIN: Create a DLB scheduling domain and reserve the
+ * resources (queues, ports, etc.) that it contains.
+ *
+ * Input parameters:
+ * - num_ldb_queues: Number of load-balanced queues.
+ * - num_ldb_ports: Number of load-balanced ports.
+ * - num_dir_ports: Number of directed ports. A directed port has one directed
+ * queue, so no num_dir_queues argument is necessary.
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ * storage for the domain. This storage is divided among the domain's
+ * load-balanced queues that are configured for atomic scheduling.
+ * - num_hist_list_entries: Amount of history list storage. This is divided
+ * among the domain's CQs.
+ * - num_ldb_credits: Amount of load-balanced QE storage (QED). QEs occupy this
+ * space until they are scheduled to a load-balanced CQ. One credit
+ * represents the storage for one QE.
+ * - num_dir_credits: Amount of directed QE storage (DQED). QEs occupy this
+ * space until they are scheduled to a directed CQ. One credit represents
+ * the storage for one QE.
+ * - num_ldb_credit_pools: Number of pools into which the load-balanced credits
+ * are placed.
+ * - num_dir_credit_pools: Number of pools into which the directed credits are
+ * placed.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: domain ID.
+ */
+struct dlb_create_sched_domain_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_ldb_queues;
+ __u32 num_ldb_ports;
+ __u32 num_dir_ports;
+ __u32 num_atomic_inflights;
+ __u32 num_hist_list_entries;
+ __u32 num_ldb_credits;
+ __u32 num_dir_credits;
+ __u32 num_ldb_credit_pools;
+ __u32 num_dir_credit_pools;
+};
+
+/*
+ * DLB_CMD_GET_NUM_RESOURCES: Return the number of available resources
+ * (queues, ports, etc.) that this device owns.
+ *
+ * Output parameters:
+ * - num_domains: Number of available scheduling domains.
+ * - num_ldb_queues: Number of available load-balanced queues.
+ * - num_ldb_ports: Number of available load-balanced ports.
+ * - num_dir_ports: Number of available directed ports. There is one directed
+ * queue for every directed port.
+ * - num_atomic_inflights: Amount of available temporary atomic QE storage.
+ * - max_contiguous_atomic_inflights: When a domain is created, the temporary
+ * atomic QE storage is allocated in a contiguous chunk. This return value
+ * is the longest available contiguous range of atomic QE storage.
+ * - num_hist_list_entries: Amount of history list storage.
+ * - max_contiguous_hist_list_entries: History list storage is allocated in
+ * a contiguous chunk, and this return value is the longest available
+ * contiguous range of history list entries.
+ * - num_ldb_credits: Amount of available load-balanced QE storage.
+ * - max_contiguous_ldb_credits: QED storage is allocated in a contiguous
+ * chunk, and this return value is the longest available contiguous range
+ * of load-balanced credit storage.
+ * - num_dir_credits: Amount of available directed QE storage.
+ * - max_contiguous_dir_credits: DQED storage is allocated in a contiguous
+ * chunk, and this return value is the longest available contiguous range
+ * of directed credit storage.
+ * - num_ldb_credit_pools: Number of available load-balanced credit pools.
+ * - num_dir_credit_pools: Number of available directed credit pools.
+ * - padding0: Reserved for future use.
+ */
+struct dlb_get_num_resources_args {
+ /* Output parameters */
+ __u32 num_sched_domains;
+ __u32 num_ldb_queues;
+ __u32 num_ldb_ports;
+ __u32 num_dir_ports;
+ __u32 num_atomic_inflights;
+ __u32 max_contiguous_atomic_inflights;
+ __u32 num_hist_list_entries;
+ __u32 max_contiguous_hist_list_entries;
+ __u32 num_ldb_credits;
+ __u32 max_contiguous_ldb_credits;
+ __u32 num_dir_credits;
+ __u32 max_contiguous_dir_credits;
+ __u32 num_ldb_credit_pools;
+ __u32 num_dir_credit_pools;
+ __u32 padding0;
+};
+
+/*
+ * DLB_CMD_SET_SN_ALLOCATION: Configure a sequence number group
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - num: Number of sequence numbers per queue.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_set_sn_allocation_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 num;
+};
+
+/*
+ * DLB_CMD_GET_SN_ALLOCATION: Get a sequence number group's configuration
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Specified group's number of sequence numbers per queue.
+ */
+struct dlb_get_sn_allocation_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 padding0;
+};
+
+enum dlb_cq_poll_modes {
+ DLB_CQ_POLL_MODE_STD,
+ DLB_CQ_POLL_MODE_SPARSE,
+
+ /* NUM_DLB_CQ_POLL_MODE must be last */
+ NUM_DLB_CQ_POLL_MODE,
+};
+
+/*
+ * DLB_CMD_QUERY_CQ_POLL_MODE: Query the CQ poll mode the kernel driver is using
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: CQ poll mode (see enum dlb_cq_poll_modes).
+ */
+struct dlb_query_cq_poll_mode_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+/*
+ * DLB_CMD_GET_SN_OCCUPANCY: Get a sequence number group's occupancy
+ *
+ * Each sequence number group has one or more slots, depending on its
+ * configuration. I.e.:
+ * - If configured for 1024 sequence numbers per queue, the group has 1 slot
+ * - If configured for 512 sequence numbers per queue, the group has 2 slots
+ * ...
+ * - If configured for 32 sequence numbers per queue, the group has 32 slots
+ *
+ * This ioctl returns the group's number of in-use slots. If its occupancy is
+ * 0, the group's sequence number allocation can be reconfigured.
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Specified group's number of used slots.
+ */
+struct dlb_get_sn_occupancy_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 padding0;
+};
+
+enum dlb_user_interface_commands {
+ DLB_CMD_GET_DEVICE_VERSION,
+ DLB_CMD_CREATE_SCHED_DOMAIN,
+ DLB_CMD_GET_NUM_RESOURCES,
+ DLB_CMD_GET_DRIVER_VERSION,
+ DLB_CMD_SAMPLE_PERF_COUNTERS,
+ DLB_CMD_SET_SN_ALLOCATION,
+ DLB_CMD_GET_SN_ALLOCATION,
+ DLB_CMD_MEASURE_SCHED_COUNTS,
+ DLB_CMD_QUERY_CQ_POLL_MODE,
+ DLB_CMD_GET_SN_OCCUPANCY,
+
+ /* NUM_DLB_CMD must be last */
+ NUM_DLB_CMD,
+};
+
+/*******************************/
+/* 'domain' device file alerts */
+/*******************************/
+
+/* Scheduling domain device files can be read to receive domain-specific
+ * notifications, for alerts such as hardware errors.
+ *
+ * Each alert is encoded in a 16B message. The first 8B contains the alert ID,
+ * and the second 8B is optional and contains additional information.
+ * Applications should cast read data to a struct dlb_domain_alert, and
+ * interpret the struct's alert_id according to dlb_domain_alert_id. The read
+ * length must be 16B, or the function will return -EINVAL.
+ *
+ * Reads are destructive, and in the case of multiple file descriptors for the
+ * same domain device file, an alert will be read by only one of the file
+ * descriptors.
+ *
+ * The driver stores alerts in a fixed-size alert ring until they are read. If
+ * the alert ring fills completely, subsequent alerts will be dropped. It is
+ * recommended that DLB applications dedicate a thread to perform blocking
+ * reads on the device file.
+ */
+enum dlb_domain_alert_id {
+ /* A destination domain queue that this domain connected to has
+ * unregistered, and can no longer be sent to. The aux alert data
+ * contains the queue ID.
+ */
+ DLB_DOMAIN_ALERT_REMOTE_QUEUE_UNREGISTER,
+ /* A producer port in this domain attempted to send a QE without a
+ * credit. aux_alert_data[7:0] contains the port ID, and
+ * aux_alert_data[15:8] contains a flag indicating whether the port is
+ * load-balanced (1) or directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS,
+ /* Software issued an illegal enqueue for a port in this domain. An
+ * illegal enqueue could be:
+ * - Illegal (excess) completion
+ * - Illegal fragment
+ * - Illegal enqueue command
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ,
+ /* Software issued excess CQ token pops for a port in this domain.
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS,
+ * An enqueue contained either an invalid command encoding or a REL,
+ * REL_T, RLS, FWD, FWD_T, FRAG, or FRAG_T from a directed port.
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_ILLEGAL_HCW,
+ /* The QID must be valid and less than 128.
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_ILLEGAL_QID,
+ /* An enqueue went to a disabled QID.
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_DISABLED_QID,
+ /* The device containing this domain was reset. All applications using
+ * the device need to exit for the driver to complete the reset
+ * procedure.
+ *
+ * aux_alert_data doesn't contain any information for this alert.
+ */
+ DLB_DOMAIN_ALERT_DEVICE_RESET,
+ /* User-space has enqueued an alert.
+ *
+ * aux_alert_data contains user-provided data.
+ */
+ DLB_DOMAIN_ALERT_USER,
+
+ /* Number of DLB domain alerts */
+ NUM_DLB_DOMAIN_ALERTS
+};
+
+static const char dlb_domain_alert_strings[][128] = {
+ "DLB_DOMAIN_ALERT_REMOTE_QUEUE_UNREGISTER",
+ "DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS",
+ "DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ",
+ "DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS",
+ "DLB_DOMAIN_ALERT_ILLEGAL_HCW",
+ "DLB_DOMAIN_ALERT_ILLEGAL_QID",
+ "DLB_DOMAIN_ALERT_DISABLED_QID",
+ "DLB_DOMAIN_ALERT_DEVICE_RESET",
+ "DLB_DOMAIN_ALERT_USER",
+};
+
+struct dlb_domain_alert {
+ __u64 alert_id;
+ __u64 aux_alert_data;
+};
+
+/*********************************/
+/* 'domain' device file commands */
+/*********************************/
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_POOL: Configure a load-balanced credit pool.
+ * Input parameters:
+ * - num_ldb_credits: Number of load-balanced credits (QED space) for this
+ * pool.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: pool ID.
+ */
+struct dlb_create_ldb_pool_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_ldb_credits;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_POOL: Configure a directed credit pool.
+ * Input parameters:
+ * - num_dir_credits: Number of directed credits (DQED space) for this pool.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Pool ID.
+ */
+struct dlb_create_dir_pool_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_dir_credits;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_QUEUE: Configure a load-balanced queue.
+ * Input parameters:
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ * storage for this queue. If zero, the queue will not support atomic
+ * scheduling.
+ * - num_sequence_numbers: This specifies the number of sequence numbers used
+ * by this queue. If zero, the queue will not support ordered scheduling.
+ * If non-zero, the queue will not support unordered scheduling.
+ * - num_qid_inflights: The maximum number of QEs that can be inflight
+ * (scheduled to a CQ but not completed) at any time. If
+ * num_sequence_numbers is non-zero, num_qid_inflights must be set equal
+ * to num_sequence_numbers.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Queue ID.
+ */
+struct dlb_create_ldb_queue_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_sequence_numbers;
+ __u32 num_qid_inflights;
+ __u32 num_atomic_inflights;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_QUEUE: Configure a directed queue.
+ * Input parameters:
+ * - port_id: Port ID. If the corresponding directed port is already created,
+ * specify its ID here. Else this argument must be 0xFFFFFFFF to indicate
+ * that the queue is being created before the port.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Queue ID.
+ */
+struct dlb_create_dir_queue_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __s32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_PORT: Configure a load-balanced port.
+ * Input parameters:
+ * - ldb_credit_pool_id: Load-balanced credit pool this port will belong to.
+ * - dir_credit_pool_id: Directed credit pool this port will belong to.
+ * - ldb_credit_high_watermark: Number of load-balanced credits from the pool
+ * that this port will own.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_high_watermark: Number of directed credits from the pool that
+ * this port will own.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - ldb_credit_low_watermark: Load-balanced credit low watermark. When the
+ * port's credits reach this watermark, they become eligible to be
+ * refilled by the DLB as credits until the high watermark
+ * (num_ldb_credits) is reached.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_low_watermark: Directed credit low watermark. When the port's
+ * credits reach this watermark, they become eligible to be refilled by
+ * the DLB as credits until the high watermark (num_dir_credits) is
+ * reached.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - ldb_credit_quantum: Number of load-balanced credits for the DLB to refill
+ * per refill operation.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_quantum: Number of directed credits for the DLB to refill per
+ * refill operation.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - padding0: Reserved for future use.
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 8 and
+ * 1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ * the CQ interrupt won't fire until there are N or more outstanding CQ
+ * tokens.
+ * - cq_history_list_size: Number of history list entries. This must be greater
+ * than or equal to cq_depth.
+ * - padding1: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: port ID.
+ */
+struct dlb_create_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 ldb_credit_pool_id;
+ __u32 dir_credit_pool_id;
+ __u16 ldb_credit_high_watermark;
+ __u16 ldb_credit_low_watermark;
+ __u16 ldb_credit_quantum;
+ __u16 dir_credit_high_watermark;
+ __u16 dir_credit_low_watermark;
+ __u16 dir_credit_quantum;
+ __u16 padding0;
+ __u16 cq_depth;
+ __u16 cq_depth_threshold;
+ __u16 cq_history_list_size;
+ __u32 padding1;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_PORT: Configure a directed port.
+ * Input parameters:
+ * - ldb_credit_pool_id: Load-balanced credit pool this port will belong to.
+ * - dir_credit_pool_id: Directed credit pool this port will belong to.
+ * - ldb_credit_high_watermark: Number of load-balanced credits from the pool
+ * that this port will own.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_high_watermark: Number of directed credits from the pool that
+ * this port will own.
+ * - ldb_credit_low_watermark: Load-balanced credit low watermark. When the
+ * port's credits reach this watermark, they become eligible to be
+ * refilled by the DLB as credits until the high watermark
+ * (num_ldb_credits) is reached.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_low_watermark: Directed credit low watermark. When the port's
+ * credits reach this watermark, they become eligible to be refilled by
+ * the DLB as credits until the high watermark (num_dir_credits) is
+ * reached.
+ * - ldb_credit_quantum: Number of load-balanced credits for the DLB to refill
+ * per refill operation.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_quantum: Number of directed credits for the DLB to refill per
+ * refill operation.
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 8 and
+ * 1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ * the CQ interrupt won't fire until there are N or more outstanding CQ
+ * tokens.
+ * - queue_id: Queue ID. If the corresponding directed queue is already created,
+ * specify its ID here. Else this argument must be 0xFFFFFFFF to indicate
+ * that the port is being created before the queue.
+ * - padding1: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Port ID.
+ */
+struct dlb_create_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 ldb_credit_pool_id;
+ __u32 dir_credit_pool_id;
+ __u16 ldb_credit_high_watermark;
+ __u16 ldb_credit_low_watermark;
+ __u16 ldb_credit_quantum;
+ __u16 dir_credit_high_watermark;
+ __u16 dir_credit_low_watermark;
+ __u16 dir_credit_quantum;
+ __u16 cq_depth;
+ __u16 cq_depth_threshold;
+ __s32 queue_id;
+ __u32 padding1;
+};
+
+/*
+ * DLB_DOMAIN_CMD_START_DOMAIN: Mark the end of the domain configuration. This
+ * must be called before passing QEs into the device, and no configuration
+ * ioctls can be issued once the domain has started. Sending QEs into the
+ * device before calling this ioctl will result in undefined behavior.
+ * Input parameters:
+ * - (None)
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_start_domain_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+};
+
+/*
+ * DLB_DOMAIN_CMD_MAP_QID: Map a load-balanced queue to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ * - priority: Queue->port service priority.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_map_qid_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 qid;
+ __u32 priority;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_UNMAP_QID: Unmap a load-balanced queue from a load-balanced
+ * port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_unmap_qid_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 qid;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENABLE_LDB_PORT: Enable scheduling to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enable_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENABLE_DIR_PORT: Enable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enable_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+};
+
+/*
+ * DLB_DOMAIN_CMD_DISABLE_LDB_PORT: Disable scheduling to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_disable_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_DISABLE_DIR_PORT: Disable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_disable_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT: Block on a CQ interrupt until a QE
+ * arrives for the specified port. If a QE is already present, the ioctl
+ * will immediately return.
+ *
+ * Note: Only one thread can block on a CQ's interrupt at a time. Doing
+ * otherwise can result in hung threads.
+ *
+ * Input parameters:
+ * - port_id: Port ID.
+ * - is_ldb: True if the port is load-balanced, false otherwise.
+ * - arm: Tell the driver to arm the interrupt.
+ * - cq_gen: Current CQ generation bit.
+ * - padding0: Reserved for future use.
+ * - cq_va: VA of the CQ entry where the next QE will be placed.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_block_on_cq_interrupt_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u8 is_ldb;
+ __u8 arm;
+ __u8 cq_gen;
+ __u8 padding0;
+ __u64 cq_va;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT: Enqueue a domain alert that will be
+ * read by one reader thread.
+ *
+ * Input parameters:
+ * - aux_alert_data: user-defined auxiliary data.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enqueue_domain_alert_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u64 aux_alert_data;
+};
+
+/*
+ * DLB_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
+ * Input parameters:
+ * - queue_id: The load-balanced queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: queue depth.
+ */
+struct dlb_get_ldb_queue_depth_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 queue_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH: Get a directed queue's depth.
+ * Input parameters:
+ * - queue_id: The directed queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: queue depth.
+ */
+struct dlb_get_dir_queue_depth_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 queue_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_PENDING_PORT_UNMAPS: Get number of queue unmap operations in
+ * progress for a load-balanced port.
+ *
+ * Note: This is a snapshot; the number of unmap operations in progress
+ * is subject to change at any time.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: number of unmaps in progress.
+ */
+struct dlb_pending_port_unmaps_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+enum dlb_domain_user_interface_commands {
+ DLB_DOMAIN_CMD_CREATE_LDB_POOL,
+ DLB_DOMAIN_CMD_CREATE_DIR_POOL,
+ DLB_DOMAIN_CMD_CREATE_LDB_QUEUE,
+ DLB_DOMAIN_CMD_CREATE_DIR_QUEUE,
+ DLB_DOMAIN_CMD_CREATE_LDB_PORT,
+ DLB_DOMAIN_CMD_CREATE_DIR_PORT,
+ DLB_DOMAIN_CMD_START_DOMAIN,
+ DLB_DOMAIN_CMD_MAP_QID,
+ DLB_DOMAIN_CMD_UNMAP_QID,
+ DLB_DOMAIN_CMD_ENABLE_LDB_PORT,
+ DLB_DOMAIN_CMD_ENABLE_DIR_PORT,
+ DLB_DOMAIN_CMD_DISABLE_LDB_PORT,
+ DLB_DOMAIN_CMD_DISABLE_DIR_PORT,
+ DLB_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,
+ DLB_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT,
+ DLB_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
+ DLB_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
+ DLB_DOMAIN_CMD_PENDING_PORT_UNMAPS,
+
+ /* NUM_DLB_DOMAIN_CMD must be last */
+ NUM_DLB_DOMAIN_CMD,
+};
+
+/*
+ * Base addresses for memory mapping the consumer queue (CQ) and popcount (PC)
+ * memory space, and producer port (PP) MMIO space. The CQ, PC, and PP
+ * addresses are per-port. Every address is page-separated (e.g. LDB PP 0 is at
+ * 0x2100000 and LDB PP 1 is at 0x2101000).
+ */
+#define DLB_LDB_CQ_BASE 0x3000000
+#define DLB_LDB_CQ_MAX_SIZE 65536
+#define DLB_LDB_CQ_OFFS(id) (DLB_LDB_CQ_BASE + (id) * DLB_LDB_CQ_MAX_SIZE)
+
+#define DLB_DIR_CQ_BASE 0x3800000
+#define DLB_DIR_CQ_MAX_SIZE 65536
+#define DLB_DIR_CQ_OFFS(id) (DLB_DIR_CQ_BASE + (id) * DLB_DIR_CQ_MAX_SIZE)
+
+#define DLB_LDB_PC_BASE 0x2300000
+#define DLB_LDB_PC_MAX_SIZE 4096
+#define DLB_LDB_PC_OFFS(id) (DLB_LDB_PC_BASE + (id) * DLB_LDB_PC_MAX_SIZE)
+
+#define DLB_DIR_PC_BASE 0x2200000
+#define DLB_DIR_PC_MAX_SIZE 4096
+#define DLB_DIR_PC_OFFS(id) (DLB_DIR_PC_BASE + (id) * DLB_DIR_PC_MAX_SIZE)
+
+#define DLB_LDB_PP_BASE 0x2100000
+#define DLB_LDB_PP_MAX_SIZE 4096
+#define DLB_LDB_PP_OFFS(id) (DLB_LDB_PP_BASE + (id) * DLB_LDB_PP_MAX_SIZE)
+
+#define DLB_DIR_PP_BASE 0x2000000
+#define DLB_DIR_PP_MAX_SIZE 4096
+#define DLB_DIR_PP_OFFS(id) (DLB_DIR_PP_BASE + (id) * DLB_DIR_PP_MAX_SIZE)
+
+#endif /* __DLB_USER_H */
--
1.7.10
* [dpdk-dev] [PATCH 03/27] event/dlb: add shared code version 10.7.9
@ 2020-07-30 19:49 1% ` McDaniel, Timothy
2020-07-30 19:49 1% ` [dpdk-dev] [PATCH 08/27] event/dlb: add definitions shared with LKM or shared code McDaniel, Timothy
1 sibling, 0 replies; 200+ results
From: McDaniel, Timothy @ 2020-07-30 19:49 UTC (permalink / raw)
To: jerinj
Cc: mattias.ronnblom, dev, gage.eads, harry.van.haaren, McDaniel, Timothy
From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
The DLB shared code is auto generated by Intel, and is being committed
here so that it can be built in the DPDK environment. The shared code
should not be modified. The shared code must be present in order to
successfully build the DLB PMD.
Changes since v1 patch series
1) convert C99 comment to standard C
2) remove TODO and FIXME comments
3) converted to use same log i/f as PMD
4) disable PF->VF ISR pending access alarm
5) disable VF->PF ISR pending access alarm
Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
---
drivers/event/dlb/pf/base/dlb_hw_types.h | 360 +
drivers/event/dlb/pf/base/dlb_mbox.h | 645 ++
drivers/event/dlb/pf/base/dlb_osdep.h | 347 +
drivers/event/dlb/pf/base/dlb_osdep_bitmap.h | 442 ++
drivers/event/dlb/pf/base/dlb_osdep_list.h | 131 +
drivers/event/dlb/pf/base/dlb_osdep_types.h | 31 +
drivers/event/dlb/pf/base/dlb_regs.h | 2678 +++++++
drivers/event/dlb/pf/base/dlb_resource.c | 9722 ++++++++++++++++++++++++++
drivers/event/dlb/pf/base/dlb_resource.h | 1639 +++++
drivers/event/dlb/pf/base/dlb_user.h | 1084 +++
10 files changed, 17079 insertions(+)
create mode 100644 drivers/event/dlb/pf/base/dlb_hw_types.h
create mode 100644 drivers/event/dlb/pf/base/dlb_mbox.h
create mode 100644 drivers/event/dlb/pf/base/dlb_osdep.h
create mode 100644 drivers/event/dlb/pf/base/dlb_osdep_bitmap.h
create mode 100644 drivers/event/dlb/pf/base/dlb_osdep_list.h
create mode 100644 drivers/event/dlb/pf/base/dlb_osdep_types.h
create mode 100644 drivers/event/dlb/pf/base/dlb_regs.h
create mode 100644 drivers/event/dlb/pf/base/dlb_resource.c
create mode 100644 drivers/event/dlb/pf/base/dlb_resource.h
create mode 100644 drivers/event/dlb/pf/base/dlb_user.h
diff --git a/drivers/event/dlb/pf/base/dlb_hw_types.h b/drivers/event/dlb/pf/base/dlb_hw_types.h
new file mode 100644
index 0000000..d56590e
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_hw_types.h
@@ -0,0 +1,360 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_HW_TYPES_H
+#define __DLB_HW_TYPES_H
+
+#include "dlb_user.h"
+#include "dlb_osdep_types.h"
+#include "dlb_osdep_list.h"
+
+#define DLB_MAX_NUM_VFS 16
+#define DLB_MAX_NUM_DOMAINS 32
+#define DLB_MAX_NUM_LDB_QUEUES 128
+#define DLB_MAX_NUM_LDB_PORTS 64
+#define DLB_MAX_NUM_DIR_PORTS 128
+#define DLB_MAX_NUM_LDB_CREDITS 16384
+#define DLB_MAX_NUM_DIR_CREDITS 4096
+#define DLB_MAX_NUM_LDB_CREDIT_POOLS 64
+#define DLB_MAX_NUM_DIR_CREDIT_POOLS 64
+#define DLB_MAX_NUM_HIST_LIST_ENTRIES 5120
+#define DLB_MAX_NUM_AQOS_ENTRIES 2048
+#define DLB_MAX_NUM_TOTAL_OUTSTANDING_COMPLETIONS 4096
+#define DLB_MAX_NUM_QIDS_PER_LDB_CQ 8
+#define DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS 4
+#define DLB_MAX_NUM_SEQUENCE_NUMBER_MODES 6
+#define DLB_QID_PRIORITIES 8
+#define DLB_NUM_ARB_WEIGHTS 8
+#define DLB_MAX_WEIGHT 255
+#define DLB_MAX_PORT_CREDIT_QUANTUM 1023
+#define DLB_MAX_CQ_COMP_CHECK_LOOPS 409600
+#define DLB_MAX_QID_EMPTY_CHECK_LOOPS (32 * 64 * 1024 * (800 / 30))
+#define DLB_HZ 800000000
+
+/* Used for DLB A-stepping workaround for hardware write buffer lock-up issue */
+#define DLB_A_STEP_MAX_PORTS 128
+
+#define DLB_PF_DEV_ID 0x270B
+#define DLB_VF_DEV_ID 0x270C
+
+/* Interrupt related macros */
+#define DLB_PF_NUM_NON_CQ_INTERRUPT_VECTORS 8
+#define DLB_PF_NUM_CQ_INTERRUPT_VECTORS 64
+#define DLB_PF_TOTAL_NUM_INTERRUPT_VECTORS \
+ (DLB_PF_NUM_NON_CQ_INTERRUPT_VECTORS + \
+ DLB_PF_NUM_CQ_INTERRUPT_VECTORS)
+#define DLB_PF_NUM_COMPRESSED_MODE_VECTORS \
+ (DLB_PF_NUM_NON_CQ_INTERRUPT_VECTORS + 1)
+#define DLB_PF_NUM_PACKED_MODE_VECTORS DLB_PF_TOTAL_NUM_INTERRUPT_VECTORS
+#define DLB_PF_COMPRESSED_MODE_CQ_VECTOR_ID DLB_PF_NUM_NON_CQ_INTERRUPT_VECTORS
+
+#define DLB_VF_NUM_NON_CQ_INTERRUPT_VECTORS 1
+#define DLB_VF_NUM_CQ_INTERRUPT_VECTORS 31
+#define DLB_VF_BASE_CQ_VECTOR_ID 0
+#define DLB_VF_LAST_CQ_VECTOR_ID 30
+#define DLB_VF_MBOX_VECTOR_ID 31
+#define DLB_VF_TOTAL_NUM_INTERRUPT_VECTORS \
+ (DLB_VF_NUM_NON_CQ_INTERRUPT_VECTORS + \
+ DLB_VF_NUM_CQ_INTERRUPT_VECTORS)
+
+#define DLB_PF_NUM_ALARM_INTERRUPT_VECTORS 4
+/* DLB ALARM interrupts */
+#define DLB_INT_ALARM 0
+/* VF to PF Mailbox Service Request */
+#define DLB_INT_VF_TO_PF_MBOX 1
+/* HCW Ingress Errors */
+#define DLB_INT_INGRESS_ERROR 3
+
+#define DLB_ALARM_HW_SOURCE_SYS 0
+#define DLB_ALARM_HW_SOURCE_DLB 1
+
+#define DLB_ALARM_HW_UNIT_CHP 1
+#define DLB_ALARM_HW_UNIT_LSP 3
+
+#define DLB_ALARM_HW_CHP_AID_OUT_OF_CREDITS 6
+#define DLB_ALARM_HW_CHP_AID_ILLEGAL_ENQ 7
+#define DLB_ALARM_HW_LSP_AID_EXCESS_TOKEN_POPS 15
+#define DLB_ALARM_SYS_AID_ILLEGAL_HCW 0
+#define DLB_ALARM_SYS_AID_ILLEGAL_QID 3
+#define DLB_ALARM_SYS_AID_DISABLED_QID 4
+#define DLB_ALARM_SYS_AID_ILLEGAL_CQID 6
+
+/* Hardware-defined base addresses */
+#define DLB_LDB_PP_BASE 0x2100000
+#define DLB_LDB_PP_STRIDE 0x1000
+#define DLB_LDB_PP_BOUND \
+ (DLB_LDB_PP_BASE + DLB_LDB_PP_STRIDE * DLB_MAX_NUM_LDB_PORTS)
+#define DLB_DIR_PP_BASE 0x2000000
+#define DLB_DIR_PP_STRIDE 0x1000
+#define DLB_DIR_PP_BOUND \
+ (DLB_DIR_PP_BASE + DLB_DIR_PP_STRIDE * DLB_MAX_NUM_DIR_PORTS)
+
+struct dlb_resource_id {
+ u32 phys_id;
+ u32 virt_id;
+ u8 vf_owned;
+ u8 vf_id;
+};
+
+struct dlb_freelist {
+ u32 base;
+ u32 bound;
+ u32 offset;
+};
+
+static inline u32 dlb_freelist_count(struct dlb_freelist *list)
+{
+ return (list->bound - list->base) - list->offset;
+}
+
+struct dlb_hcw {
+ u64 data;
+ /* Word 3 */
+ u16 opaque;
+ u8 qid;
+ u8 sched_type:2;
+ u8 priority:3;
+ u8 msg_type:3;
+ /* Word 4 */
+ u16 lock_id;
+ u8 meas_lat:1;
+ u8 rsvd1:2;
+ u8 no_dec:1;
+ u8 cmp_id:4;
+ u8 cq_token:1;
+ u8 qe_comp:1;
+ u8 qe_frag:1;
+ u8 qe_valid:1;
+ u8 int_arm:1;
+ u8 error:1;
+ u8 rsvd:2;
+};
+
+struct dlb_ldb_queue {
+ struct dlb_list_entry domain_list;
+ struct dlb_list_entry func_list;
+ struct dlb_resource_id id;
+ struct dlb_resource_id domain_id;
+ u32 num_qid_inflights;
+ struct dlb_freelist aqed_freelist;
+ u8 sn_cfg_valid;
+ u32 sn_group;
+ u32 sn_slot;
+ u32 num_mappings;
+ u8 num_pending_additions;
+ u8 owned;
+ u8 configured;
+};
+
+/* Directed ports and queues are paired by nature, so the driver tracks them
+ * with a single data structure.
+ */
+struct dlb_dir_pq_pair {
+ struct dlb_list_entry domain_list;
+ struct dlb_list_entry func_list;
+ struct dlb_resource_id id;
+ struct dlb_resource_id domain_id;
+ u8 ldb_pool_used;
+ u8 dir_pool_used;
+ u8 queue_configured;
+ u8 port_configured;
+ u8 owned;
+ u8 enabled;
+ u32 ref_cnt;
+};
+
+enum dlb_qid_map_state {
+ /* The slot doesn't contain a valid queue mapping */
+ DLB_QUEUE_UNMAPPED,
+ /* The slot contains a valid queue mapping */
+ DLB_QUEUE_MAPPED,
+ /* The driver is mapping a queue into this slot */
+ DLB_QUEUE_MAP_IN_PROGRESS,
+ /* The driver is unmapping a queue from this slot */
+ DLB_QUEUE_UNMAP_IN_PROGRESS,
+ /* The driver is unmapping a queue from this slot, and once complete
+ * will replace it with another mapping.
+ */
+ DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP,
+};
+
+struct dlb_ldb_port_qid_map {
+ u16 qid;
+ u8 priority;
+ u16 pending_qid;
+ u8 pending_priority;
+ enum dlb_qid_map_state state;
+};
+
+struct dlb_ldb_port {
+ struct dlb_list_entry domain_list;
+ struct dlb_list_entry func_list;
+ struct dlb_resource_id id;
+ struct dlb_resource_id domain_id;
+ u8 ldb_pool_used;
+ u8 dir_pool_used;
+ u8 init_tkn_cnt;
+ u32 hist_list_entry_base;
+ u32 hist_list_entry_limit;
+ /* The qid_map represents the hardware QID mapping state. */
+ struct dlb_ldb_port_qid_map qid_map[DLB_MAX_NUM_QIDS_PER_LDB_CQ];
+ u32 ref_cnt;
+ u8 num_pending_removals;
+ u8 num_mappings;
+ u8 owned;
+ u8 enabled;
+ u8 configured;
+};
+
+struct dlb_credit_pool {
+ struct dlb_list_entry domain_list;
+ struct dlb_list_entry func_list;
+ struct dlb_resource_id id;
+ struct dlb_resource_id domain_id;
+ u32 total_credits;
+ u32 avail_credits;
+ u8 owned;
+ u8 configured;
+};
+
+struct dlb_sn_group {
+ u32 mode;
+ u32 sequence_numbers_per_queue;
+ u32 slot_use_bitmap;
+ u32 id;
+};
+
+static inline bool dlb_sn_group_full(struct dlb_sn_group *group)
+{
+ u32 mask[6] = {
+ 0xffffffff, /* 32 SNs per queue */
+ 0x0000ffff, /* 64 SNs per queue */
+ 0x000000ff, /* 128 SNs per queue */
+ 0x0000000f, /* 256 SNs per queue */
+ 0x00000003, /* 512 SNs per queue */
+ 0x00000001}; /* 1024 SNs per queue */
+
+ return group->slot_use_bitmap == mask[group->mode];
+}
+
+static inline int dlb_sn_group_alloc_slot(struct dlb_sn_group *group)
+{
+ int bound[6] = {32, 16, 8, 4, 2, 1};
+ int i;
+
+ for (i = 0; i < bound[group->mode]; i++) {
+ if (!(group->slot_use_bitmap & (1 << i))) {
+ group->slot_use_bitmap |= 1 << i;
+ return i;
+ }
+ }
+
+ return -1;
+}
+
+static inline void dlb_sn_group_free_slot(struct dlb_sn_group *group, int slot)
+{
+ group->slot_use_bitmap &= ~(1 << slot);
+}
+
+static inline int dlb_sn_group_used_slots(struct dlb_sn_group *group)
+{
+ int i, cnt = 0;
+
+ for (i = 0; i < 32; i++)
+ cnt += !!(group->slot_use_bitmap & (1 << i));
+
+ return cnt;
+}
+
+struct dlb_domain {
+ struct dlb_function_resources *parent_func;
+ struct dlb_list_entry func_list;
+ struct dlb_list_head used_ldb_queues;
+ struct dlb_list_head used_ldb_ports;
+ struct dlb_list_head used_dir_pq_pairs;
+ struct dlb_list_head used_ldb_credit_pools;
+ struct dlb_list_head used_dir_credit_pools;
+ struct dlb_list_head avail_ldb_queues;
+ struct dlb_list_head avail_ldb_ports;
+ struct dlb_list_head avail_dir_pq_pairs;
+ struct dlb_list_head avail_ldb_credit_pools;
+ struct dlb_list_head avail_dir_credit_pools;
+ u32 total_hist_list_entries;
+ u32 avail_hist_list_entries;
+ u32 hist_list_entry_base;
+ u32 hist_list_entry_offset;
+ struct dlb_freelist qed_freelist;
+ struct dlb_freelist dqed_freelist;
+ struct dlb_freelist aqed_freelist;
+ struct dlb_resource_id id;
+ int num_pending_removals;
+ int num_pending_additions;
+ u8 configured;
+ u8 started;
+};
+
+struct dlb_bitmap;
+
+struct dlb_function_resources {
+ u32 num_avail_domains;
+ struct dlb_list_head avail_domains;
+ struct dlb_list_head used_domains;
+ u32 num_avail_ldb_queues;
+ struct dlb_list_head avail_ldb_queues;
+ u32 num_avail_ldb_ports;
+ struct dlb_list_head avail_ldb_ports;
+ u32 num_avail_dir_pq_pairs;
+ struct dlb_list_head avail_dir_pq_pairs;
+ struct dlb_bitmap *avail_hist_list_entries;
+ struct dlb_bitmap *avail_qed_freelist_entries;
+ struct dlb_bitmap *avail_dqed_freelist_entries;
+ struct dlb_bitmap *avail_aqed_freelist_entries;
+ u32 num_avail_ldb_credit_pools;
+ struct dlb_list_head avail_ldb_credit_pools;
+ u32 num_avail_dir_credit_pools;
+ struct dlb_list_head avail_dir_credit_pools;
+ u32 num_enabled_ldb_ports; /* (PF only) */
+ u8 locked; /* (VF only) */
+};
+
+/* After initialization, each resource in dlb_hw_resources is located in one of
+ * the following lists:
+ * -- The PF's available resources list. These are unconfigured resources owned
+ * by the PF and not allocated to a DLB scheduling domain.
+ * -- A VF's available resources list. These are VF-owned unconfigured
+ * resources not allocated to a DLB scheduling domain.
+ * -- A domain's available resources list. These are domain-owned unconfigured
+ * resources.
+ * -- A domain's used resources list. These are domain-owned configured
+ * resources.
+ *
+ * A resource moves to a new list when a VF or domain is created or destroyed,
+ * or when the resource is configured.
+ */
+struct dlb_hw_resources {
+ struct dlb_ldb_queue ldb_queues[DLB_MAX_NUM_LDB_QUEUES];
+ struct dlb_ldb_port ldb_ports[DLB_MAX_NUM_LDB_PORTS];
+ struct dlb_dir_pq_pair dir_pq_pairs[DLB_MAX_NUM_DIR_PORTS];
+ struct dlb_credit_pool ldb_credit_pools[DLB_MAX_NUM_LDB_CREDIT_POOLS];
+ struct dlb_credit_pool dir_credit_pools[DLB_MAX_NUM_DIR_CREDIT_POOLS];
+ struct dlb_sn_group sn_groups[DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS];
+};
+
+struct dlb_hw {
+ /* BAR 0 address */
+ void *csr_kva;
+ unsigned long csr_phys_addr;
+ /* BAR 2 address */
+ void *func_kva;
+ unsigned long func_phys_addr;
+
+ /* Resource tracking */
+ struct dlb_hw_resources rsrcs;
+ struct dlb_function_resources pf;
+ struct dlb_function_resources vf[DLB_MAX_NUM_VFS];
+ struct dlb_domain domains[DLB_MAX_NUM_DOMAINS];
+};
+
+#endif /* __DLB_HW_TYPES_H */
diff --git a/drivers/event/dlb/pf/base/dlb_mbox.h b/drivers/event/dlb/pf/base/dlb_mbox.h
new file mode 100644
index 0000000..e195526
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_mbox.h
@@ -0,0 +1,645 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_BASE_DLB_MBOX_H
+#define __DLB_BASE_DLB_MBOX_H
+
+#include "dlb_regs.h"
+#include "dlb_osdep_types.h"
+
+#define DLB_MBOX_INTERFACE_VERSION 1
+
+/* The PF uses its PF->VF mailbox to send responses to VF requests, as well as
+ * to send requests of its own (e.g. notifying a VF of an impending FLR).
+ * To avoid communication race conditions, e.g. the PF sends a response and then
+ * sends a request before the VF reads the response, the PF->VF mailbox is
+ * divided into two sections:
+ * - Bytes 0-47: PF responses
+ * - Bytes 48-63: PF requests
+ *
+ * Partitioning the PF->VF mailbox allows responses and requests to occupy the
+ * mailbox simultaneously.
+ */
+#define DLB_PF2VF_RESP_BYTES 48
+#define DLB_PF2VF_RESP_BASE 0
+#define DLB_PF2VF_RESP_BASE_WORD (DLB_PF2VF_RESP_BASE / 4)
+
+#define DLB_PF2VF_REQ_BYTES \
+ (DLB_FUNC_PF_PF2VF_MAILBOX_BYTES - DLB_PF2VF_RESP_BYTES)
+#define DLB_PF2VF_REQ_BASE DLB_PF2VF_RESP_BYTES
+#define DLB_PF2VF_REQ_BASE_WORD (DLB_PF2VF_REQ_BASE / 4)
+
+/* Similarly, the VF->PF mailbox is divided into two sections:
+ * - Bytes 0-239: VF requests
+ * - Bytes 240-255: VF responses
+ */
+#define DLB_VF2PF_REQ_BYTES 240
+#define DLB_VF2PF_REQ_BASE 0
+#define DLB_VF2PF_REQ_BASE_WORD (DLB_VF2PF_REQ_BASE / 4)
+
+#define DLB_VF2PF_RESP_BYTES \
+ (DLB_FUNC_VF_VF2PF_MAILBOX_BYTES - DLB_VF2PF_REQ_BYTES)
+#define DLB_VF2PF_RESP_BASE DLB_VF2PF_REQ_BYTES
+#define DLB_VF2PF_RESP_BASE_WORD (DLB_VF2PF_RESP_BASE / 4)
+
+/* VF-initiated commands */
+enum dlb_mbox_cmd_type {
+ DLB_MBOX_CMD_REGISTER,
+ DLB_MBOX_CMD_UNREGISTER,
+ DLB_MBOX_CMD_GET_NUM_RESOURCES,
+ DLB_MBOX_CMD_CREATE_SCHED_DOMAIN,
+ DLB_MBOX_CMD_RESET_SCHED_DOMAIN,
+ DLB_MBOX_CMD_CREATE_LDB_POOL,
+ DLB_MBOX_CMD_CREATE_DIR_POOL,
+ DLB_MBOX_CMD_CREATE_LDB_QUEUE,
+ DLB_MBOX_CMD_CREATE_DIR_QUEUE,
+ DLB_MBOX_CMD_CREATE_LDB_PORT,
+ DLB_MBOX_CMD_CREATE_DIR_PORT,
+ DLB_MBOX_CMD_ENABLE_LDB_PORT,
+ DLB_MBOX_CMD_DISABLE_LDB_PORT,
+ DLB_MBOX_CMD_ENABLE_DIR_PORT,
+ DLB_MBOX_CMD_DISABLE_DIR_PORT,
+ DLB_MBOX_CMD_LDB_PORT_OWNED_BY_DOMAIN,
+ DLB_MBOX_CMD_DIR_PORT_OWNED_BY_DOMAIN,
+ DLB_MBOX_CMD_MAP_QID,
+ DLB_MBOX_CMD_UNMAP_QID,
+ DLB_MBOX_CMD_START_DOMAIN,
+ DLB_MBOX_CMD_ENABLE_LDB_PORT_INTR,
+ DLB_MBOX_CMD_ENABLE_DIR_PORT_INTR,
+ DLB_MBOX_CMD_ARM_CQ_INTR,
+ DLB_MBOX_CMD_GET_NUM_USED_RESOURCES,
+ DLB_MBOX_CMD_INIT_CQ_SCHED_COUNT,
+ DLB_MBOX_CMD_COLLECT_CQ_SCHED_COUNT,
+ DLB_MBOX_CMD_ACK_VF_FLR_DONE,
+ DLB_MBOX_CMD_GET_SN_ALLOCATION,
+ DLB_MBOX_CMD_GET_LDB_QUEUE_DEPTH,
+ DLB_MBOX_CMD_GET_DIR_QUEUE_DEPTH,
+ DLB_MBOX_CMD_PENDING_PORT_UNMAPS,
+ DLB_MBOX_CMD_QUERY_CQ_POLL_MODE,
+ DLB_MBOX_CMD_GET_SN_OCCUPANCY,
+
+	/* NUM_DLB_MBOX_CMD_TYPES must be last */
+ NUM_DLB_MBOX_CMD_TYPES,
+};
+
+static const char dlb_mbox_cmd_type_strings[][128] = {
+ "DLB_MBOX_CMD_REGISTER",
+ "DLB_MBOX_CMD_UNREGISTER",
+ "DLB_MBOX_CMD_GET_NUM_RESOURCES",
+ "DLB_MBOX_CMD_CREATE_SCHED_DOMAIN",
+ "DLB_MBOX_CMD_RESET_SCHED_DOMAIN",
+ "DLB_MBOX_CMD_CREATE_LDB_POOL",
+ "DLB_MBOX_CMD_CREATE_DIR_POOL",
+ "DLB_MBOX_CMD_CREATE_LDB_QUEUE",
+ "DLB_MBOX_CMD_CREATE_DIR_QUEUE",
+ "DLB_MBOX_CMD_CREATE_LDB_PORT",
+ "DLB_MBOX_CMD_CREATE_DIR_PORT",
+ "DLB_MBOX_CMD_ENABLE_LDB_PORT",
+ "DLB_MBOX_CMD_DISABLE_LDB_PORT",
+ "DLB_MBOX_CMD_ENABLE_DIR_PORT",
+ "DLB_MBOX_CMD_DISABLE_DIR_PORT",
+ "DLB_MBOX_CMD_LDB_PORT_OWNED_BY_DOMAIN",
+ "DLB_MBOX_CMD_DIR_PORT_OWNED_BY_DOMAIN",
+ "DLB_MBOX_CMD_MAP_QID",
+ "DLB_MBOX_CMD_UNMAP_QID",
+ "DLB_MBOX_CMD_START_DOMAIN",
+ "DLB_MBOX_CMD_ENABLE_LDB_PORT_INTR",
+ "DLB_MBOX_CMD_ENABLE_DIR_PORT_INTR",
+ "DLB_MBOX_CMD_ARM_CQ_INTR",
+ "DLB_MBOX_CMD_GET_NUM_USED_RESOURCES",
+ "DLB_MBOX_CMD_INIT_CQ_SCHED_COUNT",
+ "DLB_MBOX_CMD_COLLECT_CQ_SCHED_COUNT",
+ "DLB_MBOX_CMD_ACK_VF_FLR_DONE",
+ "DLB_MBOX_CMD_GET_SN_ALLOCATION",
+ "DLB_MBOX_CMD_GET_LDB_QUEUE_DEPTH",
+ "DLB_MBOX_CMD_GET_DIR_QUEUE_DEPTH",
+ "DLB_MBOX_CMD_PENDING_PORT_UNMAPS",
+ "DLB_MBOX_CMD_QUERY_CQ_POLL_MODE",
+ "DLB_MBOX_CMD_GET_SN_OCCUPANCY",
+};
+
+/* PF-initiated commands */
+enum dlb_mbox_vf_cmd_type {
+ DLB_MBOX_VF_CMD_DOMAIN_ALERT,
+ DLB_MBOX_VF_CMD_NOTIFICATION,
+ DLB_MBOX_VF_CMD_IN_USE,
+
+ /* NUM_DLB_MBOX_VF_CMD_TYPES must be last */
+ NUM_DLB_MBOX_VF_CMD_TYPES,
+};
+
+static const char dlb_mbox_vf_cmd_type_strings[][128] = {
+ "DLB_MBOX_VF_CMD_DOMAIN_ALERT",
+ "DLB_MBOX_VF_CMD_NOTIFICATION",
+ "DLB_MBOX_VF_CMD_IN_USE",
+};
+
+#define DLB_MBOX_CMD_TYPE(hdr) \
+ (((struct dlb_mbox_req_hdr *)hdr)->type)
+#define DLB_MBOX_CMD_STRING(hdr) \
+ dlb_mbox_cmd_type_strings[DLB_MBOX_CMD_TYPE(hdr)]
+
+enum dlb_mbox_status_type {
+ DLB_MBOX_ST_SUCCESS,
+ DLB_MBOX_ST_INVALID_CMD_TYPE,
+ DLB_MBOX_ST_VERSION_MISMATCH,
+ DLB_MBOX_ST_EXPECTED_PHASE_ONE,
+ DLB_MBOX_ST_EXPECTED_PHASE_TWO,
+ DLB_MBOX_ST_INVALID_OWNER_VF,
+};
+
+static const char dlb_mbox_status_type_strings[][128] = {
+ "DLB_MBOX_ST_SUCCESS",
+ "DLB_MBOX_ST_INVALID_CMD_TYPE",
+ "DLB_MBOX_ST_VERSION_MISMATCH",
+ "DLB_MBOX_ST_EXPECTED_PHASE_ONE",
+ "DLB_MBOX_ST_EXPECTED_PHASE_TWO",
+ "DLB_MBOX_ST_INVALID_OWNER_VF",
+};
+
+#define DLB_MBOX_ST_TYPE(hdr) \
+ (((struct dlb_mbox_resp_hdr *)hdr)->status)
+#define DLB_MBOX_ST_STRING(hdr) \
+ dlb_mbox_status_type_strings[DLB_MBOX_ST_TYPE(hdr)]
+
+/* This structure is always the first field in a request structure */
+struct dlb_mbox_req_hdr {
+ u32 type;
+};
+
+/* This structure is always the first field in a response structure */
+struct dlb_mbox_resp_hdr {
+ u32 status;
+};
+
+struct dlb_mbox_register_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u16 min_interface_version;
+ u16 max_interface_version;
+};
+
+struct dlb_mbox_register_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 interface_version;
+ u8 pf_id;
+ u8 vf_id;
+ u8 is_auxiliary_vf;
+ u8 primary_vf_id;
+ u32 padding;
+};
+
+struct dlb_mbox_unregister_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_unregister_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_get_num_resources_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_get_num_resources_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u16 num_sched_domains;
+ u16 num_ldb_queues;
+ u16 num_ldb_ports;
+ u16 num_dir_ports;
+ u16 padding0;
+ u8 num_ldb_credit_pools;
+ u8 num_dir_credit_pools;
+ u32 num_atomic_inflights;
+ u32 max_contiguous_atomic_inflights;
+ u32 num_hist_list_entries;
+ u32 max_contiguous_hist_list_entries;
+ u16 num_ldb_credits;
+ u16 max_contiguous_ldb_credits;
+ u16 num_dir_credits;
+ u16 max_contiguous_dir_credits;
+ u32 padding1;
+};
+
+struct dlb_mbox_create_sched_domain_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 num_ldb_queues;
+ u32 num_ldb_ports;
+ u32 num_dir_ports;
+ u32 num_atomic_inflights;
+ u32 num_hist_list_entries;
+ u32 num_ldb_credits;
+ u32 num_dir_credits;
+ u32 num_ldb_credit_pools;
+ u32 num_dir_credit_pools;
+};
+
+struct dlb_mbox_create_sched_domain_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_reset_sched_domain_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 id;
+};
+
+struct dlb_mbox_reset_sched_domain_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+};
+
+struct dlb_mbox_create_credit_pool_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 num_credits;
+ u32 padding;
+};
+
+struct dlb_mbox_create_credit_pool_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_create_ldb_queue_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 num_sequence_numbers;
+ u32 num_qid_inflights;
+ u32 num_atomic_inflights;
+ u32 padding;
+};
+
+struct dlb_mbox_create_ldb_queue_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_create_dir_queue_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding0;
+};
+
+struct dlb_mbox_create_dir_queue_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_create_ldb_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 ldb_credit_pool_id;
+ u32 dir_credit_pool_id;
+ u64 pop_count_address;
+ u16 ldb_credit_high_watermark;
+ u16 ldb_credit_low_watermark;
+ u16 ldb_credit_quantum;
+ u16 dir_credit_high_watermark;
+ u16 dir_credit_low_watermark;
+ u16 dir_credit_quantum;
+ u32 padding0;
+ u16 cq_depth;
+ u16 cq_history_list_size;
+ u32 padding1;
+ u64 cq_base_address;
+ u64 nq_base_address;
+ u32 nq_size;
+ u32 padding2;
+};
+
+struct dlb_mbox_create_ldb_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_create_dir_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 ldb_credit_pool_id;
+ u32 dir_credit_pool_id;
+ u64 pop_count_address;
+ u16 ldb_credit_high_watermark;
+ u16 ldb_credit_low_watermark;
+ u16 ldb_credit_quantum;
+ u16 dir_credit_high_watermark;
+ u16 dir_credit_low_watermark;
+ u16 dir_credit_quantum;
+ u16 cq_depth;
+ u16 padding0;
+ u64 cq_base_address;
+ s32 queue_id;
+ u32 padding1;
+};
+
+struct dlb_mbox_create_dir_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_enable_ldb_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_enable_ldb_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_disable_ldb_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_disable_ldb_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_enable_dir_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_enable_dir_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_disable_dir_port_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_disable_dir_port_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_ldb_port_owned_by_domain_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_ldb_port_owned_by_domain_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ s32 owned;
+};
+
+struct dlb_mbox_dir_port_owned_by_domain_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_dir_port_owned_by_domain_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ s32 owned;
+};
+
+struct dlb_mbox_map_qid_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 qid;
+ u32 priority;
+ u32 padding0;
+};
+
+struct dlb_mbox_map_qid_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 id;
+};
+
+struct dlb_mbox_unmap_qid_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 qid;
+};
+
+struct dlb_mbox_unmap_qid_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_start_domain_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+};
+
+struct dlb_mbox_start_domain_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_enable_ldb_port_intr_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u16 port_id;
+ u16 thresh;
+ u16 vector;
+ u16 owner_vf;
+ u16 reserved[2];
+};
+
+struct dlb_mbox_enable_ldb_port_intr_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding0;
+};
+
+struct dlb_mbox_enable_dir_port_intr_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u16 port_id;
+ u16 thresh;
+ u16 vector;
+ u16 owner_vf;
+ u16 reserved[2];
+};
+
+struct dlb_mbox_enable_dir_port_intr_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding0;
+};
+
+struct dlb_mbox_arm_cq_intr_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 is_ldb;
+};
+
+struct dlb_mbox_arm_cq_intr_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding0;
+};
+
+/* The alert_id and aux_alert_data follows the format of the alerts defined in
+ * dlb_types.h. The alert id contains an enum dlb_domain_alert_id value, and
+ * the aux_alert_data value varies depending on the alert.
+ */
+struct dlb_mbox_vf_alert_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 alert_id;
+ u32 aux_alert_data;
+};
+
+enum dlb_mbox_vf_notification_type {
+ DLB_MBOX_VF_NOTIFICATION_PRE_RESET,
+ DLB_MBOX_VF_NOTIFICATION_POST_RESET,
+
+ /* NUM_DLB_MBOX_VF_NOTIFICATION_TYPES must be last */
+ NUM_DLB_MBOX_VF_NOTIFICATION_TYPES,
+};
+
+struct dlb_mbox_vf_notification_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 notification;
+};
+
+struct dlb_mbox_vf_in_use_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_vf_in_use_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 in_use;
+};
+
+struct dlb_mbox_ack_vf_flr_done_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_ack_vf_flr_done_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 padding;
+};
+
+struct dlb_mbox_get_sn_allocation_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 group_id;
+};
+
+struct dlb_mbox_get_sn_allocation_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 num;
+};
+
+struct dlb_mbox_get_ldb_queue_depth_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 queue_id;
+ u32 padding;
+};
+
+struct dlb_mbox_get_ldb_queue_depth_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 depth;
+};
+
+struct dlb_mbox_get_dir_queue_depth_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 queue_id;
+ u32 padding;
+};
+
+struct dlb_mbox_get_dir_queue_depth_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 depth;
+};
+
+struct dlb_mbox_pending_port_unmaps_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 domain_id;
+ u32 port_id;
+ u32 padding;
+};
+
+struct dlb_mbox_pending_port_unmaps_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 num;
+};
+
+struct dlb_mbox_query_cq_poll_mode_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 padding;
+};
+
+struct dlb_mbox_query_cq_poll_mode_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 error_code;
+ u32 status;
+ u32 mode;
+};
+
+struct dlb_mbox_get_sn_occupancy_cmd_req {
+ struct dlb_mbox_req_hdr hdr;
+ u32 group_id;
+};
+
+struct dlb_mbox_get_sn_occupancy_cmd_resp {
+ struct dlb_mbox_resp_hdr hdr;
+ u32 num;
+};
+
+#endif /* __DLB_BASE_DLB_MBOX_H */
diff --git a/drivers/event/dlb/pf/base/dlb_osdep.h b/drivers/event/dlb/pf/base/dlb_osdep.h
new file mode 100644
index 0000000..36b0995
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_osdep.h
@@ -0,0 +1,347 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_OSDEP_H__
+#define __DLB_OSDEP_H__
+
+#include <string.h>
+#include <time.h>
+#include <unistd.h>
+#include <pthread.h>
+#include <rte_string_fns.h>
+#include <rte_cycles.h>
+#include <rte_io.h>
+#include <rte_log.h>
+#include <rte_spinlock.h>
+#include "../dlb_main.h"
+#include "dlb_resource.h"
+#include "../../dlb_log.h"
+#include "../../dlb_user.h"
+
+
+#define DLB_PCI_REG_READ(reg) rte_read32((void *)(reg))
+#define DLB_PCI_REG_WRITE(reg, val) rte_write32(val, (void *)(reg))
+
+#define DLB_CSR_REG_ADDR(a, reg) ((void *)((uintptr_t)(a)->csr_kva + (reg)))
+#define DLB_CSR_RD(hw, reg) \
+ DLB_PCI_REG_READ(DLB_CSR_REG_ADDR((hw), (reg)))
+#define DLB_CSR_WR(hw, reg, val) \
+ DLB_PCI_REG_WRITE(DLB_CSR_REG_ADDR((hw), (reg)), (val))
+
+#define DLB_FUNC_REG_ADDR(a, reg) ((void *)((uintptr_t)(a)->func_kva + (reg)))
+#define DLB_FUNC_RD(hw, reg) \
+ DLB_PCI_REG_READ(DLB_FUNC_REG_ADDR((hw), (reg)))
+#define DLB_FUNC_WR(hw, reg, val) \
+ DLB_PCI_REG_WRITE(DLB_FUNC_REG_ADDR((hw), (reg)), (val))
+
+#define READ_ONCE(x) (x)
+#define WRITE_ONCE(x, y) ((x) = (y))
+
+#define OS_READ_ONCE(x) READ_ONCE(x)
+#define OS_WRITE_ONCE(x, y) WRITE_ONCE(x, y)
+
+
+extern unsigned int dlb_unregister_timeout_s;
+/**
+ * os_queue_unregister_timeout_s() - timeout (in seconds) to wait for queue
+ * unregister acknowledgments.
+ */
+static inline unsigned int os_queue_unregister_timeout_s(void)
+{
+ return dlb_unregister_timeout_s;
+}
+
+static inline size_t os_strlcpy(char *dst, const char *src, size_t sz)
+{
+ return rte_strlcpy(dst, src, sz);
+}
+
+/**
+ * os_udelay() - busy-wait for a number of microseconds
+ * @usecs: delay duration.
+ */
+static inline void os_udelay(int usecs)
+{
+ rte_delay_us(usecs);
+}
+
+/**
+ * os_msleep() - sleep for a number of milliseconds
+ * @msecs: delay duration.
+ */
+static inline void os_msleep(int msecs)
+{
+ rte_delay_ms(msecs);
+}
+
+/**
+ * os_curtime_s() - get the current time (in seconds)
+ */
+static inline unsigned long os_curtime_s(void)
+{
+ struct timespec tv;
+
+ clock_gettime(CLOCK_MONOTONIC, &tv);
+
+ return (unsigned long)tv.tv_sec;
+}
+
+#define DLB_PP_BASE(__is_ldb) ((__is_ldb) ? DLB_LDB_PP_BASE : DLB_DIR_PP_BASE)
+/**
+ * os_map_producer_port() - map a producer port into the caller's address space
+ * @hw: dlb_hw handle for a particular device.
+ * @port_id: port ID
+ * @is_ldb: true for load-balanced port, false for a directed port
+ *
+ * This function maps the requested producer port memory into the caller's
+ * address space.
+ *
+ * Return:
+ * Returns the base address at which the PP memory was mapped, else NULL.
+ */
+static inline void *os_map_producer_port(struct dlb_hw *hw,
+ u8 port_id,
+ bool is_ldb)
+{
+ uint64_t addr;
+ uint64_t pp_dma_base;
+
+
+ pp_dma_base = (uintptr_t)hw->func_kva + DLB_PP_BASE(is_ldb);
+ addr = (pp_dma_base + (PAGE_SIZE * port_id));
+
+ return (void *)(uintptr_t)addr;
+
+}
+/**
+ * os_unmap_producer_port() - unmap a producer port
+ * @hw: dlb_hw handle for a particular device.
+ * @addr: mapped producer port address
+ *
+ * This function undoes os_map_producer_port() by unmapping the producer port
+ * memory from the caller's address space.
+ */
+
+/* PFPMD - Nothing to do here, since memory was not actually mapped by us */
+static inline void os_unmap_producer_port(struct dlb_hw *hw, void *addr)
+{
+ RTE_SET_USED(hw);
+ RTE_SET_USED(addr);
+}
+/**
+ * os_enqueue_four_hcws() - enqueue four HCWs to DLB
+ * @hw: dlb_hw handle for a particular device.
+ * @hcw: pointer to the 64B-aligned contiguous HCW memory
+ * @addr: producer port address
+ */
+static inline void os_enqueue_four_hcws(struct dlb_hw *hw,
+ struct dlb_hcw *hcw,
+ void *addr)
+{
+ struct dlb_dev *dlb_dev;
+
+ dlb_dev = container_of(hw, struct dlb_dev, hw);
+
+ dlb_dev->enqueue_four(hcw, addr);
+}
+
+/**
+ * os_fence_hcw() - fence an HCW to ensure it arrives at the device
+ * @hw: dlb_hw handle for a particular device.
+ * @pp_addr: producer port address
+ */
+static inline void os_fence_hcw(struct dlb_hw *hw, u64 *pp_addr)
+{
+ RTE_SET_USED(hw);
+
+ /* To ensure outstanding HCWs reach the device, read the PP address. IA
+ * memory ordering prevents reads from passing older writes, and the
+ * mfence also ensures this.
+ */
+ rte_mb();
+
+ *(volatile u64 *)pp_addr;
+}
+
+/* Map to PMDs logging interface */
+#define DLB_ERR(dev, fmt, args...) \
+ DLB_LOG_ERR(fmt, ## args)
+
+#define DLB_INFO(dev, fmt, args...) \
+ DLB_LOG_INFO(fmt, ## args)
+
+#define DLB_DEBUG(dev, fmt, args...) \
+ DLB_LOG_DEBUG(fmt, ## args)
+
+/**
+ * DLB_HW_ERR() - log an error message
+ * @dlb: dlb_hw handle for a particular device.
+ * @...: variable string args.
+ */
+#define DLB_HW_ERR(dlb, ...) do { \
+ RTE_SET_USED(dlb); \
+ DLB_ERR(dlb, __VA_ARGS__); \
+} while (0)
+
+/**
+ * DLB_HW_INFO() - log an info message
+ * @dlb: dlb_hw handle for a particular device.
+ * @...: variable string args.
+ */
+#define DLB_HW_INFO(dlb, ...) do { \
+ RTE_SET_USED(dlb); \
+ DLB_INFO(dlb, __VA_ARGS__); \
+} while (0)
+
+/*** scheduling functions ***/
+
+/* The callback runs until it completes all outstanding QID->CQ
+ * map and unmap requests. To prevent deadlock, this function gives other
+ * threads a chance to grab the resource mutex and configure hardware.
+ */
+static void *dlb_complete_queue_map_unmap(void *__args)
+{
+ struct dlb_dev *dlb_dev = (struct dlb_dev *)__args;
+ int ret;
+
+ while (1) {
+ rte_spinlock_lock(&dlb_dev->resource_mutex);
+
+ ret = dlb_finish_unmap_qid_procedures(&dlb_dev->hw);
+ ret += dlb_finish_map_qid_procedures(&dlb_dev->hw);
+
+ if (ret != 0) {
+ rte_spinlock_unlock(&dlb_dev->resource_mutex);
+ /* Relinquish the CPU so the application can process
+ * its CQs, so this function doesn't deadlock.
+ */
+ sched_yield();
+ } else
+ break;
+ }
+
+ dlb_dev->worker_launched = false;
+
+ rte_spinlock_unlock(&dlb_dev->resource_mutex);
+
+ return NULL;
+}
+
+
+/**
+ * os_schedule_work() - launch a thread to process pending map and unmap work
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function launches a thread that will run until all pending
+ * map and unmap procedures are complete.
+ */
+static inline void os_schedule_work(struct dlb_hw *hw)
+{
+ struct dlb_dev *dlb_dev;
+ pthread_t complete_queue_map_unmap_thread;
+ int ret;
+
+ dlb_dev = container_of(hw, struct dlb_dev, hw);
+
+ ret = pthread_create(&complete_queue_map_unmap_thread,
+ NULL,
+ dlb_complete_queue_map_unmap,
+ dlb_dev);
+ if (ret)
+ DLB_ERR(dlb_dev,
+			"Could not create queue complete map/unmap thread, err=%d\n",
+ ret);
+ else
+ dlb_dev->worker_launched = true;
+}
+
+/**
+ * os_worker_active() - query whether the map/unmap worker thread is active
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function returns a boolean indicating whether a thread (launched by
+ * os_schedule_work()) is active. This function is used to determine
+ * whether or not to launch a worker thread.
+ */
+static inline bool os_worker_active(struct dlb_hw *hw)
+{
+ struct dlb_dev *dlb_dev;
+
+ dlb_dev = container_of(hw, struct dlb_dev, hw);
+
+ return dlb_dev->worker_launched;
+}
+
+/**
+ * os_notify_user_space() - notify user space
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: ID of domain to notify.
+ * @alert_id: alert ID.
+ * @aux_alert_data: additional alert data.
+ *
+ * This function notifies user space of an alert (such as a remote queue
+ * unregister or hardware alarm).
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ */
+static inline int os_notify_user_space(struct dlb_hw *hw,
+ u32 domain_id,
+ u64 alert_id,
+ u64 aux_alert_data)
+{
+ RTE_SET_USED(hw);
+ RTE_SET_USED(domain_id);
+ RTE_SET_USED(alert_id);
+ RTE_SET_USED(aux_alert_data);
+
+ rte_panic("internal_error: %s should never be called for DLB PF PMD\n",
+ __func__);
+ return -1;
+}
+
+enum dlb_dev_revision {
+ DLB_A0,
+ DLB_A1,
+ DLB_A2,
+ DLB_A3,
+ DLB_B0,
+};
+
+#include <cpuid.h>
+
+/**
+ * os_get_dev_revision() - query the device_revision
+ * @hw: dlb_hw handle for a particular device.
+ */
+static inline enum dlb_dev_revision os_get_dev_revision(struct dlb_hw *hw)
+{
+ uint32_t a, b, c, d, stepping;
+
+ RTE_SET_USED(hw);
+
+ __cpuid(0x1, a, b, c, d);
+
+ stepping = a & 0xf;
+
+ switch (stepping) {
+ case 0:
+ return DLB_A0;
+ case 1:
+ return DLB_A1;
+ case 2:
+ return DLB_A2;
+ case 3:
+ return DLB_A3;
+ default:
+ /* Treat all revisions >= 4 as B0 */
+ return DLB_B0;
+ }
+}
+
+#endif /* __DLB_OSDEP_H__ */
diff --git a/drivers/event/dlb/pf/base/dlb_osdep_bitmap.h b/drivers/event/dlb/pf/base/dlb_osdep_bitmap.h
new file mode 100644
index 0000000..8df1d59
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_osdep_bitmap.h
@@ -0,0 +1,442 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_OSDEP_BITMAP_H__
+#define __DLB_OSDEP_BITMAP_H__
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <rte_bitmap.h>
+#include <rte_string_fns.h>
+#include <rte_malloc.h>
+#include <rte_errno.h>
+#include "../dlb_main.h"
+
+
+/*************************/
+/*** Bitmap operations ***/
+/*************************/
+struct dlb_bitmap {
+ struct rte_bitmap *map;
+ unsigned int len;
+ struct dlb_hw *hw;
+};
+
+/**
+ * dlb_bitmap_alloc() - alloc a bitmap data structure
+ * @bitmap: pointer to dlb_bitmap structure pointer.
+ * @len: number of entries in the bitmap.
+ *
+ * This function allocates a bitmap and initializes it with length @len. All
+ * entries are initially zero.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or len is 0.
+ * ENOMEM - could not allocate memory for the bitmap data structure.
+ */
+static inline int dlb_bitmap_alloc(struct dlb_hw *hw,
+ struct dlb_bitmap **bitmap,
+ unsigned int len)
+{
+ struct dlb_bitmap *bm;
+ void *mem;
+ uint32_t alloc_size;
+ uint32_t nbits = (uint32_t) len;
+ RTE_SET_USED(hw);
+
+ if (!bitmap || nbits == 0)
+ return -EINVAL;
+
+ /* Allocate DLB bitmap control struct */
+ bm = rte_malloc("DLB_PF",
+ sizeof(struct dlb_bitmap),
+ RTE_CACHE_LINE_SIZE);
+
+ if (!bm)
+ return -ENOMEM;
+
+ /* Allocate bitmap memory */
+ alloc_size = rte_bitmap_get_memory_footprint(nbits);
+ mem = rte_malloc("DLB_PF_BITMAP", alloc_size, RTE_CACHE_LINE_SIZE);
+ if (!mem) {
+ rte_free(bm);
+ return -ENOMEM;
+ }
+
+ bm->map = rte_bitmap_init(len, mem, alloc_size);
+ if (!bm->map) {
+ rte_free(mem);
+ rte_free(bm);
+ return -ENOMEM;
+ }
+
+ bm->len = len;
+
+ *bitmap = bm;
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_free() - free a previously allocated bitmap data structure
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * This function frees a bitmap that was allocated with dlb_bitmap_alloc().
+ */
+static inline void dlb_bitmap_free(struct dlb_bitmap *bitmap)
+{
+ if (!bitmap)
+ rte_panic("NULL dlb_bitmap in %s\n", __func__);
+
+ rte_free(bitmap->map);
+ rte_free(bitmap);
+}
+
+/**
+ * dlb_bitmap_fill() - fill a bitmap with all 1s
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * This function sets all bitmap values to 1.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_fill(struct dlb_bitmap *bitmap)
+{
+ unsigned int i;
+
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ for (i = 0; i != bitmap->len; i++)
+ rte_bitmap_set(bitmap->map, i);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_zero() - fill a bitmap with all 0s
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * This function sets all bitmap values to 0.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_zero(struct dlb_bitmap *bitmap)
+{
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ rte_bitmap_reset(bitmap->map);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_set() - set a bitmap entry
+ * @bitmap: pointer to dlb_bitmap structure.
+ * @bit: bit index.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or bit is larger than the
+ * bitmap length.
+ */
+static inline int dlb_bitmap_set(struct dlb_bitmap *bitmap,
+ unsigned int bit)
+{
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ if (bitmap->len <= bit)
+ return -EINVAL;
+
+ rte_bitmap_set(bitmap->map, bit);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_set_range() - set a range of bitmap entries
+ * @bitmap: pointer to dlb_bitmap structure.
+ * @bit: starting bit index.
+ * @len: length of the range.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or the range exceeds the bitmap
+ * length.
+ */
+static inline int dlb_bitmap_set_range(struct dlb_bitmap *bitmap,
+ unsigned int bit,
+ unsigned int len)
+{
+ unsigned int i;
+
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ if (bitmap->len <= bit)
+ return -EINVAL;
+
+ for (i = 0; i != len; i++)
+ rte_bitmap_set(bitmap->map, bit + i);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_clear() - clear a bitmap entry
+ * @bitmap: pointer to dlb_bitmap structure.
+ * @bit: bit index.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or bit is larger than the
+ * bitmap length.
+ */
+static inline int dlb_bitmap_clear(struct dlb_bitmap *bitmap,
+ unsigned int bit)
+{
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ if (bitmap->len <= bit)
+ return -EINVAL;
+
+ rte_bitmap_clear(bitmap->map, bit);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_clear_range() - clear a range of bitmap entries
+ * @bitmap: pointer to dlb_bitmap structure.
+ * @bit: starting bit index.
+ * @len: length of the range.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized, or the range exceeds the bitmap
+ * length.
+ */
+static inline int dlb_bitmap_clear_range(struct dlb_bitmap *bitmap,
+ unsigned int bit,
+ unsigned int len)
+{
+ unsigned int i;
+
+ if (!bitmap || !bitmap->map)
+ return -EINVAL;
+
+ if (bitmap->len <= bit)
+ return -EINVAL;
+
+ for (i = 0; i != len; i++)
+ rte_bitmap_clear(bitmap->map, bit + i);
+
+ return 0;
+}
+
+/**
+ * dlb_bitmap_find_set_bit_range() - find a range of set bits
+ * @bitmap: pointer to dlb_bitmap structure.
+ * @len: length of the range.
+ *
+ * This function looks for a range of set bits of length @len.
+ *
+ * Return:
+ * Returns the base bit index upon success, < 0 otherwise.
+ *
+ * Errors:
+ * ENOENT - unable to find a length *len* range of set bits.
+ * EINVAL - bitmap is NULL or is uninitialized, or len is invalid.
+ */
+static inline int dlb_bitmap_find_set_bit_range(struct dlb_bitmap *bitmap,
+ unsigned int len)
+{
+ unsigned int i, j = 0;
+
+ if (!bitmap || !bitmap->map || len == 0)
+ return -EINVAL;
+
+ if (bitmap->len < len)
+ return -ENOENT;
+
+ for (i = 0; i != bitmap->len; i++) {
+ if (rte_bitmap_get(bitmap->map, i)) {
+ if (++j == len)
+ return i - j + 1;
+ } else
+ j = 0;
+ }
+
+ /* No set bit range of length len? */
+ return -ENOENT;
+}
+
+/**
+ * dlb_bitmap_find_set_bit() - find the first set bit
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * This function looks for a single set bit.
+ *
+ * Return:
+ * Returns the base bit index upon success, < 0 otherwise.
+ *
+ * Errors:
+ * ENOENT - the bitmap contains no set bits.
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_find_set_bit(struct dlb_bitmap *bitmap)
+{
+ unsigned int i;
+
+ if (!bitmap)
+ return -EINVAL;
+
+ if (!bitmap->map)
+ return -EINVAL;
+
+ for (i = 0; i != bitmap->len; i++) {
+ if (rte_bitmap_get(bitmap->map, i))
+ return i;
+ }
+
+ return -ENOENT;
+}
+
+/**
+ * dlb_bitmap_count() - returns the number of set bits
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * This function counts the bitmap's set bits.
+ *
+ * Return:
+ * Returns the number of set bits upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_count(struct dlb_bitmap *bitmap)
+{
+ int weight = 0;
+ unsigned int i;
+
+ if (!bitmap)
+ return -EINVAL;
+
+ if (!bitmap->map)
+ return -EINVAL;
+
+ for (i = 0; i != bitmap->len; i++) {
+ if (rte_bitmap_get(bitmap->map, i))
+ weight++;
+ }
+ return weight;
+}
+
+/**
+ * dlb_bitmap_longest_set_range() - returns longest contiguous range of set bits
+ * @bitmap: pointer to dlb_bitmap structure.
+ *
+ * Return:
+ * Returns the bitmap's longest contiguous range of set bits upon success,
+ * <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - bitmap is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_longest_set_range(struct dlb_bitmap *bitmap)
+{
+ int max_len = 0, len = 0;
+ unsigned int i;
+
+ if (!bitmap)
+ return -EINVAL;
+
+ if (!bitmap->map)
+ return -EINVAL;
+
+ for (i = 0; i != bitmap->len; i++) {
+ if (rte_bitmap_get(bitmap->map, i)) {
+ len++;
+ } else {
+ if (len > max_len)
+ max_len = len;
+ len = 0;
+ }
+ }
+
+ if (len > max_len)
+ max_len = len;
+
+ return max_len;
+}
+
+/**
+ * dlb_bitmap_or() - store the logical 'or' of two bitmaps into a third
+ * @dest: pointer to dlb_bitmap structure, which will contain the results of
+ * the 'or' of src1 and src2.
+ * @src1: pointer to dlb_bitmap structure, will be 'or'ed with src2.
+ * @src2: pointer to dlb_bitmap structure, will be 'or'ed with src1.
+ *
+ * This function 'or's two bitmaps together and stores the result in a third
+ * bitmap. The source and destination bitmaps can be the same.
+ *
+ * Return:
+ * Returns the number of set bits upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - One of the bitmaps is NULL or is uninitialized.
+ */
+static inline int dlb_bitmap_or(struct dlb_bitmap *dest,
+ struct dlb_bitmap *src1,
+ struct dlb_bitmap *src2)
+{
+ unsigned int i, min;
+ int numset = 0;
+
+ if (!dest || !dest->map ||
+ !src1 || !src1->map ||
+ !src2 || !src2->map)
+ return -EINVAL;
+
+ min = dest->len;
+ min = (min > src1->len) ? src1->len : min;
+ min = (min > src2->len) ? src2->len : min;
+
+ for (i = 0; i != min; i++) {
+ if (rte_bitmap_get(src1->map, i) ||
+ rte_bitmap_get(src2->map, i)) {
+ rte_bitmap_set(dest->map, i);
+ numset++;
+ } else {
+ rte_bitmap_clear(dest->map, i);
+ }
+ }
+
+ return numset;
+}
+
+#endif /* __DLB_OSDEP_BITMAP_H__ */
diff --git a/drivers/event/dlb/pf/base/dlb_osdep_list.h b/drivers/event/dlb/pf/base/dlb_osdep_list.h
new file mode 100644
index 0000000..a53b362
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_osdep_list.h
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_OSDEP_LIST_H__
+#define __DLB_OSDEP_LIST_H__
+
+#include <rte_tailq.h>
+
+struct dlb_list_entry {
+ TAILQ_ENTRY(dlb_list_entry) node;
+};
+
+/* Dummy - just a struct definition */
+TAILQ_HEAD(dlb_list_head, dlb_list_entry);
+
+/* =================
+ * TAILQ Supplements
+ * =================
+ */
+
+#ifndef TAILQ_FOREACH_ENTRY
+#define TAILQ_FOREACH_ENTRY(ptr, head, name, iter) \
+ for ((iter) = TAILQ_FIRST(&head); \
+ (iter) \
+ && (ptr = container_of(iter, typeof(*(ptr)), name)); \
+ (iter) = TAILQ_NEXT((iter), node))
+#endif
+
+#ifndef TAILQ_FOREACH_ENTRY_SAFE
+#define TAILQ_FOREACH_ENTRY_SAFE(ptr, head, name, iter, tvar) \
+ for ((iter) = TAILQ_FIRST(&head); \
+ (iter) && \
+ (ptr = container_of(iter, typeof(*(ptr)), name)) &&\
+ ((tvar) = TAILQ_NEXT((iter), node), 1); \
+ (iter) = (tvar))
+#endif
+
+/* =========
+ * DLB Lists
+ * =========
+ */
+
+/**
+ * dlb_list_init_head() - initialize the head of a list
+ * @head: list head
+ */
+static inline void dlb_list_init_head(struct dlb_list_head *head)
+{
+ TAILQ_INIT(head);
+}
+
+/**
+ * dlb_list_add() - add an entry to a list
+ * @head: new entry will be added after this list header
+ * @entry: new list entry to be added
+ */
+static inline void dlb_list_add(struct dlb_list_head *head,
+ struct dlb_list_entry *entry)
+{
+ TAILQ_INSERT_TAIL(head, entry, node);
+}
+
+/**
+ * dlb_list_del() - delete an entry from a list
+ * @head: list head
+ * @entry: list entry to be deleted
+ */
+static inline void dlb_list_del(struct dlb_list_head *head,
+ struct dlb_list_entry *entry)
+{
+ TAILQ_REMOVE(head, entry, node);
+}
+
+/**
+ * dlb_list_empty() - check if a list is empty
+ * @head: list head
+ *
+ * Return:
+ * Returns true if empty, false if not.
+ */
+static inline bool dlb_list_empty(struct dlb_list_head *head)
+{
+ return TAILQ_EMPTY(head);
+}
+
+/**
+ * dlb_list_splice() - splice a list at the tail of another list
+ * @src_head: list to be added
+ * @head: where src_head will be inserted
+ */
+static inline void dlb_list_splice(struct dlb_list_head *src_head,
+ struct dlb_list_head *head)
+{
+ TAILQ_CONCAT(head, src_head, node);
+}
+
+/**
+ * DLB_LIST_HEAD() - retrieve the head of the list
+ * @head: list head
+ * @type: type of the struct containing a struct dlb_list_entry
+ * @name: name of the dlb_list within the struct
+ */
+#define DLB_LIST_HEAD(head, type, name) \
+ (TAILQ_FIRST(&head) ? \
+ container_of(TAILQ_FIRST(&head), type, name) : \
+ NULL)
+
+/**
+ * DLB_LIST_FOR_EACH() - iterate over a list
+ * @head: list head
+ * @ptr: pointer to struct containing a struct dlb_list_entry
+ * @name: name of the dlb_list_entry field within the containing struct
+ * @tmp_iter: iterator variable
+ */
+#define DLB_LIST_FOR_EACH(head, ptr, name, tmp_iter) \
+ TAILQ_FOREACH_ENTRY(ptr, head, name, tmp_iter)
+
+/**
+ * DLB_LIST_FOR_EACH_SAFE() - iterate over a list. This loop works even if
+ * an element is removed from the list while processing it.
+ * @head: list head
+ * @ptr: pointer to struct containing a struct dlb_list_entry
+ * @ptr_tmp: pointer to struct containing a struct dlb_list_entry (temporary)
+ * @name: name of the dlb_list_entry field within the containing struct
+ * @tmp_iter: iterator variable
+ * @saf_iter: iterator variable (temporary)
+ */
+#define DLB_LIST_FOR_EACH_SAFE(head, ptr, ptr_tmp, name, tmp_iter, saf_iter) \
+ TAILQ_FOREACH_ENTRY_SAFE(ptr, head, name, tmp_iter, saf_iter)
+
+#endif /* __DLB_OSDEP_LIST_H__ */
diff --git a/drivers/event/dlb/pf/base/dlb_osdep_types.h b/drivers/event/dlb/pf/base/dlb_osdep_types.h
new file mode 100644
index 0000000..2e9d7d8
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_osdep_types.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_OSDEP_TYPES_H
+#define __DLB_OSDEP_TYPES_H
+
+#include <linux/types.h>
+
+#include <inttypes.h>
+#include <ctype.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <string.h>
+#include <unistd.h>
+#include <errno.h>
+
+/* Types for user mode PF PMD */
+typedef uint8_t u8;
+typedef int8_t s8;
+typedef uint16_t u16;
+typedef int16_t s16;
+typedef uint32_t u32;
+typedef int32_t s32;
+typedef uint64_t u64;
+
+#define __iomem
+
+/* END types for user mode PF PMD */
+
+#endif /* __DLB_OSDEP_TYPES_H */
diff --git a/drivers/event/dlb/pf/base/dlb_regs.h b/drivers/event/dlb/pf/base/dlb_regs.h
new file mode 100644
index 0000000..fce5c0b
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_regs.h
@@ -0,0 +1,2678 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_REGS_H
+#define __DLB_REGS_H
+
+#include "dlb_osdep_types.h"
+
+#define DLB_FUNC_PF_VF2PF_MAILBOX_BYTES 256
+#define DLB_FUNC_PF_VF2PF_MAILBOX(vf_id, x) \
+ (0x1000 + 0x4 * (x) + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_VF2PF_MAILBOX_RST 0x0
+union dlb_func_pf_vf2pf_mailbox {
+ struct {
+ u32 msg : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_VF2PF_MAILBOX_ISR(vf_id) \
+ (0x1f00 + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_VF2PF_MAILBOX_ISR_RST 0x0
+union dlb_func_pf_vf2pf_mailbox_isr {
+ struct {
+ u32 vf_isr : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_VF2PF_FLR_ISR(vf_id) \
+ (0x1f04 + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_VF2PF_FLR_ISR_RST 0x0
+union dlb_func_pf_vf2pf_flr_isr {
+ struct {
+ u32 vf_isr : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_VF2PF_ISR_PEND(vf_id) \
+ (0x1f10 + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_VF2PF_ISR_PEND_RST 0x0
+union dlb_func_pf_vf2pf_isr_pend {
+ struct {
+ u32 isr_pend : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_PF2VF_MAILBOX_BYTES 64
+#define DLB_FUNC_PF_PF2VF_MAILBOX(vf_id, x) \
+ (0x2000 + 0x4 * (x) + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_PF2VF_MAILBOX_RST 0x0
+union dlb_func_pf_pf2vf_mailbox {
+ struct {
+ u32 msg : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_PF2VF_MAILBOX_ISR(vf_id) \
+ (0x2f00 + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_PF2VF_MAILBOX_ISR_RST 0x0
+union dlb_func_pf_pf2vf_mailbox_isr {
+ struct {
+ u32 isr : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_PF_VF_RESET_IN_PROGRESS(vf_id) \
+ (0x3000 + (vf_id) * 0x10000)
+#define DLB_FUNC_PF_VF_RESET_IN_PROGRESS_RST 0xffff
+union dlb_func_pf_vf_reset_in_progress {
+ struct {
+ u32 reset_in_progress : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_MSIX_MEM_VECTOR_CTRL(x) \
+ (0x100000c + (x) * 0x10)
+#define DLB_MSIX_MEM_VECTOR_CTRL_RST 0x1
+union dlb_msix_mem_vector_ctrl {
+ struct {
+ u32 vec_mask : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_TOTAL_VAS 0x124
+#define DLB_SYS_TOTAL_VAS_RST 0x20
+union dlb_sys_total_vas {
+ struct {
+ u32 total_vas : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_PF_SYND2 0x508
+#define DLB_SYS_ALARM_PF_SYND2_RST 0x0
+union dlb_sys_alarm_pf_synd2 {
+ struct {
+ u32 lock_id : 16;
+ u32 meas : 1;
+ u32 debug : 7;
+ u32 cq_pop : 1;
+ u32 qe_uhl : 1;
+ u32 qe_orsp : 1;
+ u32 qe_valid : 1;
+ u32 cq_int_rearm : 1;
+ u32 dsi_error : 1;
+ u32 rsvd0 : 2;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_PF_SYND1 0x504
+#define DLB_SYS_ALARM_PF_SYND1_RST 0x0
+union dlb_sys_alarm_pf_synd1 {
+ struct {
+ u32 dsi : 16;
+ u32 qid : 8;
+ u32 qtype : 2;
+ u32 qpri : 3;
+ u32 msg_type : 3;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_PF_SYND0 0x500
+#define DLB_SYS_ALARM_PF_SYND0_RST 0x0
+union dlb_sys_alarm_pf_synd0 {
+ struct {
+ u32 syndrome : 8;
+ u32 rtype : 2;
+ u32 rsvd0 : 2;
+ u32 from_dmv : 1;
+ u32 is_ldb : 1;
+ u32 cls : 2;
+ u32 aid : 6;
+ u32 unit : 4;
+ u32 source : 4;
+ u32 more : 1;
+ u32 valid : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_LDB_VPP_V(x) \
+ (0xf00 + (x) * 0x1000)
+#define DLB_SYS_VF_LDB_VPP_V_RST 0x0
+union dlb_sys_vf_ldb_vpp_v {
+ struct {
+ u32 vpp_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_LDB_VPP2PP(x) \
+ (0xf08 + (x) * 0x1000)
+#define DLB_SYS_VF_LDB_VPP2PP_RST 0x0
+union dlb_sys_vf_ldb_vpp2pp {
+ struct {
+ u32 pp : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_DIR_VPP_V(x) \
+ (0xf10 + (x) * 0x1000)
+#define DLB_SYS_VF_DIR_VPP_V_RST 0x0
+union dlb_sys_vf_dir_vpp_v {
+ struct {
+ u32 vpp_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_DIR_VPP2PP(x) \
+ (0xf18 + (x) * 0x1000)
+#define DLB_SYS_VF_DIR_VPP2PP_RST 0x0
+union dlb_sys_vf_dir_vpp2pp {
+ struct {
+ u32 pp : 7;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_LDB_VQID_V(x) \
+ (0xf20 + (x) * 0x1000)
+#define DLB_SYS_VF_LDB_VQID_V_RST 0x0
+union dlb_sys_vf_ldb_vqid_v {
+ struct {
+ u32 vqid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_LDB_VQID2QID(x) \
+ (0xf28 + (x) * 0x1000)
+#define DLB_SYS_VF_LDB_VQID2QID_RST 0x0
+union dlb_sys_vf_ldb_vqid2qid {
+ struct {
+ u32 qid : 7;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_QID2VQID(x) \
+ (0xf2c + (x) * 0x1000)
+#define DLB_SYS_LDB_QID2VQID_RST 0x0
+union dlb_sys_ldb_qid2vqid {
+ struct {
+ u32 vqid : 7;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_DIR_VQID_V(x) \
+ (0xf30 + (x) * 0x1000)
+#define DLB_SYS_VF_DIR_VQID_V_RST 0x0
+union dlb_sys_vf_dir_vqid_v {
+ struct {
+ u32 vqid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_VF_DIR_VQID2QID(x) \
+ (0xf38 + (x) * 0x1000)
+#define DLB_SYS_VF_DIR_VQID2QID_RST 0x0
+union dlb_sys_vf_dir_vqid2qid {
+ struct {
+ u32 qid : 7;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_VASQID_V(x) \
+ (0xf60 + (x) * 0x1000)
+#define DLB_SYS_LDB_VASQID_V_RST 0x0
+union dlb_sys_ldb_vasqid_v {
+ struct {
+ u32 vasqid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_VASQID_V(x) \
+ (0xf68 + (x) * 0x1000)
+#define DLB_SYS_DIR_VASQID_V_RST 0x0
+union dlb_sys_dir_vasqid_v {
+ struct {
+ u32 vasqid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_WBUF_DIR_FLAGS(x) \
+ (0xf70 + (x) * 0x1000)
+#define DLB_SYS_WBUF_DIR_FLAGS_RST 0x0
+union dlb_sys_wbuf_dir_flags {
+ struct {
+ u32 wb_v : 4;
+ u32 cl : 1;
+ u32 busy : 1;
+ u32 opt : 1;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_WBUF_LDB_FLAGS(x) \
+ (0xf78 + (x) * 0x1000)
+#define DLB_SYS_WBUF_LDB_FLAGS_RST 0x0
+union dlb_sys_wbuf_ldb_flags {
+ struct {
+ u32 wb_v : 4;
+ u32 cl : 1;
+ u32 busy : 1;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_VF_SYND2(x) \
+ (0x8000018 + (x) * 0x1000)
+#define DLB_SYS_ALARM_VF_SYND2_RST 0x0
+union dlb_sys_alarm_vf_synd2 {
+ struct {
+ u32 lock_id : 16;
+ u32 meas : 1;
+ u32 debug : 7;
+ u32 cq_pop : 1;
+ u32 qe_uhl : 1;
+ u32 qe_orsp : 1;
+ u32 qe_valid : 1;
+ u32 cq_int_rearm : 1;
+ u32 dsi_error : 1;
+ u32 rsvd0 : 2;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_VF_SYND1(x) \
+ (0x8000014 + (x) * 0x1000)
+#define DLB_SYS_ALARM_VF_SYND1_RST 0x0
+union dlb_sys_alarm_vf_synd1 {
+ struct {
+ u32 dsi : 16;
+ u32 qid : 8;
+ u32 qtype : 2;
+ u32 qpri : 3;
+ u32 msg_type : 3;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_VF_SYND0(x) \
+ (0x8000010 + (x) * 0x1000)
+#define DLB_SYS_ALARM_VF_SYND0_RST 0x0
+union dlb_sys_alarm_vf_synd0 {
+ struct {
+ u32 syndrome : 8;
+ u32 rtype : 2;
+ u32 rsvd0 : 2;
+ u32 from_dmv : 1;
+ u32 is_ldb : 1;
+ u32 cls : 2;
+ u32 aid : 6;
+ u32 unit : 4;
+ u32 source : 4;
+ u32 more : 1;
+ u32 valid : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_QID_V(x) \
+ (0x8000034 + (x) * 0x1000)
+#define DLB_SYS_LDB_QID_V_RST 0x0
+union dlb_sys_ldb_qid_v {
+ struct {
+ u32 qid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_QID_CFG_V(x) \
+ (0x8000030 + (x) * 0x1000)
+#define DLB_SYS_LDB_QID_CFG_V_RST 0x0
+union dlb_sys_ldb_qid_cfg_v {
+ struct {
+ u32 sn_cfg_v : 1;
+ u32 fid_cfg_v : 1;
+ u32 rsvd0 : 30;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_QID_V(x) \
+ (0x8000040 + (x) * 0x1000)
+#define DLB_SYS_DIR_QID_V_RST 0x0
+union dlb_sys_dir_qid_v {
+ struct {
+ u32 qid_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_POOL_ENBLD(x) \
+ (0x8000070 + (x) * 0x1000)
+#define DLB_SYS_LDB_POOL_ENBLD_RST 0x0
+union dlb_sys_ldb_pool_enbld {
+ struct {
+ u32 pool_enabled : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_POOL_ENBLD(x) \
+ (0x8000080 + (x) * 0x1000)
+#define DLB_SYS_DIR_POOL_ENBLD_RST 0x0
+union dlb_sys_dir_pool_enbld {
+ struct {
+ u32 pool_enabled : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP2VPP(x) \
+ (0x8000090 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP2VPP_RST 0x0
+union dlb_sys_ldb_pp2vpp {
+ struct {
+ u32 vpp : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP2VPP(x) \
+ (0x8000094 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP2VPP_RST 0x0
+union dlb_sys_dir_pp2vpp {
+ struct {
+ u32 vpp : 7;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP_V(x) \
+ (0x8000128 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP_V_RST 0x0
+union dlb_sys_ldb_pp_v {
+ struct {
+ u32 pp_v : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ_ISR(x) \
+ (0x8000124 + (x) * 0x1000)
+#define DLB_SYS_LDB_CQ_ISR_RST 0x0
+/* CQ Interrupt Modes */
+#define DLB_CQ_ISR_MODE_DIS 0
+#define DLB_CQ_ISR_MODE_MSI 1
+#define DLB_CQ_ISR_MODE_MSIX 2
+union dlb_sys_ldb_cq_isr {
+ struct {
+ u32 vector : 6;
+ u32 vf : 4;
+ u32 en_code : 2;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ2VF_PF(x) \
+ (0x8000120 + (x) * 0x1000)
+#define DLB_SYS_LDB_CQ2VF_PF_RST 0x0
+union dlb_sys_ldb_cq2vf_pf {
+ struct {
+ u32 vf : 4;
+ u32 is_pf : 1;
+ u32 rsvd0 : 27;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP2VAS(x) \
+ (0x800011c + (x) * 0x1000)
+#define DLB_SYS_LDB_PP2VAS_RST 0x0
+union dlb_sys_ldb_pp2vas {
+ struct {
+ u32 vas : 5;
+ u32 rsvd0 : 27;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP2LDBPOOL(x) \
+ (0x8000118 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP2LDBPOOL_RST 0x0
+union dlb_sys_ldb_pp2ldbpool {
+ struct {
+ u32 ldbpool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP2DIRPOOL(x) \
+ (0x8000114 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP2DIRPOOL_RST 0x0
+union dlb_sys_ldb_pp2dirpool {
+ struct {
+ u32 dirpool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP2VF_PF(x) \
+ (0x8000110 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP2VF_PF_RST 0x0
+union dlb_sys_ldb_pp2vf_pf {
+ struct {
+ u32 vf : 4;
+ u32 is_pf : 1;
+ u32 rsvd0 : 27;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP_ADDR_U(x) \
+ (0x800010c + (x) * 0x1000)
+#define DLB_SYS_LDB_PP_ADDR_U_RST 0x0
+union dlb_sys_ldb_pp_addr_u {
+ struct {
+ u32 addr_u : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_PP_ADDR_L(x) \
+ (0x8000108 + (x) * 0x1000)
+#define DLB_SYS_LDB_PP_ADDR_L_RST 0x0
+union dlb_sys_ldb_pp_addr_l {
+ struct {
+ u32 rsvd0 : 7;
+ u32 addr_l : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ_ADDR_U(x) \
+ (0x8000104 + (x) * 0x1000)
+#define DLB_SYS_LDB_CQ_ADDR_U_RST 0x0
+union dlb_sys_ldb_cq_addr_u {
+ struct {
+ u32 addr_u : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ_ADDR_L(x) \
+ (0x8000100 + (x) * 0x1000)
+#define DLB_SYS_LDB_CQ_ADDR_L_RST 0x0
+union dlb_sys_ldb_cq_addr_l {
+ struct {
+ u32 rsvd0 : 6;
+ u32 addr_l : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP_V(x) \
+ (0x8000228 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP_V_RST 0x0
+union dlb_sys_dir_pp_v {
+ struct {
+ u32 pp_v : 1;
+ u32 mb_dm : 1;
+ u32 rsvd0 : 30;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_ISR(x) \
+ (0x8000224 + (x) * 0x1000)
+#define DLB_SYS_DIR_CQ_ISR_RST 0x0
+union dlb_sys_dir_cq_isr {
+ struct {
+ u32 vector : 6;
+ u32 vf : 4;
+ u32 en_code : 2;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ2VF_PF(x) \
+ (0x8000220 + (x) * 0x1000)
+#define DLB_SYS_DIR_CQ2VF_PF_RST 0x0
+union dlb_sys_dir_cq2vf_pf {
+ struct {
+ u32 vf : 4;
+ u32 is_pf : 1;
+ u32 rsvd0 : 27;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP2VAS(x) \
+ (0x800021c + (x) * 0x1000)
+#define DLB_SYS_DIR_PP2VAS_RST 0x0
+union dlb_sys_dir_pp2vas {
+ struct {
+ u32 vas : 5;
+ u32 rsvd0 : 27;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP2LDBPOOL(x) \
+ (0x8000218 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP2LDBPOOL_RST 0x0
+union dlb_sys_dir_pp2ldbpool {
+ struct {
+ u32 ldbpool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP2DIRPOOL(x) \
+ (0x8000214 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP2DIRPOOL_RST 0x0
+union dlb_sys_dir_pp2dirpool {
+ struct {
+ u32 dirpool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP2VF_PF(x) \
+ (0x8000210 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP2VF_PF_RST 0x0
+union dlb_sys_dir_pp2vf_pf {
+ struct {
+ u32 vf : 4;
+ u32 is_pf : 1;
+ u32 is_hw_dsi : 1;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP_ADDR_U(x) \
+ (0x800020c + (x) * 0x1000)
+#define DLB_SYS_DIR_PP_ADDR_U_RST 0x0
+union dlb_sys_dir_pp_addr_u {
+ struct {
+ u32 addr_u : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_PP_ADDR_L(x) \
+ (0x8000208 + (x) * 0x1000)
+#define DLB_SYS_DIR_PP_ADDR_L_RST 0x0
+union dlb_sys_dir_pp_addr_l {
+ struct {
+ u32 rsvd0 : 7;
+ u32 addr_l : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_ADDR_U(x) \
+ (0x8000204 + (x) * 0x1000)
+#define DLB_SYS_DIR_CQ_ADDR_U_RST 0x0
+union dlb_sys_dir_cq_addr_u {
+ struct {
+ u32 addr_u : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_ADDR_L(x) \
+ (0x8000200 + (x) * 0x1000)
+#define DLB_SYS_DIR_CQ_ADDR_L_RST 0x0
+union dlb_sys_dir_cq_addr_l {
+ struct {
+ u32 rsvd0 : 6;
+ u32 addr_l : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_INGRESS_ALARM_ENBL 0x300
+#define DLB_SYS_INGRESS_ALARM_ENBL_RST 0x0
+union dlb_sys_ingress_alarm_enbl {
+ struct {
+ u32 illegal_hcw : 1;
+ u32 illegal_pp : 1;
+ u32 disabled_pp : 1;
+ u32 illegal_qid : 1;
+ u32 disabled_qid : 1;
+ u32 illegal_ldb_qid_cfg : 1;
+ u32 illegal_cqid : 1;
+ u32 rsvd0 : 25;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_CQ_MODE 0x30c
+#define DLB_SYS_CQ_MODE_RST 0x0
+union dlb_sys_cq_mode {
+ struct {
+ u32 ldb_cq64 : 1;
+ u32 dir_cq64 : 1;
+ u32 rsvd0 : 30;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_FUNC_VF_BAR_DSBL(x) \
+ (0x310 + (x) * 0x4)
+#define DLB_SYS_FUNC_VF_BAR_DSBL_RST 0x0
+union dlb_sys_func_vf_bar_dsbl {
+ struct {
+ u32 func_vf_bar_dis : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_MSIX_ACK 0x400
+#define DLB_SYS_MSIX_ACK_RST 0x0
+union dlb_sys_msix_ack {
+ struct {
+ u32 msix_0_ack : 1;
+ u32 msix_1_ack : 1;
+ u32 msix_2_ack : 1;
+ u32 msix_3_ack : 1;
+ u32 msix_4_ack : 1;
+ u32 msix_5_ack : 1;
+ u32 msix_6_ack : 1;
+ u32 msix_7_ack : 1;
+ u32 msix_8_ack : 1;
+ u32 rsvd0 : 23;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_MSIX_PASSTHRU 0x404
+#define DLB_SYS_MSIX_PASSTHRU_RST 0x0
+union dlb_sys_msix_passthru {
+ struct {
+ u32 msix_0_passthru : 1;
+ u32 msix_1_passthru : 1;
+ u32 msix_2_passthru : 1;
+ u32 msix_3_passthru : 1;
+ u32 msix_4_passthru : 1;
+ u32 msix_5_passthru : 1;
+ u32 msix_6_passthru : 1;
+ u32 msix_7_passthru : 1;
+ u32 msix_8_passthru : 1;
+ u32 rsvd0 : 23;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_MSIX_MODE 0x408
+#define DLB_SYS_MSIX_MODE_RST 0x0
+/* MSI-X Modes */
+#define DLB_MSIX_MODE_PACKED 0
+#define DLB_MSIX_MODE_COMPRESSED 1
+union dlb_sys_msix_mode {
+ struct {
+ u32 mode : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_31_0_OCC_INT_STS 0x440
+#define DLB_SYS_DIR_CQ_31_0_OCC_INT_STS_RST 0x0
+union dlb_sys_dir_cq_31_0_occ_int_sts {
+ struct {
+ u32 cq_0_occ_int : 1;
+ u32 cq_1_occ_int : 1;
+ u32 cq_2_occ_int : 1;
+ u32 cq_3_occ_int : 1;
+ u32 cq_4_occ_int : 1;
+ u32 cq_5_occ_int : 1;
+ u32 cq_6_occ_int : 1;
+ u32 cq_7_occ_int : 1;
+ u32 cq_8_occ_int : 1;
+ u32 cq_9_occ_int : 1;
+ u32 cq_10_occ_int : 1;
+ u32 cq_11_occ_int : 1;
+ u32 cq_12_occ_int : 1;
+ u32 cq_13_occ_int : 1;
+ u32 cq_14_occ_int : 1;
+ u32 cq_15_occ_int : 1;
+ u32 cq_16_occ_int : 1;
+ u32 cq_17_occ_int : 1;
+ u32 cq_18_occ_int : 1;
+ u32 cq_19_occ_int : 1;
+ u32 cq_20_occ_int : 1;
+ u32 cq_21_occ_int : 1;
+ u32 cq_22_occ_int : 1;
+ u32 cq_23_occ_int : 1;
+ u32 cq_24_occ_int : 1;
+ u32 cq_25_occ_int : 1;
+ u32 cq_26_occ_int : 1;
+ u32 cq_27_occ_int : 1;
+ u32 cq_28_occ_int : 1;
+ u32 cq_29_occ_int : 1;
+ u32 cq_30_occ_int : 1;
+ u32 cq_31_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_63_32_OCC_INT_STS 0x444
+#define DLB_SYS_DIR_CQ_63_32_OCC_INT_STS_RST 0x0
+union dlb_sys_dir_cq_63_32_occ_int_sts {
+ struct {
+ u32 cq_32_occ_int : 1;
+ u32 cq_33_occ_int : 1;
+ u32 cq_34_occ_int : 1;
+ u32 cq_35_occ_int : 1;
+ u32 cq_36_occ_int : 1;
+ u32 cq_37_occ_int : 1;
+ u32 cq_38_occ_int : 1;
+ u32 cq_39_occ_int : 1;
+ u32 cq_40_occ_int : 1;
+ u32 cq_41_occ_int : 1;
+ u32 cq_42_occ_int : 1;
+ u32 cq_43_occ_int : 1;
+ u32 cq_44_occ_int : 1;
+ u32 cq_45_occ_int : 1;
+ u32 cq_46_occ_int : 1;
+ u32 cq_47_occ_int : 1;
+ u32 cq_48_occ_int : 1;
+ u32 cq_49_occ_int : 1;
+ u32 cq_50_occ_int : 1;
+ u32 cq_51_occ_int : 1;
+ u32 cq_52_occ_int : 1;
+ u32 cq_53_occ_int : 1;
+ u32 cq_54_occ_int : 1;
+ u32 cq_55_occ_int : 1;
+ u32 cq_56_occ_int : 1;
+ u32 cq_57_occ_int : 1;
+ u32 cq_58_occ_int : 1;
+ u32 cq_59_occ_int : 1;
+ u32 cq_60_occ_int : 1;
+ u32 cq_61_occ_int : 1;
+ u32 cq_62_occ_int : 1;
+ u32 cq_63_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_95_64_OCC_INT_STS 0x448
+#define DLB_SYS_DIR_CQ_95_64_OCC_INT_STS_RST 0x0
+union dlb_sys_dir_cq_95_64_occ_int_sts {
+ struct {
+ u32 cq_64_occ_int : 1;
+ u32 cq_65_occ_int : 1;
+ u32 cq_66_occ_int : 1;
+ u32 cq_67_occ_int : 1;
+ u32 cq_68_occ_int : 1;
+ u32 cq_69_occ_int : 1;
+ u32 cq_70_occ_int : 1;
+ u32 cq_71_occ_int : 1;
+ u32 cq_72_occ_int : 1;
+ u32 cq_73_occ_int : 1;
+ u32 cq_74_occ_int : 1;
+ u32 cq_75_occ_int : 1;
+ u32 cq_76_occ_int : 1;
+ u32 cq_77_occ_int : 1;
+ u32 cq_78_occ_int : 1;
+ u32 cq_79_occ_int : 1;
+ u32 cq_80_occ_int : 1;
+ u32 cq_81_occ_int : 1;
+ u32 cq_82_occ_int : 1;
+ u32 cq_83_occ_int : 1;
+ u32 cq_84_occ_int : 1;
+ u32 cq_85_occ_int : 1;
+ u32 cq_86_occ_int : 1;
+ u32 cq_87_occ_int : 1;
+ u32 cq_88_occ_int : 1;
+ u32 cq_89_occ_int : 1;
+ u32 cq_90_occ_int : 1;
+ u32 cq_91_occ_int : 1;
+ u32 cq_92_occ_int : 1;
+ u32 cq_93_occ_int : 1;
+ u32 cq_94_occ_int : 1;
+ u32 cq_95_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_DIR_CQ_127_96_OCC_INT_STS 0x44c
+#define DLB_SYS_DIR_CQ_127_96_OCC_INT_STS_RST 0x0
+union dlb_sys_dir_cq_127_96_occ_int_sts {
+ struct {
+ u32 cq_96_occ_int : 1;
+ u32 cq_97_occ_int : 1;
+ u32 cq_98_occ_int : 1;
+ u32 cq_99_occ_int : 1;
+ u32 cq_100_occ_int : 1;
+ u32 cq_101_occ_int : 1;
+ u32 cq_102_occ_int : 1;
+ u32 cq_103_occ_int : 1;
+ u32 cq_104_occ_int : 1;
+ u32 cq_105_occ_int : 1;
+ u32 cq_106_occ_int : 1;
+ u32 cq_107_occ_int : 1;
+ u32 cq_108_occ_int : 1;
+ u32 cq_109_occ_int : 1;
+ u32 cq_110_occ_int : 1;
+ u32 cq_111_occ_int : 1;
+ u32 cq_112_occ_int : 1;
+ u32 cq_113_occ_int : 1;
+ u32 cq_114_occ_int : 1;
+ u32 cq_115_occ_int : 1;
+ u32 cq_116_occ_int : 1;
+ u32 cq_117_occ_int : 1;
+ u32 cq_118_occ_int : 1;
+ u32 cq_119_occ_int : 1;
+ u32 cq_120_occ_int : 1;
+ u32 cq_121_occ_int : 1;
+ u32 cq_122_occ_int : 1;
+ u32 cq_123_occ_int : 1;
+ u32 cq_124_occ_int : 1;
+ u32 cq_125_occ_int : 1;
+ u32 cq_126_occ_int : 1;
+ u32 cq_127_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ_31_0_OCC_INT_STS 0x460
+#define DLB_SYS_LDB_CQ_31_0_OCC_INT_STS_RST 0x0
+union dlb_sys_ldb_cq_31_0_occ_int_sts {
+ struct {
+ u32 cq_0_occ_int : 1;
+ u32 cq_1_occ_int : 1;
+ u32 cq_2_occ_int : 1;
+ u32 cq_3_occ_int : 1;
+ u32 cq_4_occ_int : 1;
+ u32 cq_5_occ_int : 1;
+ u32 cq_6_occ_int : 1;
+ u32 cq_7_occ_int : 1;
+ u32 cq_8_occ_int : 1;
+ u32 cq_9_occ_int : 1;
+ u32 cq_10_occ_int : 1;
+ u32 cq_11_occ_int : 1;
+ u32 cq_12_occ_int : 1;
+ u32 cq_13_occ_int : 1;
+ u32 cq_14_occ_int : 1;
+ u32 cq_15_occ_int : 1;
+ u32 cq_16_occ_int : 1;
+ u32 cq_17_occ_int : 1;
+ u32 cq_18_occ_int : 1;
+ u32 cq_19_occ_int : 1;
+ u32 cq_20_occ_int : 1;
+ u32 cq_21_occ_int : 1;
+ u32 cq_22_occ_int : 1;
+ u32 cq_23_occ_int : 1;
+ u32 cq_24_occ_int : 1;
+ u32 cq_25_occ_int : 1;
+ u32 cq_26_occ_int : 1;
+ u32 cq_27_occ_int : 1;
+ u32 cq_28_occ_int : 1;
+ u32 cq_29_occ_int : 1;
+ u32 cq_30_occ_int : 1;
+ u32 cq_31_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_LDB_CQ_63_32_OCC_INT_STS 0x464
+#define DLB_SYS_LDB_CQ_63_32_OCC_INT_STS_RST 0x0
+union dlb_sys_ldb_cq_63_32_occ_int_sts {
+ struct {
+ u32 cq_32_occ_int : 1;
+ u32 cq_33_occ_int : 1;
+ u32 cq_34_occ_int : 1;
+ u32 cq_35_occ_int : 1;
+ u32 cq_36_occ_int : 1;
+ u32 cq_37_occ_int : 1;
+ u32 cq_38_occ_int : 1;
+ u32 cq_39_occ_int : 1;
+ u32 cq_40_occ_int : 1;
+ u32 cq_41_occ_int : 1;
+ u32 cq_42_occ_int : 1;
+ u32 cq_43_occ_int : 1;
+ u32 cq_44_occ_int : 1;
+ u32 cq_45_occ_int : 1;
+ u32 cq_46_occ_int : 1;
+ u32 cq_47_occ_int : 1;
+ u32 cq_48_occ_int : 1;
+ u32 cq_49_occ_int : 1;
+ u32 cq_50_occ_int : 1;
+ u32 cq_51_occ_int : 1;
+ u32 cq_52_occ_int : 1;
+ u32 cq_53_occ_int : 1;
+ u32 cq_54_occ_int : 1;
+ u32 cq_55_occ_int : 1;
+ u32 cq_56_occ_int : 1;
+ u32 cq_57_occ_int : 1;
+ u32 cq_58_occ_int : 1;
+ u32 cq_59_occ_int : 1;
+ u32 cq_60_occ_int : 1;
+ u32 cq_61_occ_int : 1;
+ u32 cq_62_occ_int : 1;
+ u32 cq_63_occ_int : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_ALARM_HW_SYND 0x50c
+#define DLB_SYS_ALARM_HW_SYND_RST 0x0
+union dlb_sys_alarm_hw_synd {
+ struct {
+ u32 syndrome : 8;
+ u32 rtype : 2;
+ u32 rsvd0 : 2;
+ u32 from_dmv : 1;
+ u32 is_ldb : 1;
+ u32 cls : 2;
+ u32 aid : 6;
+ u32 unit : 4;
+ u32 source : 4;
+ u32 more : 1;
+ u32 valid : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_SYS_SYS_ALARM_INT_ENABLE 0xc001048
+#define DLB_SYS_SYS_ALARM_INT_ENABLE_RST 0x7fffff
+union dlb_sys_sys_alarm_int_enable {
+ struct {
+ u32 cq_addr_overflow_error : 1;
+ u32 ingress_perr : 1;
+ u32 egress_perr : 1;
+ u32 alarm_perr : 1;
+ u32 vf_to_pf_isr_pend_error : 1;
+ u32 pf_to_vf_isr_pend_error : 1;
+ u32 timeout_error : 1;
+ u32 dmvw_sm_error : 1;
+ u32 pptr_sm_par_error : 1;
+ u32 pptr_sm_len_error : 1;
+ u32 sch_sm_error : 1;
+ u32 wbuf_flag_error : 1;
+ u32 dmvw_cl_error : 1;
+ u32 dmvr_cl_error : 1;
+ u32 cmpl_data_error : 1;
+ u32 cmpl_error : 1;
+ u32 fifo_underflow : 1;
+ u32 fifo_overflow : 1;
+ u32 sb_ep_parity_err : 1;
+ u32 ti_parity_err : 1;
+ u32 ri_parity_err : 1;
+ u32 cfgm_ppw_err : 1;
+ u32 system_csr_perr : 1;
+ u32 rsvd0 : 9;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNT_CTRL(x) \
+ (0x20000000 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNT_CTRL_RST 0x0
+union dlb_lsp_cq_ldb_tot_sch_cnt_ctrl {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_DSBL(x) \
+ (0x20000124 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_DSBL_RST 0x1
+union dlb_lsp_cq_ldb_dsbl {
+ struct {
+ u32 disabled : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNTH(x) \
+ (0x20000120 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNTH_RST 0x0
+union dlb_lsp_cq_ldb_tot_sch_cnth {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNTL(x) \
+ (0x2000011c + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_TOT_SCH_CNTL_RST 0x0
+union dlb_lsp_cq_ldb_tot_sch_cntl {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_TKN_DEPTH_SEL(x) \
+ (0x20000118 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_TKN_DEPTH_SEL_RST 0x0
+union dlb_lsp_cq_ldb_tkn_depth_sel {
+ struct {
+ u32 token_depth_select : 4;
+ u32 ignore_depth : 1;
+ u32 enab_shallow_cq : 1;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_TKN_CNT(x) \
+ (0x20000114 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_TKN_CNT_RST 0x0
+union dlb_lsp_cq_ldb_tkn_cnt {
+ struct {
+ u32 token_count : 11;
+ u32 rsvd0 : 21;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_INFL_LIM(x) \
+ (0x20000110 + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_INFL_LIM_RST 0x0
+union dlb_lsp_cq_ldb_infl_lim {
+ struct {
+ u32 limit : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_LDB_INFL_CNT(x) \
+ (0x2000010c + (x) * 0x1000)
+#define DLB_LSP_CQ_LDB_INFL_CNT_RST 0x0
+union dlb_lsp_cq_ldb_infl_cnt {
+ struct {
+ u32 count : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ2QID(x, y) \
+ (0x20000104 + (x) * 0x1000 + (y) * 0x4)
+#define DLB_LSP_CQ2QID_RST 0x0
+union dlb_lsp_cq2qid {
+ struct {
+ u32 qid_p0 : 7;
+ u32 rsvd3 : 1;
+ u32 qid_p1 : 7;
+ u32 rsvd2 : 1;
+ u32 qid_p2 : 7;
+ u32 rsvd1 : 1;
+ u32 qid_p3 : 7;
+ u32 rsvd0 : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ2PRIOV(x) \
+ (0x20000100 + (x) * 0x1000)
+#define DLB_LSP_CQ2PRIOV_RST 0x0
+union dlb_lsp_cq2priov {
+ struct {
+ u32 prio : 24;
+ u32 v : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_DIR_DSBL(x) \
+ (0x20000310 + (x) * 0x1000)
+#define DLB_LSP_CQ_DIR_DSBL_RST 0x1
+union dlb_lsp_cq_dir_dsbl {
+ struct {
+ u32 disabled : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(x) \
+ (0x2000030c + (x) * 0x1000)
+#define DLB_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI_RST 0x0
+union dlb_lsp_cq_dir_tkn_depth_sel_dsi {
+ struct {
+ u32 token_depth_select : 4;
+ u32 disable_wb_opt : 1;
+ u32 ignore_depth : 1;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_DIR_TOT_SCH_CNTH(x) \
+ (0x20000308 + (x) * 0x1000)
+#define DLB_LSP_CQ_DIR_TOT_SCH_CNTH_RST 0x0
+union dlb_lsp_cq_dir_tot_sch_cnth {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_DIR_TOT_SCH_CNTL(x) \
+ (0x20000304 + (x) * 0x1000)
+#define DLB_LSP_CQ_DIR_TOT_SCH_CNTL_RST 0x0
+union dlb_lsp_cq_dir_tot_sch_cntl {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CQ_DIR_TKN_CNT(x) \
+ (0x20000300 + (x) * 0x1000)
+#define DLB_LSP_CQ_DIR_TKN_CNT_RST 0x0
+union dlb_lsp_cq_dir_tkn_cnt {
+ struct {
+ u32 count : 11;
+ u32 rsvd0 : 21;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_QID2CQIDX(x, y) \
+ (0x20000400 + (x) * 0x1000 + (y) * 0x4)
+#define DLB_LSP_QID_LDB_QID2CQIDX_RST 0x0
+union dlb_lsp_qid_ldb_qid2cqidx {
+ struct {
+ u32 cq_p0 : 8;
+ u32 cq_p1 : 8;
+ u32 cq_p2 : 8;
+ u32 cq_p3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_QID2CQIDX2(x, y) \
+ (0x20000500 + (x) * 0x1000 + (y) * 0x4)
+#define DLB_LSP_QID_LDB_QID2CQIDX2_RST 0x0
+union dlb_lsp_qid_ldb_qid2cqidx2 {
+ struct {
+ u32 cq_p0 : 8;
+ u32 cq_p1 : 8;
+ u32 cq_p2 : 8;
+ u32 cq_p3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_ATQ_ENQUEUE_CNT(x) \
+ (0x2000066c + (x) * 0x1000)
+#define DLB_LSP_QID_ATQ_ENQUEUE_CNT_RST 0x0
+union dlb_lsp_qid_atq_enqueue_cnt {
+ struct {
+ u32 count : 15;
+ u32 rsvd0 : 17;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_INFL_LIM(x) \
+ (0x2000064c + (x) * 0x1000)
+#define DLB_LSP_QID_LDB_INFL_LIM_RST 0x0
+union dlb_lsp_qid_ldb_infl_lim {
+ struct {
+ u32 limit : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_INFL_CNT(x) \
+ (0x2000062c + (x) * 0x1000)
+#define DLB_LSP_QID_LDB_INFL_CNT_RST 0x0
+union dlb_lsp_qid_ldb_infl_cnt {
+ struct {
+ u32 count : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_AQED_ACTIVE_LIM(x) \
+ (0x20000628 + (x) * 0x1000)
+#define DLB_LSP_QID_AQED_ACTIVE_LIM_RST 0x0
+union dlb_lsp_qid_aqed_active_lim {
+ struct {
+ u32 limit : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_AQED_ACTIVE_CNT(x) \
+ (0x20000624 + (x) * 0x1000)
+#define DLB_LSP_QID_AQED_ACTIVE_CNT_RST 0x0
+union dlb_lsp_qid_aqed_active_cnt {
+ struct {
+ u32 count : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_ENQUEUE_CNT(x) \
+ (0x20000604 + (x) * 0x1000)
+#define DLB_LSP_QID_LDB_ENQUEUE_CNT_RST 0x0
+union dlb_lsp_qid_ldb_enqueue_cnt {
+ struct {
+ u32 count : 15;
+ u32 rsvd0 : 17;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_LDB_REPLAY_CNT(x) \
+ (0x20000600 + (x) * 0x1000)
+#define DLB_LSP_QID_LDB_REPLAY_CNT_RST 0x0
+union dlb_lsp_qid_ldb_replay_cnt {
+ struct {
+ u32 count : 15;
+ u32 rsvd0 : 17;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_QID_DIR_ENQUEUE_CNT(x) \
+ (0x20000700 + (x) * 0x1000)
+#define DLB_LSP_QID_DIR_ENQUEUE_CNT_RST 0x0
+union dlb_lsp_qid_dir_enqueue_cnt {
+ struct {
+ u32 count : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CTRL_CONFIG_0 0x2800002c
+#define DLB_LSP_CTRL_CONFIG_0_RST 0x12cc
+union dlb_lsp_ctrl_config_0 {
+ struct {
+ u32 atm_cq_qid_priority_prot : 1;
+ u32 ldb_arb_ignore_empty : 1;
+ u32 ldb_arb_mode : 2;
+ u32 ldb_arb_threshold : 18;
+ u32 cfg_cq_sla_upd_always : 1;
+ u32 cfg_cq_wcn_upd_always : 1;
+ u32 spare : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_1 0x28000028
+#define DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_1_RST 0x0
+union dlb_lsp_cfg_arb_weight_atm_nalb_qid_1 {
+ struct {
+ u32 slot4_weight : 8;
+ u32 slot5_weight : 8;
+ u32 slot6_weight : 8;
+ u32 slot7_weight : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_0 0x28000024
+#define DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_0_RST 0x0
+union dlb_lsp_cfg_arb_weight_atm_nalb_qid_0 {
+ struct {
+ u32 slot0_weight : 8;
+ u32 slot1_weight : 8;
+ u32 slot2_weight : 8;
+ u32 slot3_weight : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_1 0x28000020
+#define DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_1_RST 0x0
+union dlb_lsp_cfg_arb_weight_ldb_qid_1 {
+ struct {
+ u32 slot4_weight : 8;
+ u32 slot5_weight : 8;
+ u32 slot6_weight : 8;
+ u32 slot7_weight : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_0 0x2800001c
+#define DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_0_RST 0x0
+union dlb_lsp_cfg_arb_weight_ldb_qid_0 {
+ struct {
+ u32 slot0_weight : 8;
+ u32 slot1_weight : 8;
+ u32 slot2_weight : 8;
+ u32 slot3_weight : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_LDB_SCHED_CTRL 0x28100000
+#define DLB_LSP_LDB_SCHED_CTRL_RST 0x0
+union dlb_lsp_ldb_sched_ctrl {
+ struct {
+ u32 cq : 8;
+ u32 qidix : 3;
+ u32 value : 1;
+ u32 nalb_haswork_v : 1;
+ u32 rlist_haswork_v : 1;
+ u32 slist_haswork_v : 1;
+ u32 inflight_ok_v : 1;
+ u32 aqed_nfull_v : 1;
+ u32 spare0 : 15;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_DIR_SCH_CNT_H 0x2820000c
+#define DLB_LSP_DIR_SCH_CNT_H_RST 0x0
+union dlb_lsp_dir_sch_cnt_h {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_DIR_SCH_CNT_L 0x28200008
+#define DLB_LSP_DIR_SCH_CNT_L_RST 0x0
+union dlb_lsp_dir_sch_cnt_l {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_LDB_SCH_CNT_H 0x28200004
+#define DLB_LSP_LDB_SCH_CNT_H_RST 0x0
+union dlb_lsp_ldb_sch_cnt_h {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_LSP_LDB_SCH_CNT_L 0x28200000
+#define DLB_LSP_LDB_SCH_CNT_L_RST 0x0
+union dlb_lsp_ldb_sch_cnt_l {
+ struct {
+ u32 count : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_DP_DIR_CSR_CTRL 0x38000018
+#define DLB_DP_DIR_CSR_CTRL_RST 0xc0000000
+union dlb_dp_dir_csr_ctrl {
+ struct {
+ u32 cfg_int_dis : 1;
+ u32 cfg_int_dis_sbe : 1;
+ u32 cfg_int_dis_mbe : 1;
+ u32 spare0 : 27;
+ u32 cfg_vasr_dis : 1;
+ u32 cfg_int_dis_synd : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_1 0x38000014
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_1_RST 0xfffefdfc
+union dlb_dp_cfg_ctrl_arb_weights_tqpri_dir_1 {
+ struct {
+ u32 pri4 : 8;
+ u32 pri5 : 8;
+ u32 pri6 : 8;
+ u32 pri7 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_0 0x38000010
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_0_RST 0xfbfaf9f8
+union dlb_dp_cfg_ctrl_arb_weights_tqpri_dir_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1 0x3800000c
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1_RST 0xfffefdfc
+union dlb_dp_cfg_ctrl_arb_weights_tqpri_replay_1 {
+ struct {
+ u32 pri4 : 8;
+ u32 pri5 : 8;
+ u32 pri6 : 8;
+ u32 pri7 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0 0x38000008
+#define DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0_RST 0xfbfaf9f8
+union dlb_dp_cfg_ctrl_arb_weights_tqpri_replay_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_1 0x6800001c
+#define DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_1_RST 0xfffefdfc
+union dlb_nalb_pipe_ctrl_arb_weights_tqpri_nalb_1 {
+ struct {
+ u32 pri4 : 8;
+ u32 pri5 : 8;
+ u32 pri6 : 8;
+ u32 pri7 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_0 0x68000018
+#define DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_0_RST 0xfbfaf9f8
+union dlb_nalb_pipe_ctrl_arb_weights_tqpri_nalb_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_1 0x68000014
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_1_RST 0xfffefdfc
+union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_atq_1 {
+ struct {
+ u32 pri4 : 8;
+ u32 pri5 : 8;
+ u32 pri6 : 8;
+ u32 pri7 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_0 0x68000010
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_0_RST 0xfbfaf9f8
+union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_atq_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1 0x6800000c
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1_RST 0xfffefdfc
+union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_replay_1 {
+ struct {
+ u32 pri4 : 8;
+ u32 pri5 : 8;
+ u32 pri6 : 8;
+ u32 pri7 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0 0x68000008
+#define DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0_RST 0xfbfaf9f8
+union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_replay_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_ATM_PIPE_QID_LDB_QID2CQIDX(x, y) \
+ (0x70000000 + (x) * 0x1000 + (y) * 0x4)
+#define DLB_ATM_PIPE_QID_LDB_QID2CQIDX_RST 0x0
+union dlb_atm_pipe_qid_ldb_qid2cqidx {
+ struct {
+ u32 cq_p0 : 8;
+ u32 cq_p1 : 8;
+ u32 cq_p2 : 8;
+ u32 cq_p3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_ATM_PIPE_CFG_CTRL_ARB_WEIGHTS_SCHED_BIN 0x7800000c
+#define DLB_ATM_PIPE_CFG_CTRL_ARB_WEIGHTS_SCHED_BIN_RST 0xfffefdfc
+union dlb_atm_pipe_cfg_ctrl_arb_weights_sched_bin {
+ struct {
+ u32 bin0 : 8;
+ u32 bin1 : 8;
+ u32 bin2 : 8;
+ u32 bin3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_ATM_PIPE_CTRL_ARB_WEIGHTS_RDY_BIN 0x78000008
+#define DLB_ATM_PIPE_CTRL_ARB_WEIGHTS_RDY_BIN_RST 0xfffefdfc
+union dlb_atm_pipe_ctrl_arb_weights_rdy_bin {
+ struct {
+ u32 bin0 : 8;
+ u32 bin1 : 8;
+ u32 bin2 : 8;
+ u32 bin3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_QID_FID_LIM(x) \
+ (0x80000014 + (x) * 0x1000)
+#define DLB_AQED_PIPE_QID_FID_LIM_RST 0x7ff
+union dlb_aqed_pipe_qid_fid_lim {
+ struct {
+ u32 qid_fid_limit : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_FL_POP_PTR(x) \
+ (0x80000010 + (x) * 0x1000)
+#define DLB_AQED_PIPE_FL_POP_PTR_RST 0x0
+union dlb_aqed_pipe_fl_pop_ptr {
+ struct {
+ u32 pop_ptr : 11;
+ u32 generation : 1;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_FL_PUSH_PTR(x) \
+ (0x8000000c + (x) * 0x1000)
+#define DLB_AQED_PIPE_FL_PUSH_PTR_RST 0x0
+union dlb_aqed_pipe_fl_push_ptr {
+ struct {
+ u32 push_ptr : 11;
+ u32 generation : 1;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_FL_BASE(x) \
+ (0x80000008 + (x) * 0x1000)
+#define DLB_AQED_PIPE_FL_BASE_RST 0x0
+union dlb_aqed_pipe_fl_base {
+ struct {
+ u32 base : 11;
+ u32 rsvd0 : 21;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_FL_LIM(x) \
+ (0x80000004 + (x) * 0x1000)
+#define DLB_AQED_PIPE_FL_LIM_RST 0x800
+union dlb_aqed_pipe_fl_lim {
+ struct {
+ u32 limit : 11;
+ u32 freelist_disable : 1;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_AQED_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATM_0 0x88000008
+#define DLB_AQED_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATM_0_RST 0xfffe
+union dlb_aqed_pipe_cfg_ctrl_arb_weights_tqpri_atm_0 {
+ struct {
+ u32 pri0 : 8;
+ u32 pri1 : 8;
+ u32 pri2 : 8;
+ u32 pri3 : 8;
+ } field;
+ u32 val;
+};
+
+#define DLB_RO_PIPE_QID2GRPSLT(x) \
+ (0x90000000 + (x) * 0x1000)
+#define DLB_RO_PIPE_QID2GRPSLT_RST 0x0
+union dlb_ro_pipe_qid2grpslt {
+ struct {
+ u32 slot : 5;
+ u32 rsvd1 : 3;
+ u32 group : 2;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_RO_PIPE_GRP_SN_MODE 0x98000008
+#define DLB_RO_PIPE_GRP_SN_MODE_RST 0x0
+union dlb_ro_pipe_grp_sn_mode {
+ struct {
+ u32 sn_mode_0 : 3;
+ u32 reserved0 : 5;
+ u32 sn_mode_1 : 3;
+ u32 reserved1 : 5;
+ u32 sn_mode_2 : 3;
+ u32 reserved2 : 5;
+ u32 sn_mode_3 : 3;
+ u32 reserved3 : 5;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_CFG_DIR_PP_SW_ALARM_EN(x) \
+ (0xa000003c + (x) * 0x1000)
+#define DLB_CHP_CFG_DIR_PP_SW_ALARM_EN_RST 0x1
+union dlb_chp_cfg_dir_pp_sw_alarm_en {
+ struct {
+ u32 alarm_enable : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_WD_ENB(x) \
+ (0xa0000038 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_WD_ENB_RST 0x0
+union dlb_chp_dir_cq_wd_enb {
+ struct {
+ u32 wd_enable : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_LDB_PP2POOL(x) \
+ (0xa0000034 + (x) * 0x1000)
+#define DLB_CHP_DIR_LDB_PP2POOL_RST 0x0
+union dlb_chp_dir_ldb_pp2pool {
+ struct {
+ u32 pool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_DIR_PP2POOL(x) \
+ (0xa0000030 + (x) * 0x1000)
+#define DLB_CHP_DIR_DIR_PP2POOL_RST 0x0
+union dlb_chp_dir_dir_pp2pool {
+ struct {
+ u32 pool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_LDB_CRD_CNT(x) \
+ (0xa000002c + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_LDB_CRD_CNT_RST 0x0
+union dlb_chp_dir_pp_ldb_crd_cnt {
+ struct {
+ u32 count : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_DIR_CRD_CNT(x) \
+ (0xa0000028 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_DIR_CRD_CNT_RST 0x0
+union dlb_chp_dir_pp_dir_crd_cnt {
+ struct {
+ u32 count : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_TMR_THRESHOLD(x) \
+ (0xa0000024 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_TMR_THRESHOLD_RST 0x0
+union dlb_chp_dir_cq_tmr_threshold {
+ struct {
+ u32 timer_thrsh : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INT_ENB(x) \
+ (0xa0000020 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_INT_ENB_RST 0x0
+union dlb_chp_dir_cq_int_enb {
+ struct {
+ u32 en_tim : 1;
+ u32 en_depth : 1;
+ u32 rsvd0 : 30;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INT_DEPTH_THRSH(x) \
+ (0xa000001c + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_INT_DEPTH_THRSH_RST 0x0
+union dlb_chp_dir_cq_int_depth_thrsh {
+ struct {
+ u32 depth_threshold : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_TKN_DEPTH_SEL(x) \
+ (0xa0000018 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_TKN_DEPTH_SEL_RST 0x0
+union dlb_chp_dir_cq_tkn_depth_sel {
+ struct {
+ u32 token_depth_select : 4;
+ u32 rsvd0 : 28;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_LDB_MIN_CRD_QNT(x) \
+ (0xa0000014 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_LDB_MIN_CRD_QNT_RST 0x1
+union dlb_chp_dir_pp_ldb_min_crd_qnt {
+ struct {
+ u32 quanta : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_DIR_MIN_CRD_QNT(x) \
+ (0xa0000010 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_DIR_MIN_CRD_QNT_RST 0x1
+union dlb_chp_dir_pp_dir_min_crd_qnt {
+ struct {
+ u32 quanta : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_LDB_CRD_LWM(x) \
+ (0xa000000c + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_LDB_CRD_LWM_RST 0x0
+union dlb_chp_dir_pp_ldb_crd_lwm {
+ struct {
+ u32 lwm : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_LDB_CRD_HWM(x) \
+ (0xa0000008 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_LDB_CRD_HWM_RST 0x0
+union dlb_chp_dir_pp_ldb_crd_hwm {
+ struct {
+ u32 hwm : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_DIR_CRD_LWM(x) \
+ (0xa0000004 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_DIR_CRD_LWM_RST 0x0
+union dlb_chp_dir_pp_dir_crd_lwm {
+ struct {
+ u32 lwm : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_DIR_CRD_HWM(x) \
+ (0xa0000000 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_DIR_CRD_HWM_RST 0x0
+union dlb_chp_dir_pp_dir_crd_hwm {
+ struct {
+ u32 hwm : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_CFG_LDB_PP_SW_ALARM_EN(x) \
+ (0xa0000148 + (x) * 0x1000)
+#define DLB_CHP_CFG_LDB_PP_SW_ALARM_EN_RST 0x1
+union dlb_chp_cfg_ldb_pp_sw_alarm_en {
+ struct {
+ u32 alarm_enable : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_WD_ENB(x) \
+ (0xa0000144 + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_WD_ENB_RST 0x0
+union dlb_chp_ldb_cq_wd_enb {
+ struct {
+ u32 wd_enable : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_SN_CHK_ENBL(x) \
+ (0xa0000140 + (x) * 0x1000)
+#define DLB_CHP_SN_CHK_ENBL_RST 0x0
+union dlb_chp_sn_chk_enbl {
+ struct {
+ u32 en : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_HIST_LIST_BASE(x) \
+ (0xa000013c + (x) * 0x1000)
+#define DLB_CHP_HIST_LIST_BASE_RST 0x0
+union dlb_chp_hist_list_base {
+ struct {
+ u32 base : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_HIST_LIST_LIM(x) \
+ (0xa0000138 + (x) * 0x1000)
+#define DLB_CHP_HIST_LIST_LIM_RST 0x0
+union dlb_chp_hist_list_lim {
+ struct {
+ u32 limit : 13;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_LDB_PP2POOL(x) \
+ (0xa0000134 + (x) * 0x1000)
+#define DLB_CHP_LDB_LDB_PP2POOL_RST 0x0
+union dlb_chp_ldb_ldb_pp2pool {
+ struct {
+ u32 pool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_DIR_PP2POOL(x) \
+ (0xa0000130 + (x) * 0x1000)
+#define DLB_CHP_LDB_DIR_PP2POOL_RST 0x0
+union dlb_chp_ldb_dir_pp2pool {
+ struct {
+ u32 pool : 6;
+ u32 rsvd0 : 26;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_LDB_CRD_CNT(x) \
+ (0xa000012c + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_LDB_CRD_CNT_RST 0x0
+union dlb_chp_ldb_pp_ldb_crd_cnt {
+ struct {
+ u32 count : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_DIR_CRD_CNT(x) \
+ (0xa0000128 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_DIR_CRD_CNT_RST 0x0
+union dlb_chp_ldb_pp_dir_crd_cnt {
+ struct {
+ u32 count : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_TMR_THRESHOLD(x) \
+ (0xa0000124 + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_TMR_THRESHOLD_RST 0x0
+union dlb_chp_ldb_cq_tmr_threshold {
+ struct {
+ u32 thrsh : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_INT_ENB(x) \
+ (0xa0000120 + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_INT_ENB_RST 0x0
+union dlb_chp_ldb_cq_int_enb {
+ struct {
+ u32 en_tim : 1;
+ u32 en_depth : 1;
+ u32 rsvd0 : 30;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_INT_DEPTH_THRSH(x) \
+ (0xa000011c + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_INT_DEPTH_THRSH_RST 0x0
+union dlb_chp_ldb_cq_int_depth_thrsh {
+ struct {
+ u32 depth_threshold : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_TKN_DEPTH_SEL(x) \
+ (0xa0000118 + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_TKN_DEPTH_SEL_RST 0x0
+union dlb_chp_ldb_cq_tkn_depth_sel {
+ struct {
+ u32 token_depth_select : 4;
+ u32 rsvd0 : 28;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_LDB_MIN_CRD_QNT(x) \
+ (0xa0000114 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_LDB_MIN_CRD_QNT_RST 0x1
+union dlb_chp_ldb_pp_ldb_min_crd_qnt {
+ struct {
+ u32 quanta : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_DIR_MIN_CRD_QNT(x) \
+ (0xa0000110 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_DIR_MIN_CRD_QNT_RST 0x1
+union dlb_chp_ldb_pp_dir_min_crd_qnt {
+ struct {
+ u32 quanta : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_LDB_CRD_LWM(x) \
+ (0xa000010c + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_LDB_CRD_LWM_RST 0x0
+union dlb_chp_ldb_pp_ldb_crd_lwm {
+ struct {
+ u32 lwm : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_LDB_CRD_HWM(x) \
+ (0xa0000108 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_LDB_CRD_HWM_RST 0x0
+union dlb_chp_ldb_pp_ldb_crd_hwm {
+ struct {
+ u32 hwm : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_DIR_CRD_LWM(x) \
+ (0xa0000104 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_DIR_CRD_LWM_RST 0x0
+union dlb_chp_ldb_pp_dir_crd_lwm {
+ struct {
+ u32 lwm : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_DIR_CRD_HWM(x) \
+ (0xa0000100 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_DIR_CRD_HWM_RST 0x0
+union dlb_chp_ldb_pp_dir_crd_hwm {
+ struct {
+ u32 hwm : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_DEPTH(x) \
+ (0xa0000218 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_DEPTH_RST 0x0
+union dlb_chp_dir_cq_depth {
+ struct {
+ u32 cq_depth : 11;
+ u32 rsvd0 : 21;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_WPTR(x) \
+ (0xa0000214 + (x) * 0x1000)
+#define DLB_CHP_DIR_CQ_WPTR_RST 0x0
+union dlb_chp_dir_cq_wptr {
+ struct {
+ u32 write_pointer : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_LDB_PUSH_PTR(x) \
+ (0xa0000210 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_LDB_PUSH_PTR_RST 0x0
+union dlb_chp_dir_pp_ldb_push_ptr {
+ struct {
+ u32 push_pointer : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_DIR_PUSH_PTR(x) \
+ (0xa000020c + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_DIR_PUSH_PTR_RST 0x0
+union dlb_chp_dir_pp_dir_push_ptr {
+ struct {
+ u32 push_pointer : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_STATE_RESET(x) \
+ (0xa0000204 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_STATE_RESET_RST 0x0
+union dlb_chp_dir_pp_state_reset {
+ struct {
+ u32 rsvd1 : 7;
+ u32 dir_type : 1;
+ u32 rsvd0 : 23;
+ u32 reset_pp_state : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_PP_CRD_REQ_STATE(x) \
+ (0xa0000200 + (x) * 0x1000)
+#define DLB_CHP_DIR_PP_CRD_REQ_STATE_RST 0x0
+union dlb_chp_dir_pp_crd_req_state {
+ struct {
+ u32 dir_crd_req_active_valid : 1;
+ u32 dir_crd_req_active_check : 1;
+ u32 dir_crd_req_active_busy : 1;
+ u32 rsvd1 : 1;
+ u32 ldb_crd_req_active_valid : 1;
+ u32 ldb_crd_req_active_check : 1;
+ u32 ldb_crd_req_active_busy : 1;
+ u32 rsvd0 : 1;
+ u32 no_pp_credit_update : 1;
+ u32 crd_req_state : 23;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_DEPTH(x) \
+ (0xa0000320 + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_DEPTH_RST 0x0
+union dlb_chp_ldb_cq_depth {
+ struct {
+ u32 depth : 11;
+ u32 reserved : 2;
+ u32 rsvd0 : 19;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_WPTR(x) \
+ (0xa000031c + (x) * 0x1000)
+#define DLB_CHP_LDB_CQ_WPTR_RST 0x0
+union dlb_chp_ldb_cq_wptr {
+ struct {
+ u32 write_pointer : 10;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_LDB_PUSH_PTR(x) \
+ (0xa0000318 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_LDB_PUSH_PTR_RST 0x0
+union dlb_chp_ldb_pp_ldb_push_ptr {
+ struct {
+ u32 push_pointer : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_DIR_PUSH_PTR(x) \
+ (0xa0000314 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_DIR_PUSH_PTR_RST 0x0
+union dlb_chp_ldb_pp_dir_push_ptr {
+ struct {
+ u32 push_pointer : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_HIST_LIST_POP_PTR(x) \
+ (0xa000030c + (x) * 0x1000)
+#define DLB_CHP_HIST_LIST_POP_PTR_RST 0x0
+union dlb_chp_hist_list_pop_ptr {
+ struct {
+ u32 pop_ptr : 13;
+ u32 generation : 1;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_HIST_LIST_PUSH_PTR(x) \
+ (0xa0000308 + (x) * 0x1000)
+#define DLB_CHP_HIST_LIST_PUSH_PTR_RST 0x0
+union dlb_chp_hist_list_push_ptr {
+ struct {
+ u32 push_ptr : 13;
+ u32 generation : 1;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_STATE_RESET(x) \
+ (0xa0000304 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_STATE_RESET_RST 0x0
+union dlb_chp_ldb_pp_state_reset {
+ struct {
+ u32 rsvd1 : 7;
+ u32 dir_type : 1;
+ u32 rsvd0 : 23;
+ u32 reset_pp_state : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_PP_CRD_REQ_STATE(x) \
+ (0xa0000300 + (x) * 0x1000)
+#define DLB_CHP_LDB_PP_CRD_REQ_STATE_RST 0x0
+union dlb_chp_ldb_pp_crd_req_state {
+ struct {
+ u32 dir_crd_req_active_valid : 1;
+ u32 dir_crd_req_active_check : 1;
+ u32 dir_crd_req_active_busy : 1;
+ u32 rsvd1 : 1;
+ u32 ldb_crd_req_active_valid : 1;
+ u32 ldb_crd_req_active_check : 1;
+ u32 ldb_crd_req_active_busy : 1;
+ u32 rsvd0 : 1;
+ u32 no_pp_credit_update : 1;
+ u32 crd_req_state : 23;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_ORD_QID_SN(x) \
+ (0xa0000408 + (x) * 0x1000)
+#define DLB_CHP_ORD_QID_SN_RST 0x0
+union dlb_chp_ord_qid_sn {
+ struct {
+ u32 sn : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_ORD_QID_SN_MAP(x) \
+ (0xa0000404 + (x) * 0x1000)
+#define DLB_CHP_ORD_QID_SN_MAP_RST 0x0
+union dlb_chp_ord_qid_sn_map {
+ struct {
+ u32 mode : 3;
+ u32 slot : 5;
+ u32 grp : 2;
+ u32 rsvd0 : 22;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_POOL_CRD_CNT(x) \
+ (0xa000050c + (x) * 0x1000)
+#define DLB_CHP_LDB_POOL_CRD_CNT_RST 0x0
+union dlb_chp_ldb_pool_crd_cnt {
+ struct {
+ u32 count : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_QED_FL_BASE(x) \
+ (0xa0000508 + (x) * 0x1000)
+#define DLB_CHP_QED_FL_BASE_RST 0x0
+union dlb_chp_qed_fl_base {
+ struct {
+ u32 base : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_QED_FL_LIM(x) \
+ (0xa0000504 + (x) * 0x1000)
+#define DLB_CHP_QED_FL_LIM_RST 0x8000
+union dlb_chp_qed_fl_lim {
+ struct {
+ u32 limit : 14;
+ u32 rsvd1 : 1;
+ u32 freelist_disable : 1;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_POOL_CRD_LIM(x) \
+ (0xa0000500 + (x) * 0x1000)
+#define DLB_CHP_LDB_POOL_CRD_LIM_RST 0x0
+union dlb_chp_ldb_pool_crd_lim {
+ struct {
+ u32 limit : 16;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_QED_FL_POP_PTR(x) \
+ (0xa0000604 + (x) * 0x1000)
+#define DLB_CHP_QED_FL_POP_PTR_RST 0x0
+union dlb_chp_qed_fl_pop_ptr {
+ struct {
+ u32 pop_ptr : 14;
+ u32 reserved0 : 1;
+ u32 generation : 1;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_QED_FL_PUSH_PTR(x) \
+ (0xa0000600 + (x) * 0x1000)
+#define DLB_CHP_QED_FL_PUSH_PTR_RST 0x0
+union dlb_chp_qed_fl_push_ptr {
+ struct {
+ u32 push_ptr : 14;
+ u32 reserved0 : 1;
+ u32 generation : 1;
+ u32 rsvd0 : 16;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_POOL_CRD_CNT(x) \
+ (0xa000070c + (x) * 0x1000)
+#define DLB_CHP_DIR_POOL_CRD_CNT_RST 0x0
+union dlb_chp_dir_pool_crd_cnt {
+ struct {
+ u32 count : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DQED_FL_BASE(x) \
+ (0xa0000708 + (x) * 0x1000)
+#define DLB_CHP_DQED_FL_BASE_RST 0x0
+union dlb_chp_dqed_fl_base {
+ struct {
+ u32 base : 12;
+ u32 rsvd0 : 20;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DQED_FL_LIM(x) \
+ (0xa0000704 + (x) * 0x1000)
+#define DLB_CHP_DQED_FL_LIM_RST 0x2000
+union dlb_chp_dqed_fl_lim {
+ struct {
+ u32 limit : 12;
+ u32 rsvd1 : 1;
+ u32 freelist_disable : 1;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_POOL_CRD_LIM(x) \
+ (0xa0000700 + (x) * 0x1000)
+#define DLB_CHP_DIR_POOL_CRD_LIM_RST 0x0
+union dlb_chp_dir_pool_crd_lim {
+ struct {
+ u32 limit : 14;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DQED_FL_POP_PTR(x) \
+ (0xa0000804 + (x) * 0x1000)
+#define DLB_CHP_DQED_FL_POP_PTR_RST 0x0
+union dlb_chp_dqed_fl_pop_ptr {
+ struct {
+ u32 pop_ptr : 12;
+ u32 reserved0 : 1;
+ u32 generation : 1;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DQED_FL_PUSH_PTR(x) \
+ (0xa0000800 + (x) * 0x1000)
+#define DLB_CHP_DQED_FL_PUSH_PTR_RST 0x0
+union dlb_chp_dqed_fl_push_ptr {
+ struct {
+ u32 push_ptr : 12;
+ u32 reserved0 : 1;
+ u32 generation : 1;
+ u32 rsvd0 : 18;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_CTRL_DIAG_02 0xa8000154
+#define DLB_CHP_CTRL_DIAG_02_RST 0x0
+union dlb_chp_ctrl_diag_02 {
+ struct {
+ u32 control : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_CFG_CHP_CSR_CTRL 0xa8000130
+#define DLB_CHP_CFG_CHP_CSR_CTRL_RST 0xc0003fff
+#define DLB_CHP_CFG_EXCESS_TOKENS_SHIFT 12
+union dlb_chp_cfg_chp_csr_ctrl {
+ struct {
+ u32 int_inf_alarm_enable_0 : 1;
+ u32 int_inf_alarm_enable_1 : 1;
+ u32 int_inf_alarm_enable_2 : 1;
+ u32 int_inf_alarm_enable_3 : 1;
+ u32 int_inf_alarm_enable_4 : 1;
+ u32 int_inf_alarm_enable_5 : 1;
+ u32 int_inf_alarm_enable_6 : 1;
+ u32 int_inf_alarm_enable_7 : 1;
+ u32 int_inf_alarm_enable_8 : 1;
+ u32 int_inf_alarm_enable_9 : 1;
+ u32 int_inf_alarm_enable_10 : 1;
+ u32 int_inf_alarm_enable_11 : 1;
+ u32 int_inf_alarm_enable_12 : 1;
+ u32 int_cor_alarm_enable : 1;
+ u32 csr_control_spare : 14;
+ u32 cfg_vasr_dis : 1;
+ u32 counter_clear : 1;
+ u32 blk_cor_report : 1;
+ u32 blk_cor_synd : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_INTR_ARMED1 0xa8000068
+#define DLB_CHP_LDB_CQ_INTR_ARMED1_RST 0x0
+union dlb_chp_ldb_cq_intr_armed1 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_LDB_CQ_INTR_ARMED0 0xa8000064
+#define DLB_CHP_LDB_CQ_INTR_ARMED0_RST 0x0
+union dlb_chp_ldb_cq_intr_armed0 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INTR_ARMED3 0xa8000024
+#define DLB_CHP_DIR_CQ_INTR_ARMED3_RST 0x0
+union dlb_chp_dir_cq_intr_armed3 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INTR_ARMED2 0xa8000020
+#define DLB_CHP_DIR_CQ_INTR_ARMED2_RST 0x0
+union dlb_chp_dir_cq_intr_armed2 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INTR_ARMED1 0xa800001c
+#define DLB_CHP_DIR_CQ_INTR_ARMED1_RST 0x0
+union dlb_chp_dir_cq_intr_armed1 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CHP_DIR_CQ_INTR_ARMED0 0xa8000018
+#define DLB_CHP_DIR_CQ_INTR_ARMED0_RST 0x0
+union dlb_chp_dir_cq_intr_armed0 {
+ struct {
+ u32 armed : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_CFG_MSTR_DIAG_RESET_STS 0xb8000004
+#define DLB_CFG_MSTR_DIAG_RESET_STS_RST 0x1ff
+union dlb_cfg_mstr_diag_reset_sts {
+ struct {
+ u32 chp_pf_reset_done : 1;
+ u32 rop_pf_reset_done : 1;
+ u32 lsp_pf_reset_done : 1;
+ u32 nalb_pf_reset_done : 1;
+ u32 ap_pf_reset_done : 1;
+ u32 dp_pf_reset_done : 1;
+ u32 qed_pf_reset_done : 1;
+ u32 dqed_pf_reset_done : 1;
+ u32 aqed_pf_reset_done : 1;
+ u32 rsvd1 : 6;
+ u32 pf_reset_active : 1;
+ u32 chp_vf_reset_done : 1;
+ u32 rop_vf_reset_done : 1;
+ u32 lsp_vf_reset_done : 1;
+ u32 nalb_vf_reset_done : 1;
+ u32 ap_vf_reset_done : 1;
+ u32 dp_vf_reset_done : 1;
+ u32 qed_vf_reset_done : 1;
+ u32 dqed_vf_reset_done : 1;
+ u32 aqed_vf_reset_done : 1;
+ u32 rsvd0 : 6;
+ u32 vf_reset_active : 1;
+ } field;
+ u32 val;
+};
+
+#define DLB_CFG_MSTR_BCAST_RESET_VF_START 0xc8100000
+#define DLB_CFG_MSTR_BCAST_RESET_VF_START_RST 0x0
+/* HW Reset Types */
+#define VF_RST_TYPE_CQ_LDB 0
+#define VF_RST_TYPE_QID_LDB 1
+#define VF_RST_TYPE_POOL_LDB 2
+#define VF_RST_TYPE_CQ_DIR 8
+#define VF_RST_TYPE_QID_DIR 9
+#define VF_RST_TYPE_POOL_DIR 10
+union dlb_cfg_mstr_bcast_reset_vf_start {
+ struct {
+ u32 vf_reset_start : 1;
+ u32 reserved : 3;
+ u32 vf_reset_type : 4;
+ u32 vf_reset_id : 24;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_VF2PF_MAILBOX_BYTES 256
+#define DLB_FUNC_VF_VF2PF_MAILBOX(x) \
+ (0x1000 + (x) * 0x4)
+#define DLB_FUNC_VF_VF2PF_MAILBOX_RST 0x0
+union dlb_func_vf_vf2pf_mailbox {
+ struct {
+ u32 msg : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_VF2PF_MAILBOX_ISR 0x1f00
+#define DLB_FUNC_VF_VF2PF_MAILBOX_ISR_RST 0x0
+union dlb_func_vf_vf2pf_mailbox_isr {
+ struct {
+ u32 isr : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_PF2VF_MAILBOX_BYTES 64
+#define DLB_FUNC_VF_PF2VF_MAILBOX(x) \
+ (0x2000 + (x) * 0x4)
+#define DLB_FUNC_VF_PF2VF_MAILBOX_RST 0x0
+union dlb_func_vf_pf2vf_mailbox {
+ struct {
+ u32 msg : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_PF2VF_MAILBOX_ISR 0x2f00
+#define DLB_FUNC_VF_PF2VF_MAILBOX_ISR_RST 0x0
+union dlb_func_vf_pf2vf_mailbox_isr {
+ struct {
+ u32 pf_isr : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_VF_MSI_ISR_PEND 0x2f10
+#define DLB_FUNC_VF_VF_MSI_ISR_PEND_RST 0x0
+union dlb_func_vf_vf_msi_isr_pend {
+ struct {
+ u32 isr_pend : 32;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_VF_RESET_IN_PROGRESS 0x3000
+#define DLB_FUNC_VF_VF_RESET_IN_PROGRESS_RST 0x1
+union dlb_func_vf_vf_reset_in_progress {
+ struct {
+ u32 reset_in_progress : 1;
+ u32 rsvd0 : 31;
+ } field;
+ u32 val;
+};
+
+#define DLB_FUNC_VF_VF_MSI_ISR 0x4000
+#define DLB_FUNC_VF_VF_MSI_ISR_RST 0x0
+union dlb_func_vf_vf_msi_isr {
+ struct {
+ u32 vf_msi_isr : 32;
+ } field;
+ u32 val;
+};
+
+#endif /* __DLB_REGS_H */
diff --git a/drivers/event/dlb/pf/base/dlb_resource.c b/drivers/event/dlb/pf/base/dlb_resource.c
new file mode 100644
index 0000000..51265b9
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_resource.c
@@ -0,0 +1,9722 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#include "dlb_hw_types.h"
+#include "dlb_user.h"
+#include "dlb_resource.h"
+#include "dlb_osdep.h"
+#include "dlb_osdep_bitmap.h"
+#include "dlb_osdep_types.h"
+#include "dlb_regs.h"
+#include "dlb_mbox.h"
+
+#define DLB_DOM_LIST_HEAD(head, type) \
+ DLB_LIST_HEAD((head), type, domain_list)
+
+#define DLB_FUNC_LIST_HEAD(head, type) \
+ DLB_LIST_HEAD((head), type, func_list)
+
+#define DLB_DOM_LIST_FOR(head, ptr, iter) \
+ DLB_LIST_FOR_EACH(head, ptr, domain_list, iter)
+
+#define DLB_FUNC_LIST_FOR(head, ptr, iter) \
+ DLB_LIST_FOR_EACH(head, ptr, func_list, iter)
+
+#define DLB_DOM_LIST_FOR_SAFE(head, ptr, ptr_tmp, it, it_tmp) \
+ DLB_LIST_FOR_EACH_SAFE((head), ptr, ptr_tmp, domain_list, it, it_tmp)
+
+#define DLB_FUNC_LIST_FOR_SAFE(head, ptr, ptr_tmp, it, it_tmp) \
+ DLB_LIST_FOR_EACH_SAFE((head), ptr, ptr_tmp, func_list, it, it_tmp)
+
+/* The PF driver cannot assume that a register write will affect subsequent HCW
+ * writes. To ensure a write completes, the driver must read back a CSR. This
+ * function need only be called for configuration that can occur after the
+ * domain has started; prior to starting, applications can't send HCWs.
+ */
+static inline void dlb_flush_csr(struct dlb_hw *hw)
+{
+ DLB_CSR_RD(hw, DLB_SYS_TOTAL_VAS);
+}
+
+static void dlb_init_fn_rsrc_lists(struct dlb_function_resources *rsrc)
+{
+ dlb_list_init_head(&rsrc->avail_domains);
+ dlb_list_init_head(&rsrc->used_domains);
+ dlb_list_init_head(&rsrc->avail_ldb_queues);
+ dlb_list_init_head(&rsrc->avail_ldb_ports);
+ dlb_list_init_head(&rsrc->avail_dir_pq_pairs);
+ dlb_list_init_head(&rsrc->avail_ldb_credit_pools);
+ dlb_list_init_head(&rsrc->avail_dir_credit_pools);
+}
+
+static void dlb_init_domain_rsrc_lists(struct dlb_domain *domain)
+{
+ dlb_list_init_head(&domain->used_ldb_queues);
+ dlb_list_init_head(&domain->used_ldb_ports);
+ dlb_list_init_head(&domain->used_dir_pq_pairs);
+ dlb_list_init_head(&domain->used_ldb_credit_pools);
+ dlb_list_init_head(&domain->used_dir_credit_pools);
+ dlb_list_init_head(&domain->avail_ldb_queues);
+ dlb_list_init_head(&domain->avail_ldb_ports);
+ dlb_list_init_head(&domain->avail_dir_pq_pairs);
+ dlb_list_init_head(&domain->avail_ldb_credit_pools);
+ dlb_list_init_head(&domain->avail_dir_credit_pools);
+}
+
+int dlb_resource_init(struct dlb_hw *hw)
+{
+ struct dlb_list_entry *list;
+ unsigned int i;
+
+ /* For optimal load-balancing, ports that map to one or more QIDs in
+ * common should not be in numerical sequence. Which ports share QIDs is
+ * application dependent, but the driver interleaves port IDs as much as
+ * possible to reduce the likelihood of sequentially numbered ports mapping
+ * to common QIDs. This initial allocation maximizes
+ * the average distance between an ID and its immediate neighbors (i.e.
+ * the distance from 1 to 0 and to 2, the distance from 2 to 1 and to
+ * 3, etc.).
+ */
+ u32 init_ldb_port_allocation[DLB_MAX_NUM_LDB_PORTS] = {
+ 0, 31, 62, 29, 60, 27, 58, 25, 56, 23, 54, 21, 52, 19, 50, 17,
+ 48, 15, 46, 13, 44, 11, 42, 9, 40, 7, 38, 5, 36, 3, 34, 1,
+ 32, 63, 30, 61, 28, 59, 26, 57, 24, 55, 22, 53, 20, 51, 18, 49,
+ 16, 47, 14, 45, 12, 43, 10, 41, 8, 39, 6, 37, 4, 35, 2, 33
+ };
+
+ /* Zero-out resource tracking data structures */
+ memset(&hw->rsrcs, 0, sizeof(hw->rsrcs));
+ memset(&hw->pf, 0, sizeof(hw->pf));
+
+ dlb_init_fn_rsrc_lists(&hw->pf);
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ memset(&hw->vf[i], 0, sizeof(hw->vf[i]));
+ dlb_init_fn_rsrc_lists(&hw->vf[i]);
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_DOMAINS; i++) {
+ memset(&hw->domains[i], 0, sizeof(hw->domains[i]));
+ dlb_init_domain_rsrc_lists(&hw->domains[i]);
+ hw->domains[i].parent_func = &hw->pf;
+ }
+
+ /* Give all resources to the PF driver */
+ hw->pf.num_avail_domains = DLB_MAX_NUM_DOMAINS;
+ for (i = 0; i < hw->pf.num_avail_domains; i++) {
+ list = &hw->domains[i].func_list;
+
+ dlb_list_add(&hw->pf.avail_domains, list);
+ }
+
+ hw->pf.num_avail_ldb_queues = DLB_MAX_NUM_LDB_QUEUES;
+ for (i = 0; i < hw->pf.num_avail_ldb_queues; i++) {
+ list = &hw->rsrcs.ldb_queues[i].func_list;
+
+ dlb_list_add(&hw->pf.avail_ldb_queues, list);
+ }
+
+ hw->pf.num_avail_ldb_ports = DLB_MAX_NUM_LDB_PORTS;
+ for (i = 0; i < hw->pf.num_avail_ldb_ports; i++) {
+ struct dlb_ldb_port *port;
+
+ port = &hw->rsrcs.ldb_ports[init_ldb_port_allocation[i]];
+
+ dlb_list_add(&hw->pf.avail_ldb_ports, &port->func_list);
+ }
+
+ hw->pf.num_avail_dir_pq_pairs = DLB_MAX_NUM_DIR_PORTS;
+ for (i = 0; i < hw->pf.num_avail_dir_pq_pairs; i++) {
+ list = &hw->rsrcs.dir_pq_pairs[i].func_list;
+
+ dlb_list_add(&hw->pf.avail_dir_pq_pairs, list);
+ }
+
+ hw->pf.num_avail_ldb_credit_pools = DLB_MAX_NUM_LDB_CREDIT_POOLS;
+ for (i = 0; i < hw->pf.num_avail_ldb_credit_pools; i++) {
+ list = &hw->rsrcs.ldb_credit_pools[i].func_list;
+
+ dlb_list_add(&hw->pf.avail_ldb_credit_pools, list);
+ }
+
+ hw->pf.num_avail_dir_credit_pools = DLB_MAX_NUM_DIR_CREDIT_POOLS;
+ for (i = 0; i < hw->pf.num_avail_dir_credit_pools; i++) {
+ list = &hw->rsrcs.dir_credit_pools[i].func_list;
+
+ dlb_list_add(&hw->pf.avail_dir_credit_pools, list);
+ }
+
+ /* There are 5120 history list entries, which allows us to overprovision
+ * the inflight limit (4096) by 1k.
+ */
+ if (dlb_bitmap_alloc(hw,
+ &hw->pf.avail_hist_list_entries,
+ DLB_MAX_NUM_HIST_LIST_ENTRIES))
+ return -1;
+
+ if (dlb_bitmap_fill(hw->pf.avail_hist_list_entries))
+ return -1;
+
+ if (dlb_bitmap_alloc(hw,
+ &hw->pf.avail_qed_freelist_entries,
+ DLB_MAX_NUM_LDB_CREDITS))
+ return -1;
+
+ if (dlb_bitmap_fill(hw->pf.avail_qed_freelist_entries))
+ return -1;
+
+ if (dlb_bitmap_alloc(hw,
+ &hw->pf.avail_dqed_freelist_entries,
+ DLB_MAX_NUM_DIR_CREDITS))
+ return -1;
+
+ if (dlb_bitmap_fill(hw->pf.avail_dqed_freelist_entries))
+ return -1;
+
+ if (dlb_bitmap_alloc(hw,
+ &hw->pf.avail_aqed_freelist_entries,
+ DLB_MAX_NUM_AQOS_ENTRIES))
+ return -1;
+
+ if (dlb_bitmap_fill(hw->pf.avail_aqed_freelist_entries))
+ return -1;
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ if (dlb_bitmap_alloc(hw,
+ &hw->vf[i].avail_hist_list_entries,
+ DLB_MAX_NUM_HIST_LIST_ENTRIES))
+ return -1;
+ if (dlb_bitmap_alloc(hw,
+ &hw->vf[i].avail_qed_freelist_entries,
+ DLB_MAX_NUM_LDB_CREDITS))
+ return -1;
+ if (dlb_bitmap_alloc(hw,
+ &hw->vf[i].avail_dqed_freelist_entries,
+ DLB_MAX_NUM_DIR_CREDITS))
+ return -1;
+ if (dlb_bitmap_alloc(hw,
+ &hw->vf[i].avail_aqed_freelist_entries,
+ DLB_MAX_NUM_AQOS_ENTRIES))
+ return -1;
+
+ if (dlb_bitmap_zero(hw->vf[i].avail_hist_list_entries))
+ return -1;
+
+ if (dlb_bitmap_zero(hw->vf[i].avail_qed_freelist_entries))
+ return -1;
+
+ if (dlb_bitmap_zero(hw->vf[i].avail_dqed_freelist_entries))
+ return -1;
+
+ if (dlb_bitmap_zero(hw->vf[i].avail_aqed_freelist_entries))
+ return -1;
+ }
+
+ /* Initialize the hardware resource IDs */
+ for (i = 0; i < DLB_MAX_NUM_DOMAINS; i++) {
+ hw->domains[i].id.phys_id = i;
+ hw->domains[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_QUEUES; i++) {
+ hw->rsrcs.ldb_queues[i].id.phys_id = i;
+ hw->rsrcs.ldb_queues[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_PORTS; i++) {
+ hw->rsrcs.ldb_ports[i].id.phys_id = i;
+ hw->rsrcs.ldb_ports[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_DIR_PORTS; i++) {
+ hw->rsrcs.dir_pq_pairs[i].id.phys_id = i;
+ hw->rsrcs.dir_pq_pairs[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_CREDIT_POOLS; i++) {
+ hw->rsrcs.ldb_credit_pools[i].id.phys_id = i;
+ hw->rsrcs.ldb_credit_pools[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_DIR_CREDIT_POOLS; i++) {
+ hw->rsrcs.dir_credit_pools[i].id.phys_id = i;
+ hw->rsrcs.dir_credit_pools[i].id.vf_owned = false;
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+ hw->rsrcs.sn_groups[i].id = i;
+ /* Default mode (0) is 32 sequence numbers per queue */
+ hw->rsrcs.sn_groups[i].mode = 0;
+ hw->rsrcs.sn_groups[i].sequence_numbers_per_queue = 32;
+ hw->rsrcs.sn_groups[i].slot_use_bitmap = 0;
+ }
+
+ return 0;
+}
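An aside on the interleaved port allocation table above: its two claimed properties, that it is a permutation of all 64 port IDs and that consecutive entries are far apart, can be checked mechanically. The standalone sketch below (not part of the driver) copies the table and verifies both; every pair of adjacent entries turns out to differ by 31 or 33.

```c
#include <assert.h>
#include <stdbool.h>

/* Copy of the interleaved LDB port allocation table from the patch. */
static const unsigned int port_alloc[64] = {
	0, 31, 62, 29, 60, 27, 58, 25, 56, 23, 54, 21, 52, 19, 50, 17,
	48, 15, 46, 13, 44, 11, 42, 9, 40, 7, 38, 5, 36, 3, 34, 1,
	32, 63, 30, 61, 28, 59, 26, 57, 24, 55, 22, 53, 20, 51, 18, 49,
	16, 47, 14, 45, 12, 43, 10, 41, 8, 39, 6, 37, 4, 35, 2, 33
};

/* True if the table is a permutation of 0..63. */
static bool is_permutation(void)
{
	bool seen[64] = {false};
	int i;

	for (i = 0; i < 64; i++) {
		if (port_alloc[i] > 63 || seen[port_alloc[i]])
			return false;
		seen[port_alloc[i]] = true;
	}
	return true;
}

/* Smallest absolute difference between consecutive table entries. */
static unsigned int min_adjacent_distance(void)
{
	unsigned int min = 64, d;
	int i;

	for (i = 1; i < 64; i++) {
		d = port_alloc[i] > port_alloc[i - 1] ?
			port_alloc[i] - port_alloc[i - 1] :
			port_alloc[i - 1] - port_alloc[i];
		if (d < min)
			min = d;
	}
	return min;
}
```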
+
+void dlb_resource_free(struct dlb_hw *hw)
+{
+ int i;
+
+ dlb_bitmap_free(hw->pf.avail_hist_list_entries);
+
+ dlb_bitmap_free(hw->pf.avail_qed_freelist_entries);
+
+ dlb_bitmap_free(hw->pf.avail_dqed_freelist_entries);
+
+ dlb_bitmap_free(hw->pf.avail_aqed_freelist_entries);
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ dlb_bitmap_free(hw->vf[i].avail_hist_list_entries);
+ dlb_bitmap_free(hw->vf[i].avail_qed_freelist_entries);
+ dlb_bitmap_free(hw->vf[i].avail_dqed_freelist_entries);
+ dlb_bitmap_free(hw->vf[i].avail_aqed_freelist_entries);
+ }
+}
+
+static struct dlb_domain *dlb_get_domain_from_id(struct dlb_hw *hw,
+ u32 id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+ struct dlb_domain *domain;
+
+ if (id >= DLB_MAX_NUM_DOMAINS)
+ return NULL;
+
+ if (!vf_request)
+ return &hw->domains[id];
+
+ rsrcs = &hw->vf[vf_id];
+
+ DLB_FUNC_LIST_FOR(rsrcs->used_domains, domain, iter)
+ if (domain->id.virt_id == id)
+ return domain;
+
+ return NULL;
+}
+
+static struct dlb_credit_pool *
+dlb_get_domain_ldb_pool(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ if (id >= DLB_MAX_NUM_LDB_CREDIT_POOLS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter)
+ if ((!vf_request && pool->id.phys_id == id) ||
+ (vf_request && pool->id.virt_id == id))
+ return pool;
+
+ return NULL;
+}
+
+static struct dlb_credit_pool *
+dlb_get_domain_dir_pool(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ if (id >= DLB_MAX_NUM_DIR_CREDIT_POOLS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter)
+ if ((!vf_request && pool->id.phys_id == id) ||
+ (vf_request && pool->id.virt_id == id))
+ return pool;
+
+ return NULL;
+}
+
+static struct dlb_ldb_port *dlb_get_ldb_port_from_id(struct dlb_hw *hw,
+ u32 id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+
+ if (id >= DLB_MAX_NUM_LDB_PORTS)
+ return NULL;
+
+ rsrcs = (vf_request) ? &hw->vf[vf_id] : &hw->pf;
+
+ if (!vf_request)
+ return &hw->rsrcs.ldb_ports[id];
+
+ DLB_FUNC_LIST_FOR(rsrcs->used_domains, domain, iter1) {
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter2)
+ if (port->id.virt_id == id)
+ return port;
+ }
+
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_ports, port, iter1)
+ if (port->id.virt_id == id)
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_ldb_port *
+dlb_get_domain_used_ldb_port(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ if (id >= DLB_MAX_NUM_LDB_PORTS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ DLB_DOM_LIST_FOR(domain->avail_ldb_ports, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_ldb_port *dlb_get_domain_ldb_port(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ if (id >= DLB_MAX_NUM_LDB_PORTS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ DLB_DOM_LIST_FOR(domain->avail_ldb_ports, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_dir_pq_pair *dlb_get_dir_pq_from_id(struct dlb_hw *hw,
+ u32 id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+ struct dlb_dir_pq_pair *port;
+ struct dlb_domain *domain;
+
+ if (id >= DLB_MAX_NUM_DIR_PORTS)
+ return NULL;
+
+ rsrcs = (vf_request) ? &hw->vf[vf_id] : &hw->pf;
+
+ if (!vf_request)
+ return &hw->rsrcs.dir_pq_pairs[id];
+
+ DLB_FUNC_LIST_FOR(rsrcs->used_domains, domain, iter1) {
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter2)
+ if (port->id.virt_id == id)
+ return port;
+ }
+
+ DLB_FUNC_LIST_FOR(rsrcs->avail_dir_pq_pairs, port, iter1)
+ if (port->id.virt_id == id)
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_dir_pq_pair *
+dlb_get_domain_used_dir_pq(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+
+ if (id >= DLB_MAX_NUM_DIR_PORTS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_dir_pq_pair *dlb_get_domain_dir_pq(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+
+ if (id >= DLB_MAX_NUM_DIR_PORTS)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ DLB_DOM_LIST_FOR(domain->avail_dir_pq_pairs, port, iter)
+ if ((!vf_request && port->id.phys_id == id) ||
+ (vf_request && port->id.virt_id == id))
+ return port;
+
+ return NULL;
+}
+
+static struct dlb_ldb_queue *dlb_get_ldb_queue_from_id(struct dlb_hw *hw,
+ u32 id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+ struct dlb_ldb_queue *queue;
+ struct dlb_domain *domain;
+
+ if (id >= DLB_MAX_NUM_LDB_QUEUES)
+ return NULL;
+
+ rsrcs = (vf_request) ? &hw->vf[vf_id] : &hw->pf;
+
+ if (!vf_request)
+ return &hw->rsrcs.ldb_queues[id];
+
+ DLB_FUNC_LIST_FOR(rsrcs->used_domains, domain, iter1) {
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter2)
+ if (queue->id.virt_id == id)
+ return queue;
+ }
+
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_queues, queue, iter1)
+ if (queue->id.virt_id == id)
+ return queue;
+
+ return NULL;
+}
+
+static struct dlb_ldb_queue *dlb_get_domain_ldb_queue(u32 id,
+ bool vf_request,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_queue *queue;
+
+ if (id >= DLB_MAX_NUM_LDB_QUEUES)
+ return NULL;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter)
+ if ((!vf_request && queue->id.phys_id == id) ||
+ (vf_request && queue->id.virt_id == id))
+ return queue;
+
+ return NULL;
+}
+
+#define DLB_XFER_LL_RSRC(dst, src, num, type_t, name) ({ \
+ struct dlb_list_entry *it1 __attribute__((unused)); \
+ struct dlb_list_entry *it2 __attribute__((unused)); \
+ struct dlb_function_resources *_src = src; \
+ struct dlb_function_resources *_dst = dst; \
+ type_t *ptr, *tmp __attribute__((unused)); \
+ unsigned int i = 0; \
+ \
+ DLB_FUNC_LIST_FOR_SAFE(_src->avail_##name##s, ptr, tmp, it1, it2) { \
+ if (i++ == (num)) \
+ break; \
+ \
+ dlb_list_del(&_src->avail_##name##s, &ptr->func_list); \
+ dlb_list_add(&_dst->avail_##name##s, &ptr->func_list); \
+ _src->num_avail_##name##s--; \
+ _dst->num_avail_##name##s++; \
+ } \
+})
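The DLB_XFER_LL_RSRC macro above relies on GNU C statement expressions and the `##` paste operator to select the `avail_<name>s` list and `num_avail_<name>s` counter from a single `name` parameter. A stripped-down illustration of that token-pasting pattern follows; the struct and field names here are invented for the demo, not the driver's:

```c
#include <assert.h>

/* Toy resource-count holder; the paste operator below turns the macro
 * argument 'ldb_queue' into the field name num_avail_ldb_queues. */
struct rsrc_counts {
	unsigned int num_avail_ldb_queues;
	unsigned int num_avail_ldb_ports;
};

/* GNU statement expression: evaluates both statements, usable where an
 * expression is expected (same shape as DLB_XFER_LL_RSRC). */
#define XFER_COUNT(dst, src, num, name) ({ \
	(src)->num_avail_##name##s -= (num); \
	(dst)->num_avail_##name##s += (num); \
})
```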
+
+#define DLB_VF_ID_CLEAR(head, type_t) ({ \
+ struct dlb_list_entry *iter __attribute__((unused)); \
+ type_t *var; \
+ \
+ DLB_FUNC_LIST_FOR(head, var, iter) \
+ var->id.vf_owned = false; \
+})
+
+int dlb_update_vf_sched_domains(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_function_resources *src, *dst;
+ struct dlb_domain *domain;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_domains;
+
+ /* Detach the destination VF's current resources before checking if
+ * enough are available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_domains, struct dlb_domain);
+
+ DLB_XFER_LL_RSRC(src, dst, orig, struct dlb_domain, domain);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_domains) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst, src, num, struct dlb_domain, domain);
+
+ /* Set the domains' VF backpointer */
+ DLB_FUNC_LIST_FOR(dst->avail_domains, domain, iter)
+ domain->parent_func = dst;
+
+ return ret;
+}
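The transfer logic in dlb_update_vf_sched_domains() (and the sibling dlb_update_vf_* functions below) follows a detach-then-reassign shape: everything the VF currently holds is first returned to the PF, and only then is the request checked; if the PF cannot cover it, the VF's original count is restored and -EINVAL is returned. A counter-only model of that control flow, with illustrative names and a hard-coded errno value:

```c
#include <assert.h>

/* Minimal model of the PF<->VF resource transfer in the patch: the VF's
 * resources are first returned to the PF; if the PF then cannot satisfy
 * the request, the VF's original allocation is restored and -EINVAL
 * (-22 here) is returned. */
struct pool { unsigned int avail; };

static int update_vf_resources(struct pool *pf, struct pool *vf,
			       unsigned int num)
{
	unsigned int orig = vf->avail;
	int ret = 0;

	/* Detach the VF's current resources */
	pf->avail += orig;
	vf->avail = 0;

	/* Clamp to the original count if the request can't be met */
	if (num > pf->avail) {
		num = orig;
		ret = -22; /* -EINVAL */
	}

	pf->avail -= num;
	vf->avail += num;

	return ret;
}
```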
+
+int dlb_update_vf_ldb_queues(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_ldb_queues;
+
+ /* Detach the destination VF's current resources before checking if
+ * enough are available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_ldb_queues, struct dlb_ldb_queue);
+
+ DLB_XFER_LL_RSRC(src, dst, orig, struct dlb_ldb_queue, ldb_queue);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_ldb_queues) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst, src, num, struct dlb_ldb_queue, ldb_queue);
+
+ return ret;
+}
+
+int dlb_update_vf_ldb_ports(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_ldb_ports;
+
+ /* Detach the destination VF's current resources before checking if
+ * enough are available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_ldb_ports, struct dlb_ldb_port);
+
+ DLB_XFER_LL_RSRC(src, dst, orig, struct dlb_ldb_port, ldb_port);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_ldb_ports) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst, src, num, struct dlb_ldb_port, ldb_port);
+
+ return ret;
+}
+
+int dlb_update_vf_dir_ports(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_dir_pq_pairs;
+
+ /* Detach the destination VF's current resources before checking if
+ * enough are available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_dir_pq_pairs, struct dlb_dir_pq_pair);
+
+ DLB_XFER_LL_RSRC(src, dst, orig, struct dlb_dir_pq_pair, dir_pq_pair);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_dir_pq_pairs) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst, src, num, struct dlb_dir_pq_pair, dir_pq_pair);
+
+ return ret;
+}
+
+int dlb_update_vf_ldb_credit_pools(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_ldb_credit_pools;
+
+ /* Detach the destination VF's current resources before checking if
+ * enough are available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_ldb_credit_pools, struct dlb_credit_pool);
+
+ DLB_XFER_LL_RSRC(src,
+ dst,
+ orig,
+ struct dlb_credit_pool,
+ ldb_credit_pool);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_ldb_credit_pools) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst,
+ src,
+ num,
+ struct dlb_credit_pool,
+ ldb_credit_pool);
+
+ return ret;
+}
+
+int dlb_update_vf_dir_credit_pools(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+ unsigned int orig;
+ int ret;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ orig = dst->num_avail_dir_credit_pools;
+
+ /* Detach the VF's current resources before checking if enough are
+ * available, and set their IDs accordingly.
+ */
+ DLB_VF_ID_CLEAR(dst->avail_dir_credit_pools, struct dlb_credit_pool);
+
+ DLB_XFER_LL_RSRC(src,
+ dst,
+ orig,
+ struct dlb_credit_pool,
+ dir_credit_pool);
+
+ /* Are there enough available resources to satisfy the request? */
+ if (num > src->num_avail_dir_credit_pools) {
+ num = orig;
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ DLB_XFER_LL_RSRC(dst,
+ src,
+ num,
+ struct dlb_credit_pool,
+ dir_credit_pool);
+
+ return ret;
+}
+
+static int dlb_transfer_bitmap_resources(struct dlb_bitmap *src,
+ struct dlb_bitmap *dst,
+ u32 num)
+{
+ int orig, ret, base;
+
+ /* Validate bitmaps before use */
+ if (dlb_bitmap_count(dst) < 0 || dlb_bitmap_count(src) < 0)
+ return -EINVAL;
+
+ /* Reassign the dest's bitmap entries to the source's bitmap before
+ * checking whether a contiguous chunk of size 'num' is available. The
+ * reassignment may be necessary to create a sufficiently large
+ * contiguous chunk.
+ */
+ orig = dlb_bitmap_count(dst);
+
+ dlb_bitmap_or(src, src, dst);
+
+ dlb_bitmap_zero(dst);
+
+ /* Are there enough available resources to satisfy the request? */
+ base = dlb_bitmap_find_set_bit_range(src, num);
+
+ if (base == -ENOENT) {
+ num = orig;
+ base = dlb_bitmap_find_set_bit_range(src, num);
+ ret = -EINVAL;
+ } else {
+ ret = 0;
+ }
+
+ dlb_bitmap_set_range(dst, base, num);
+
+ dlb_bitmap_clear_range(src, base, num);
+
+ return ret;
+}
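dlb_transfer_bitmap_resources() works on bitmaps rather than lists because credits must be handed out as one contiguous range. The toy model below reproduces the merge / zero / carve-out sequence on a single 64-bit word; unlike the driver it does not retry with the original count on failure, and the error values are hard-coded for the demo:

```c
#include <assert.h>
#include <stdint.h>

/* Find the lowest base such that 'num' consecutive bits are set. */
static int find_set_bit_range(uint64_t map, unsigned int num)
{
	uint64_t mask = (num >= 64) ? ~0ULL : ((1ULL << num) - 1);
	int base;

	for (base = 0; base + (int)num <= 64; base++)
		if (((map >> base) & mask) == mask)
			return base;
	return -1; /* -ENOENT in the driver */
}

/* Merge dst's entries back into src, zero dst, then carve out a
 * contiguous run of 'num' bits for dst (the driver's sequence). */
static int transfer_bitmap(uint64_t *src, uint64_t *dst, unsigned int num)
{
	uint64_t mask = (num >= 64) ? ~0ULL : ((1ULL << num) - 1);
	int base;

	*src |= *dst; /* reassign dst's entries to src */
	*dst = 0;

	base = find_set_bit_range(*src, num);
	if (base < 0)
		return -22; /* driver instead retries with the orig count */

	*dst |= mask << base;
	*src &= ~(mask << base);
	return 0;
}
```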
+
+int dlb_update_vf_ldb_credits(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ return dlb_transfer_bitmap_resources(src->avail_qed_freelist_entries,
+ dst->avail_qed_freelist_entries,
+ num);
+}
+
+int dlb_update_vf_dir_credits(struct dlb_hw *hw, u32 vf_id, u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ return dlb_transfer_bitmap_resources(src->avail_dqed_freelist_entries,
+ dst->avail_dqed_freelist_entries,
+ num);
+}
+
+int dlb_update_vf_hist_list_entries(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ return dlb_transfer_bitmap_resources(src->avail_hist_list_entries,
+ dst->avail_hist_list_entries,
+ num);
+}
+
+int dlb_update_vf_atomic_inflights(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num)
+{
+ struct dlb_function_resources *src, *dst;
+
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ src = &hw->pf;
+ dst = &hw->vf[vf_id];
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ return dlb_transfer_bitmap_resources(src->avail_aqed_freelist_entries,
+ dst->avail_aqed_freelist_entries,
+ num);
+}
+
+static int dlb_attach_ldb_queues(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_queues,
+ struct dlb_cmd_response *resp)
+{
+ unsigned int i, j;
+
+ if (rsrcs->num_avail_ldb_queues < num_queues) {
+ resp->status = DLB_ST_LDB_QUEUES_UNAVAILABLE;
+ return -1;
+ }
+
+ for (i = 0; i < num_queues; i++) {
+ struct dlb_ldb_queue *queue;
+
+ queue = DLB_FUNC_LIST_HEAD(rsrcs->avail_ldb_queues,
+ typeof(*queue));
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain validation failed\n",
+ __func__);
+ goto cleanup;
+ }
+
+ dlb_list_del(&rsrcs->avail_ldb_queues, &queue->func_list);
+
+ queue->domain_id = domain->id;
+ queue->owned = true;
+
+ dlb_list_add(&domain->avail_ldb_queues, &queue->domain_list);
+ }
+
+ rsrcs->num_avail_ldb_queues -= num_queues;
+
+ return 0;
+
+cleanup:
+
+ /* Return the assigned queues */
+ for (j = 0; j < i; j++) {
+ struct dlb_ldb_queue *queue;
+
+ queue = DLB_FUNC_LIST_HEAD(domain->avail_ldb_queues,
+ typeof(*queue));
+ /* Unrecoverable internal error */
+ if (!queue)
+ break;
+
+ queue->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_queues, &queue->domain_list);
+
+ dlb_list_add(&rsrcs->avail_ldb_queues, &queue->func_list);
+ }
+
+ return -EFAULT;
+}
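dlb_attach_ldb_queues() and the other dlb_attach_* helpers share an allocate-with-rollback shape: resources move one at a time from the function's free list into the domain, and a mid-loop failure returns the already-moved ones before reporting -EFAULT. A compact model of that pattern; the ownership array, the fail_at injection point, and the error codes are illustrative only:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_RES 4 /* small stand-in for a free list */

/* Move 'num' free resources (owned[i] == false) into the domain,
 * rolling back every one already taken if an allocation fails. */
static int attach(bool *owned, unsigned int num, int fail_at)
{
	int taken[NUM_RES];
	unsigned int i, got = 0;

	for (i = 0; i < NUM_RES && got < num; i++) {
		if (owned[i])
			continue; /* not available */
		if ((int)i == fail_at)
			goto cleanup; /* injected internal error */
		owned[i] = true;
		taken[got++] = i;
	}
	if (got == num)
		return 0;

cleanup:
	/* Return the assigned resources */
	while (got > 0)
		owned[taken[--got]] = false;
	return -14; /* -EFAULT */
}
```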
+
+static struct dlb_ldb_port *
+dlb_get_next_ldb_port(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ u32 domain_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ /* To reduce the odds of consecutive load-balanced ports mapping to the
+ * same queue(s), the driver attempts to allocate ports whose neighbors
+ * are owned by a different domain.
+ */
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_ports, port, iter) {
+ u32 next, prev;
+ u32 phys_id;
+
+ phys_id = port->id.phys_id;
+ next = phys_id + 1;
+ prev = phys_id - 1;
+
+ if (phys_id == DLB_MAX_NUM_LDB_PORTS - 1)
+ next = 0;
+ if (phys_id == 0)
+ prev = DLB_MAX_NUM_LDB_PORTS - 1;
+
+ if (!hw->rsrcs.ldb_ports[next].owned ||
+ hw->rsrcs.ldb_ports[next].domain_id.phys_id == domain_id)
+ continue;
+
+ if (!hw->rsrcs.ldb_ports[prev].owned ||
+ hw->rsrcs.ldb_ports[prev].domain_id.phys_id == domain_id)
+ continue;
+
+ return port;
+ }
+
+ /* Failing that, the driver looks for a port with one neighbor owned by
+ * a different domain and the other unallocated.
+ */
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_ports, port, iter) {
+ u32 next, prev;
+ u32 phys_id;
+
+ phys_id = port->id.phys_id;
+ next = phys_id + 1;
+ prev = phys_id - 1;
+
+ if (phys_id == DLB_MAX_NUM_LDB_PORTS - 1)
+ next = 0;
+ if (phys_id == 0)
+ prev = DLB_MAX_NUM_LDB_PORTS - 1;
+
+ if (!hw->rsrcs.ldb_ports[prev].owned &&
+ hw->rsrcs.ldb_ports[next].owned &&
+ hw->rsrcs.ldb_ports[next].domain_id.phys_id != domain_id)
+ return port;
+
+ if (!hw->rsrcs.ldb_ports[next].owned &&
+ hw->rsrcs.ldb_ports[prev].owned &&
+ hw->rsrcs.ldb_ports[prev].domain_id.phys_id != domain_id)
+ return port;
+ }
+
+ /* Failing that, the driver looks for a port with both neighbors
+ * unallocated.
+ */
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_ports, port, iter) {
+ u32 next, prev;
+ u32 phys_id;
+
+ phys_id = port->id.phys_id;
+ next = phys_id + 1;
+ prev = phys_id - 1;
+
+ if (phys_id == DLB_MAX_NUM_LDB_PORTS - 1)
+ next = 0;
+ if (phys_id == 0)
+ prev = DLB_MAX_NUM_LDB_PORTS - 1;
+
+ if (!hw->rsrcs.ldb_ports[prev].owned &&
+ !hw->rsrcs.ldb_ports[next].owned)
+ return port;
+ }
+
+ /* If all else fails, the driver returns the next available port. */
+ return DLB_FUNC_LIST_HEAD(rsrcs->avail_ldb_ports, typeof(*port));
+}
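The three search passes in dlb_get_next_ldb_port() form a preference order over a free port's neighbors: both neighbors owned by another domain, then one owned by another domain with the other free, then both free, and finally any free port. A small array-based model of that selection, with invented names and an 8-port universe:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PORTS 8 /* small stand-in for DLB_MAX_NUM_LDB_PORTS */

/* owner[i] < 0 means port i is free; otherwise it holds a domain ID. */
static bool other_domain(const int *owner, int idx, int domain)
{
	return owner[idx] >= 0 && owner[idx] != domain;
}

/* Mirrors the tiered neighbor preference of dlb_get_next_ldb_port(). */
static int pick_port(const int *owner, int domain)
{
	int i, prev, next, tier;

	for (tier = 0; tier < 3; tier++) {
		for (i = 0; i < NUM_PORTS; i++) {
			if (owner[i] >= 0)
				continue; /* not available */

			prev = (i == 0) ? NUM_PORTS - 1 : i - 1;
			next = (i == NUM_PORTS - 1) ? 0 : i + 1;

			/* Tier 0: both neighbors in another domain */
			if (tier == 0 &&
			    other_domain(owner, prev, domain) &&
			    other_domain(owner, next, domain))
				return i;
			/* Tier 1: one such neighbor, the other free */
			if (tier == 1 &&
			    ((owner[prev] < 0 &&
			      other_domain(owner, next, domain)) ||
			     (owner[next] < 0 &&
			      other_domain(owner, prev, domain))))
				return i;
			/* Tier 2: both neighbors free */
			if (tier == 2 && owner[prev] < 0 && owner[next] < 0)
				return i;
		}
	}

	/* If all else fails, return the first available port */
	for (i = 0; i < NUM_PORTS; i++)
		if (owner[i] < 0)
			return i;
	return -1;
}
```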
+
+static int dlb_attach_ldb_ports(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_ports,
+ struct dlb_cmd_response *resp)
+{
+ unsigned int i, j;
+
+ if (rsrcs->num_avail_ldb_ports < num_ports) {
+ resp->status = DLB_ST_LDB_PORTS_UNAVAILABLE;
+ return -1;
+ }
+
+ for (i = 0; i < num_ports; i++) {
+ struct dlb_ldb_port *port;
+
+ port = dlb_get_next_ldb_port(hw, rsrcs, domain->id.phys_id);
+
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain validation failed\n",
+ __func__);
+ goto cleanup;
+ }
+
+ dlb_list_del(&rsrcs->avail_ldb_ports, &port->func_list);
+
+ port->domain_id = domain->id;
+ port->owned = true;
+
+ dlb_list_add(&domain->avail_ldb_ports, &port->domain_list);
+ }
+
+ rsrcs->num_avail_ldb_ports -= num_ports;
+
+ return 0;
+
+cleanup:
+
+ /* Return the assigned ports */
+ for (j = 0; j < i; j++) {
+ struct dlb_ldb_port *port;
+
+ port = DLB_FUNC_LIST_HEAD(domain->avail_ldb_ports,
+ typeof(*port));
+ /* Unrecoverable internal error */
+ if (!port)
+ break;
+
+ port->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_ports, &port->domain_list);
+
+ dlb_list_add(&rsrcs->avail_ldb_ports, &port->func_list);
+ }
+
+ return -EFAULT;
+}
+
+static int dlb_attach_dir_ports(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_ports,
+ struct dlb_cmd_response *resp)
+{
+ unsigned int i, j;
+
+ if (rsrcs->num_avail_dir_pq_pairs < num_ports) {
+ resp->status = DLB_ST_DIR_PORTS_UNAVAILABLE;
+ return -1;
+ }
+
+ for (i = 0; i < num_ports; i++) {
+ struct dlb_dir_pq_pair *port;
+
+ port = DLB_FUNC_LIST_HEAD(rsrcs->avail_dir_pq_pairs,
+ typeof(*port));
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain validation failed\n",
+ __func__);
+ goto cleanup;
+ }
+
+ dlb_list_del(&rsrcs->avail_dir_pq_pairs, &port->func_list);
+
+ port->domain_id = domain->id;
+ port->owned = true;
+
+ dlb_list_add(&domain->avail_dir_pq_pairs, &port->domain_list);
+ }
+
+ rsrcs->num_avail_dir_pq_pairs -= num_ports;
+
+ return 0;
+
+cleanup:
+
+ /* Return the assigned ports */
+ for (j = 0; j < i; j++) {
+ struct dlb_dir_pq_pair *port;
+
+ port = DLB_FUNC_LIST_HEAD(domain->avail_dir_pq_pairs,
+ typeof(*port));
+ /* Unrecoverable internal error */
+ if (!port)
+ break;
+
+ port->owned = false;
+
+ dlb_list_del(&domain->avail_dir_pq_pairs, &port->domain_list);
+
+ dlb_list_add(&rsrcs->avail_dir_pq_pairs, &port->func_list);
+ }
+
+ return -EFAULT;
+}
+
+static int dlb_attach_ldb_credits(struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_credits,
+ struct dlb_cmd_response *resp)
+{
+ struct dlb_bitmap *bitmap = rsrcs->avail_qed_freelist_entries;
+
+ if (dlb_bitmap_count(bitmap) < (int)num_credits) {
+ resp->status = DLB_ST_LDB_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (num_credits) {
+ int base;
+
+ base = dlb_bitmap_find_set_bit_range(bitmap, num_credits);
+ if (base < 0)
+ goto error;
+
+ domain->qed_freelist.base = base;
+ domain->qed_freelist.bound = base + num_credits;
+ domain->qed_freelist.offset = 0;
+
+ dlb_bitmap_clear_range(bitmap, base, num_credits);
+ }
+
+ return 0;
+
+error:
+ resp->status = DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE;
+ return -1;
+}
+
+static int dlb_attach_dir_credits(struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_credits,
+ struct dlb_cmd_response *resp)
+{
+ struct dlb_bitmap *bitmap = rsrcs->avail_dqed_freelist_entries;
+
+ if (dlb_bitmap_count(bitmap) < (int)num_credits) {
+ resp->status = DLB_ST_DIR_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (num_credits) {
+ int base;
+
+ base = dlb_bitmap_find_set_bit_range(bitmap, num_credits);
+ if (base < 0)
+ goto error;
+
+ domain->dqed_freelist.base = base;
+ domain->dqed_freelist.bound = base + num_credits;
+ domain->dqed_freelist.offset = 0;
+
+ dlb_bitmap_clear_range(bitmap, base, num_credits);
+ }
+
+ return 0;
+
+error:
+ resp->status = DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE;
+ return -1;
+}
+
+static int dlb_attach_ldb_credit_pools(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_credit_pools,
+ struct dlb_cmd_response *resp)
+{
+ unsigned int i, j;
+
+ if (rsrcs->num_avail_ldb_credit_pools < num_credit_pools) {
+ resp->status = DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE;
+ return -1;
+ }
+
+ for (i = 0; i < num_credit_pools; i++) {
+ struct dlb_credit_pool *pool;
+
+ pool = DLB_FUNC_LIST_HEAD(rsrcs->avail_ldb_credit_pools,
+ typeof(*pool));
+ if (!pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain validation failed\n",
+ __func__);
+ goto cleanup;
+ }
+
+ dlb_list_del(&rsrcs->avail_ldb_credit_pools,
+ &pool->func_list);
+
+ pool->domain_id = domain->id;
+ pool->owned = true;
+
+ dlb_list_add(&domain->avail_ldb_credit_pools,
+ &pool->domain_list);
+ }
+
+ rsrcs->num_avail_ldb_credit_pools -= num_credit_pools;
+
+ return 0;
+
+cleanup:
+
+ /* Return the assigned credit pools */
+ for (j = 0; j < i; j++) {
+ struct dlb_credit_pool *pool;
+
+ pool = DLB_FUNC_LIST_HEAD(domain->avail_ldb_credit_pools,
+ typeof(*pool));
+ /* Unrecoverable internal error */
+ if (!pool)
+ break;
+
+ pool->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_credit_pools,
+ &pool->domain_list);
+
+ dlb_list_add(&rsrcs->avail_ldb_credit_pools,
+ &pool->func_list);
+ }
+
+ return -EFAULT;
+}
+
+static int dlb_attach_dir_credit_pools(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_credit_pools,
+ struct dlb_cmd_response *resp)
+{
+ unsigned int i, j;
+
+ if (rsrcs->num_avail_dir_credit_pools < num_credit_pools) {
+ resp->status = DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE;
+ return -1;
+ }
+
+ for (i = 0; i < num_credit_pools; i++) {
+ struct dlb_credit_pool *pool;
+
+ pool = DLB_FUNC_LIST_HEAD(rsrcs->avail_dir_credit_pools,
+ typeof(*pool));
+ if (!pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain validation failed\n",
+ __func__);
+ goto cleanup;
+ }
+
+ dlb_list_del(&rsrcs->avail_dir_credit_pools,
+ &pool->func_list);
+
+ pool->domain_id = domain->id;
+ pool->owned = true;
+
+ dlb_list_add(&domain->avail_dir_credit_pools,
+ &pool->domain_list);
+ }
+
+ rsrcs->num_avail_dir_credit_pools -= num_credit_pools;
+
+ return 0;
+
+cleanup:
+
+ /* Return the assigned credit pools */
+ for (j = 0; j < i; j++) {
+ struct dlb_credit_pool *pool;
+
+ pool = DLB_FUNC_LIST_HEAD(domain->avail_dir_credit_pools,
+ typeof(*pool));
+ /* Unrecoverable internal error */
+ if (!pool)
+ break;
+
+ pool->owned = false;
+
+ dlb_list_del(&domain->avail_dir_credit_pools,
+ &pool->domain_list);
+
+ dlb_list_add(&rsrcs->avail_dir_credit_pools,
+ &pool->func_list);
+ }
+
+ return -EFAULT;
+}
+
+static int dlb_attach_atomic_inflights(struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_atomic_inflights,
+ struct dlb_cmd_response *resp)
+{
+ if (num_atomic_inflights) {
+ struct dlb_bitmap *bitmap =
+ rsrcs->avail_aqed_freelist_entries;
+ int base;
+
+ base = dlb_bitmap_find_set_bit_range(bitmap,
+ num_atomic_inflights);
+ if (base < 0)
+ goto error;
+
+ domain->aqed_freelist.base = base;
+ domain->aqed_freelist.bound = base + num_atomic_inflights;
+ domain->aqed_freelist.offset = 0;
+
+ dlb_bitmap_clear_range(bitmap, base, num_atomic_inflights);
+ }
+
+ return 0;
+
+error:
+ resp->status = DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+ return -1;
+}
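The attach helper above is a contiguous-range allocation out of a bitmap: find a run of set bits of the requested length, record [base, base + n) in the domain's freelist, then clear those bits so later allocations skip them. A minimal software sketch of that pattern (toy helper names, not the `dlb_bitmap` API):

```c
#include <stdint.h>

/* Toy 64-entry bitmap: bit i set means entry i is free. */
struct toy_bitmap { uint64_t bits; };

/* Find the lowest run of 'len' consecutive set bits; -1 if none. */
static int toy_find_set_range(const struct toy_bitmap *bm, int len)
{
	uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);
	int base;

	for (base = 0; base + len <= 64; base++)
		if (((bm->bits >> base) & mask) == mask)
			return base;
	return -1;
}

/* Clear [base, base + len), i.e. mark those entries as allocated. */
static void toy_clear_range(struct toy_bitmap *bm, int base, int len)
{
	uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);

	bm->bits &= ~(mask << base);
}
```

Because the range must be contiguous, fragmentation can make an allocation fail even when enough total bits are free, which is why the domain-creation checks below test the longest set range rather than the population count.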
+
+static int
+dlb_attach_domain_hist_list_entries(struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ u32 num_hist_list_entries,
+ struct dlb_cmd_response *resp)
+{
+ struct dlb_bitmap *bitmap;
+ int base;
+
+ if (num_hist_list_entries) {
+ bitmap = rsrcs->avail_hist_list_entries;
+
+ base = dlb_bitmap_find_set_bit_range(bitmap,
+ num_hist_list_entries);
+ if (base < 0)
+ goto error;
+
+ domain->total_hist_list_entries = num_hist_list_entries;
+ domain->avail_hist_list_entries = num_hist_list_entries;
+ domain->hist_list_entry_base = base;
+ domain->hist_list_entry_offset = 0;
+
+ dlb_bitmap_clear_range(bitmap, base, num_hist_list_entries);
+ }
+ return 0;
+
+error:
+ resp->status = DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+ return -1;
+}
+
+static unsigned int
+dlb_get_num_ports_in_use(struct dlb_hw *hw)
+{
+ unsigned int i, n = 0;
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_PORTS; i++)
+ if (hw->rsrcs.ldb_ports[i].owned)
+ n++;
+
+ for (i = 0; i < DLB_MAX_NUM_DIR_PORTS; i++)
+ if (hw->rsrcs.dir_pq_pairs[i].owned)
+ n++;
+
+ return n;
+}
+
+static int
+dlb_verify_create_sched_domain_args(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_create_sched_domain_args *args,
+ struct dlb_cmd_response *resp)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_bitmap *ldb_credit_freelist;
+ struct dlb_bitmap *dir_credit_freelist;
+ unsigned int ldb_credit_freelist_count;
+ unsigned int dir_credit_freelist_count;
+ unsigned int max_contig_aqed_entries;
+ unsigned int max_contig_dqed_entries;
+ unsigned int max_contig_qed_entries;
+ unsigned int max_contig_hl_entries;
+ struct dlb_bitmap *aqed_freelist;
+ enum dlb_dev_revision revision;
+
+ ldb_credit_freelist = rsrcs->avail_qed_freelist_entries;
+ dir_credit_freelist = rsrcs->avail_dqed_freelist_entries;
+ aqed_freelist = rsrcs->avail_aqed_freelist_entries;
+
+ ldb_credit_freelist_count = dlb_bitmap_count(ldb_credit_freelist);
+ dir_credit_freelist_count = dlb_bitmap_count(dir_credit_freelist);
+
+ max_contig_hl_entries =
+ dlb_bitmap_longest_set_range(rsrcs->avail_hist_list_entries);
+ max_contig_aqed_entries =
+ dlb_bitmap_longest_set_range(aqed_freelist);
+ max_contig_qed_entries =
+ dlb_bitmap_longest_set_range(ldb_credit_freelist);
+ max_contig_dqed_entries =
+ dlb_bitmap_longest_set_range(dir_credit_freelist);
+
+ if (rsrcs->num_avail_domains < 1)
+ resp->status = DLB_ST_DOMAIN_UNAVAILABLE;
+ else if (rsrcs->num_avail_ldb_queues < args->num_ldb_queues)
+ resp->status = DLB_ST_LDB_QUEUES_UNAVAILABLE;
+ else if (rsrcs->num_avail_ldb_ports < args->num_ldb_ports)
+ resp->status = DLB_ST_LDB_PORTS_UNAVAILABLE;
+ else if (args->num_ldb_queues > 0 && args->num_ldb_ports == 0)
+ resp->status = DLB_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES;
+ else if (rsrcs->num_avail_dir_pq_pairs < args->num_dir_ports)
+ resp->status = DLB_ST_DIR_PORTS_UNAVAILABLE;
+ else if (ldb_credit_freelist_count < args->num_ldb_credits)
+ resp->status = DLB_ST_LDB_CREDITS_UNAVAILABLE;
+ else if (dir_credit_freelist_count < args->num_dir_credits)
+ resp->status = DLB_ST_DIR_CREDITS_UNAVAILABLE;
+ else if (rsrcs->num_avail_ldb_credit_pools < args->num_ldb_credit_pools)
+ resp->status = DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE;
+ else if (rsrcs->num_avail_dir_credit_pools < args->num_dir_credit_pools)
+ resp->status = DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE;
+ else if (max_contig_hl_entries < args->num_hist_list_entries)
+ resp->status = DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+ else if (max_contig_aqed_entries < args->num_atomic_inflights)
+ resp->status = DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+ else if (max_contig_qed_entries < args->num_ldb_credits)
+ resp->status = DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE;
+ else if (max_contig_dqed_entries < args->num_dir_credits)
+ resp->status = DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE;
+
+ /* DLB A-stepping workaround for a hardware write-buffer lock-up issue:
+ * limit the maximum number of configured ports to less than 128 and
+ * disable CQ occupancy interrupts.
+ */
+ revision = os_get_dev_revision(hw);
+
+ if (revision < DLB_B0) {
+ u32 n = dlb_get_num_ports_in_use(hw);
+
+ n += args->num_ldb_ports + args->num_dir_ports;
+
+ if (n >= DLB_A_STEP_MAX_PORTS)
+ resp->status = args->num_ldb_ports ?
+ DLB_ST_LDB_PORTS_UNAVAILABLE :
+ DLB_ST_DIR_PORTS_UNAVAILABLE;
+ }
+
+ if (resp->status)
+ return -1;
+
+ return 0;
+}
+
+static int
+dlb_verify_create_ldb_pool_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_freelist *qed_freelist;
+ struct dlb_domain *domain;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ qed_freelist = &domain->qed_freelist;
+
+ if (dlb_freelist_count(qed_freelist) < args->num_ldb_credits) {
+ resp->status = DLB_ST_LDB_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (dlb_list_empty(&domain->avail_ldb_credit_pools)) {
+ resp->status = DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+dlb_configure_ldb_credit_pool(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_create_ldb_pool_args *args,
+ struct dlb_credit_pool *pool)
+{
+ union dlb_sys_ldb_pool_enbld r0 = { {0} };
+ union dlb_chp_ldb_pool_crd_lim r1 = { {0} };
+ union dlb_chp_ldb_pool_crd_cnt r2 = { {0} };
+ union dlb_chp_qed_fl_base r3 = { {0} };
+ union dlb_chp_qed_fl_lim r4 = { {0} };
+ union dlb_chp_qed_fl_push_ptr r5 = { {0} };
+ union dlb_chp_qed_fl_pop_ptr r6 = { {0} };
+
+ r1.field.limit = args->num_ldb_credits;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_POOL_CRD_LIM(pool->id.phys_id), r1.val);
+
+ r2.field.count = args->num_ldb_credits;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_POOL_CRD_CNT(pool->id.phys_id), r2.val);
+
+ r3.field.base = domain->qed_freelist.base + domain->qed_freelist.offset;
+
+ DLB_CSR_WR(hw, DLB_CHP_QED_FL_BASE(pool->id.phys_id), r3.val);
+
+ r4.field.freelist_disable = 0;
+ r4.field.limit = r3.field.base + args->num_ldb_credits - 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_QED_FL_LIM(pool->id.phys_id), r4.val);
+
+ r5.field.push_ptr = r3.field.base;
+ r5.field.generation = 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_QED_FL_PUSH_PTR(pool->id.phys_id), r5.val);
+
+ r6.field.pop_ptr = r3.field.base;
+ r6.field.generation = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_QED_FL_POP_PTR(pool->id.phys_id), r6.val);
+
+ r0.field.pool_enabled = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_POOL_ENBLD(pool->id.phys_id), r0.val);
+
+ pool->avail_credits = args->num_ldb_credits;
+ pool->total_credits = args->num_ldb_credits;
+ domain->qed_freelist.offset += args->num_ldb_credits;
+
+ pool->configured = true;
+}
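Note that the pool setup above seeds the hardware freelist's push and pop pointers at the same offset but with opposite generation bits (1 and 0). When the two offsets are equal, the generation bits are what distinguish a completely full list from a completely empty one. The same idea expressed in software terms (a hedged sketch of the concept, not the hardware's exact encoding):

```c
#include <stdbool.h>
#include <stdint.h>

/* Pointer = offset within the list plus a generation (wrap) bit. */
struct fl_ptr {
	uint16_t offset;
	uint8_t generation;
};

/* Empty: same offset, same generation (consumer has caught up). */
static bool fl_empty(struct fl_ptr push, struct fl_ptr pop)
{
	return push.offset == pop.offset && push.generation == pop.generation;
}

/* Full: same offset, opposite generation (producer has lapped consumer). */
static bool fl_full(struct fl_ptr push, struct fl_ptr pop)
{
	return push.offset == pop.offset && push.generation != pop.generation;
}
```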
+
+static int
+dlb_verify_create_dir_pool_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_freelist *dqed_freelist;
+ struct dlb_domain *domain;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ dqed_freelist = &domain->dqed_freelist;
+
+ if (dlb_freelist_count(dqed_freelist) < args->num_dir_credits) {
+ resp->status = DLB_ST_DIR_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (dlb_list_empty(&domain->avail_dir_credit_pools)) {
+ resp->status = DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ return 0;
+}
+
+static void
+dlb_configure_dir_credit_pool(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_create_dir_pool_args *args,
+ struct dlb_credit_pool *pool)
+{
+ union dlb_sys_dir_pool_enbld r0 = { {0} };
+ union dlb_chp_dir_pool_crd_lim r1 = { {0} };
+ union dlb_chp_dir_pool_crd_cnt r2 = { {0} };
+ union dlb_chp_dqed_fl_base r3 = { {0} };
+ union dlb_chp_dqed_fl_lim r4 = { {0} };
+ union dlb_chp_dqed_fl_push_ptr r5 = { {0} };
+ union dlb_chp_dqed_fl_pop_ptr r6 = { {0} };
+
+ r1.field.limit = args->num_dir_credits;
+
+ DLB_CSR_WR(hw, DLB_CHP_DIR_POOL_CRD_LIM(pool->id.phys_id), r1.val);
+
+ r2.field.count = args->num_dir_credits;
+
+ DLB_CSR_WR(hw, DLB_CHP_DIR_POOL_CRD_CNT(pool->id.phys_id), r2.val);
+
+ r3.field.base = domain->dqed_freelist.base +
+ domain->dqed_freelist.offset;
+
+ DLB_CSR_WR(hw, DLB_CHP_DQED_FL_BASE(pool->id.phys_id), r3.val);
+
+ r4.field.freelist_disable = 0;
+ r4.field.limit = r3.field.base + args->num_dir_credits - 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_DQED_FL_LIM(pool->id.phys_id), r4.val);
+
+ r5.field.push_ptr = r3.field.base;
+ r5.field.generation = 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_DQED_FL_PUSH_PTR(pool->id.phys_id), r5.val);
+
+ r6.field.pop_ptr = r3.field.base;
+ r6.field.generation = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_DQED_FL_POP_PTR(pool->id.phys_id), r6.val);
+
+ r0.field.pool_enabled = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_POOL_ENBLD(pool->id.phys_id), r0.val);
+
+ pool->avail_credits = args->num_dir_credits;
+ pool->total_credits = args->num_dir_credits;
+ domain->dqed_freelist.offset += args->num_dir_credits;
+
+ pool->configured = true;
+}
+
+static int
+dlb_verify_create_ldb_queue_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_freelist *aqed_freelist;
+ struct dlb_domain *domain;
+ int i;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ if (dlb_list_empty(&domain->avail_ldb_queues)) {
+ resp->status = DLB_ST_LDB_QUEUES_UNAVAILABLE;
+ return -1;
+ }
+
+ if (args->num_sequence_numbers) {
+ for (i = 0; i < DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+ struct dlb_sn_group *group = &hw->rsrcs.sn_groups[i];
+
+ if (group->sequence_numbers_per_queue ==
+ args->num_sequence_numbers &&
+ !dlb_sn_group_full(group))
+ break;
+ }
+
+ if (i == DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS) {
+ resp->status = DLB_ST_SEQUENCE_NUMBERS_UNAVAILABLE;
+ return -1;
+ }
+ }
+
+ if (args->num_qid_inflights > 4096) {
+ resp->status = DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION;
+ return -1;
+ }
+
+ /* Inflights must be <= number of sequence numbers if ordered */
+ if (args->num_sequence_numbers != 0 &&
+ args->num_qid_inflights > args->num_sequence_numbers) {
+ resp->status = DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION;
+ return -1;
+ }
+
+ aqed_freelist = &domain->aqed_freelist;
+
+ if (dlb_freelist_count(aqed_freelist) < args->num_atomic_inflights) {
+ resp->status = DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+dlb_verify_create_dir_queue_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ /* If the user supplies a port ID, validate that the port exists,
+ * belongs to this domain, and has been configured.
+ */
+ if (args->port_id != -1) {
+ struct dlb_dir_pq_pair *port;
+
+ port = dlb_get_domain_used_dir_pq(args->port_id,
+ vf_request,
+ domain);
+
+ if (!port || port->domain_id.phys_id != domain->id.phys_id ||
+ !port->port_configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+ }
+
+ /* If the queue's port is not configured, validate that a free
+ * port-queue pair is available.
+ */
+ if (args->port_id == -1 &&
+ dlb_list_empty(&domain->avail_dir_pq_pairs)) {
+ resp->status = DLB_ST_DIR_QUEUES_UNAVAILABLE;
+ return -1;
+ }
+
+ return 0;
+}
+
+static void dlb_configure_ldb_queue(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_queue *queue,
+ struct dlb_create_ldb_queue_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ union dlb_sys_vf_ldb_vqid_v r0 = { {0} };
+ union dlb_sys_vf_ldb_vqid2qid r1 = { {0} };
+ union dlb_sys_ldb_qid2vqid r2 = { {0} };
+ union dlb_sys_ldb_vasqid_v r3 = { {0} };
+ union dlb_lsp_qid_ldb_infl_lim r4 = { {0} };
+ union dlb_lsp_qid_aqed_active_lim r5 = { {0} };
+ union dlb_aqed_pipe_fl_lim r6 = { {0} };
+ union dlb_aqed_pipe_fl_base r7 = { {0} };
+ union dlb_chp_ord_qid_sn_map r11 = { {0} };
+ union dlb_sys_ldb_qid_cfg_v r12 = { {0} };
+ union dlb_sys_ldb_qid_v r13 = { {0} };
+ union dlb_aqed_pipe_fl_push_ptr r14 = { {0} };
+ union dlb_aqed_pipe_fl_pop_ptr r15 = { {0} };
+ union dlb_aqed_pipe_qid_fid_lim r16 = { {0} };
+ union dlb_ro_pipe_qid2grpslt r17 = { {0} };
+ struct dlb_sn_group *sn_group;
+ unsigned int offs;
+
+ /* QID write permissions are turned on when the domain is started */
+ r3.field.vasqid_v = 0;
+
+ offs = domain->id.phys_id * DLB_MAX_NUM_LDB_QUEUES + queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_VASQID_V(offs), r3.val);
+
+ /* Unordered QIDs get 4K inflights, ordered get as many as the number
+ * of sequence numbers.
+ */
+ r4.field.limit = args->num_qid_inflights;
+
+ DLB_CSR_WR(hw, DLB_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), r4.val);
+
+ r5.field.limit = queue->aqed_freelist.bound -
+ queue->aqed_freelist.base;
+
+ if (r5.field.limit > DLB_MAX_NUM_AQOS_ENTRIES)
+ r5.field.limit = DLB_MAX_NUM_AQOS_ENTRIES;
+
+ /* AQOS */
+ DLB_CSR_WR(hw, DLB_LSP_QID_AQED_ACTIVE_LIM(queue->id.phys_id), r5.val);
+
+ r6.field.freelist_disable = 0;
+ r6.field.limit = queue->aqed_freelist.bound - 1;
+
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_FL_LIM(queue->id.phys_id), r6.val);
+
+ r7.field.base = queue->aqed_freelist.base;
+
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_FL_BASE(queue->id.phys_id), r7.val);
+
+ r14.field.push_ptr = r7.field.base;
+ r14.field.generation = 1;
+
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_FL_PUSH_PTR(queue->id.phys_id), r14.val);
+
+ r15.field.pop_ptr = r7.field.base;
+ r15.field.generation = 0;
+
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_FL_POP_PTR(queue->id.phys_id), r15.val);
+
+ /* Configure SNs */
+ sn_group = &hw->rsrcs.sn_groups[queue->sn_group];
+ r11.field.mode = sn_group->mode;
+ r11.field.slot = queue->sn_slot;
+ r11.field.grp = sn_group->id;
+
+ DLB_CSR_WR(hw, DLB_CHP_ORD_QID_SN_MAP(queue->id.phys_id), r11.val);
+
+ /* This register limits the number of inflight flows a queue can have
+ * at one time. It has an upper bound of 2048, but can be
+ * over-subscribed. 512 is chosen so that a single queue doesn't use
+ * the entire atomic storage, but can use a substantial portion if
+ * needed.
+ */
+ r16.field.qid_fid_limit = 512;
+
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_QID_FID_LIM(queue->id.phys_id), r16.val);
+
+ r17.field.group = sn_group->id;
+ r17.field.slot = queue->sn_slot;
+
+ DLB_CSR_WR(hw, DLB_RO_PIPE_QID2GRPSLT(queue->id.phys_id), r17.val);
+
+ r12.field.sn_cfg_v = (args->num_sequence_numbers != 0);
+ r12.field.fid_cfg_v = (args->num_atomic_inflights != 0);
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_QID_CFG_V(queue->id.phys_id), r12.val);
+
+ if (vf_request) {
+ unsigned int offs;
+
+ r0.field.vqid_v = 1;
+
+ offs = vf_id * DLB_MAX_NUM_LDB_QUEUES + queue->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_LDB_VQID_V(offs), r0.val);
+
+ r1.field.qid = queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_LDB_VQID2QID(offs), r1.val);
+
+ r2.field.vqid = queue->id.virt_id;
+
+ offs = vf_id * DLB_MAX_NUM_LDB_QUEUES + queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_QID2VQID(offs), r2.val);
+ }
+
+ r13.field.qid_v = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_QID_V(queue->id.phys_id), r13.val);
+}
+
+static void dlb_configure_dir_queue(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_dir_pq_pair *queue,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ union dlb_sys_dir_vasqid_v r0 = { {0} };
+ unsigned int offs;
+
+ /* QID write permissions are turned on when the domain is started */
+ r0.field.vasqid_v = 0;
+
+ offs = (domain->id.phys_id * DLB_MAX_NUM_DIR_PORTS) + queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_VASQID_V(offs), r0.val);
+
+ if (vf_request) {
+ union dlb_sys_vf_dir_vqid_v r1 = { {0} };
+ union dlb_sys_vf_dir_vqid2qid r2 = { {0} };
+
+ r1.field.vqid_v = 1;
+
+ offs = (vf_id * DLB_MAX_NUM_DIR_PORTS) + queue->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_DIR_VQID_V(offs), r1.val);
+
+ r2.field.qid = queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_DIR_VQID2QID(offs), r2.val);
+ } else {
+ union dlb_sys_dir_qid_v r3 = { {0} };
+
+ r3.field.qid_v = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_QID_V(queue->id.phys_id), r3.val);
+ }
+
+ queue->queue_configured = true;
+}
+
+static int
+dlb_verify_create_ldb_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_credit_pool *pool;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ if (dlb_list_empty(&domain->avail_ldb_ports)) {
+ resp->status = DLB_ST_LDB_PORTS_UNAVAILABLE;
+ return -1;
+ }
+
+ /* If the scheduling domain has no LDB queues, we configure the
+ * hardware to not supply the port with any LDB credits. In that
+ * case, ignore the LDB credit arguments.
+ */
+ if (!dlb_list_empty(&domain->used_ldb_queues) ||
+ !dlb_list_empty(&domain->avail_ldb_queues)) {
+ pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+
+ if (!pool || !pool->configured ||
+ pool->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_POOL_ID;
+ return -1;
+ }
+
+ if (args->ldb_credit_high_watermark > pool->avail_credits) {
+ resp->status = DLB_ST_LDB_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (args->ldb_credit_low_watermark >=
+ args->ldb_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK;
+ return -1;
+ }
+
+ if (args->ldb_credit_quantum >=
+ args->ldb_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_QUANTUM;
+ return -1;
+ }
+
+ if (args->ldb_credit_quantum > DLB_MAX_PORT_CREDIT_QUANTUM) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_QUANTUM;
+ return -1;
+ }
+ }
+
+ /* Likewise, if the scheduling domain has no DIR queues, we configure
+ * the hardware to not supply the port with any DIR credits. In that
+ * case, ignore the DIR credit arguments.
+ */
+ if (!dlb_list_empty(&domain->used_dir_pq_pairs) ||
+ !dlb_list_empty(&domain->avail_dir_pq_pairs)) {
+ pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+
+ if (!pool || !pool->configured ||
+ pool->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_POOL_ID;
+ return -1;
+ }
+
+ if (args->dir_credit_high_watermark > pool->avail_credits) {
+ resp->status = DLB_ST_DIR_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (args->dir_credit_low_watermark >=
+ args->dir_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK;
+ return -1;
+ }
+
+ if (args->dir_credit_quantum >=
+ args->dir_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_QUANTUM;
+ return -1;
+ }
+
+ if (args->dir_credit_quantum > DLB_MAX_PORT_CREDIT_QUANTUM) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_QUANTUM;
+ return -1;
+ }
+ }
+
+ /* Check cache-line alignment */
+ if ((pop_count_dma_base & 0x3F) != 0) {
+ resp->status = DLB_ST_INVALID_POP_COUNT_VIRT_ADDR;
+ return -1;
+ }
+
+ if ((cq_dma_base & 0x3F) != 0) {
+ resp->status = DLB_ST_INVALID_CQ_VIRT_ADDR;
+ return -1;
+ }
+
+ if (args->cq_depth != 1 &&
+ args->cq_depth != 2 &&
+ args->cq_depth != 4 &&
+ args->cq_depth != 8 &&
+ args->cq_depth != 16 &&
+ args->cq_depth != 32 &&
+ args->cq_depth != 64 &&
+ args->cq_depth != 128 &&
+ args->cq_depth != 256 &&
+ args->cq_depth != 512 &&
+ args->cq_depth != 1024) {
+ resp->status = DLB_ST_INVALID_CQ_DEPTH;
+ return -1;
+ }
+
+ /* The history list size must be >= 1 */
+ if (!args->cq_history_list_size) {
+ resp->status = DLB_ST_INVALID_HIST_LIST_DEPTH;
+ return -1;
+ }
+
+ if (args->cq_history_list_size > domain->avail_hist_list_entries) {
+ resp->status = DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE;
+ return -1;
+ }
+
+ return 0;
+}
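Two of the argument checks above are worth spelling out: the DMA bases must be 64-byte (cache-line) aligned, tested as `addr & 0x3F`, and the CQ depth must be one of the listed powers of two, so the explicit comparison chain is equivalent to a range-bounded power-of-two test. A compact form of both (illustrative only; the driver deliberately keeps the explicit list of legal depths):

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is aligned to a 64B cache line (low six bits clear). */
static bool cacheline_aligned(uint64_t addr)
{
	return (addr & 0x3F) == 0;
}

/* True if depth is a power of two in [min, max], e.g. [1, 1024]. */
static bool valid_cq_depth(uint32_t depth, uint32_t min, uint32_t max)
{
	return depth >= min && depth <= max && (depth & (depth - 1)) == 0;
}
```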
+
+static int
+dlb_verify_create_dir_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_credit_pool *pool;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ /* If the user supplies a queue ID, validate that the queue exists,
+ * belongs to this domain, and has been configured.
+ */
+ if (args->queue_id != -1) {
+ struct dlb_dir_pq_pair *queue;
+
+ queue = dlb_get_domain_used_dir_pq(args->queue_id,
+ vf_request,
+ domain);
+
+ if (!queue || queue->domain_id.phys_id != domain->id.phys_id ||
+ !queue->queue_configured) {
+ resp->status = DLB_ST_INVALID_DIR_QUEUE_ID;
+ return -1;
+ }
+ }
+
+ /* If the port's queue is not configured, validate that a free
+ * port-queue pair is available.
+ */
+ if (args->queue_id == -1 &&
+ dlb_list_empty(&domain->avail_dir_pq_pairs)) {
+ resp->status = DLB_ST_DIR_PORTS_UNAVAILABLE;
+ return -1;
+ }
+
+ /* If the scheduling domain has no LDB queues, we configure the
+ * hardware to not supply the port with any LDB credits. In that
+ * case, ignore the LDB credit arguments.
+ */
+ if (!dlb_list_empty(&domain->used_ldb_queues) ||
+ !dlb_list_empty(&domain->avail_ldb_queues)) {
+ pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+
+ if (!pool || !pool->configured ||
+ pool->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_POOL_ID;
+ return -1;
+ }
+
+ if (args->ldb_credit_high_watermark > pool->avail_credits) {
+ resp->status = DLB_ST_LDB_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (args->ldb_credit_low_watermark >=
+ args->ldb_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK;
+ return -1;
+ }
+
+ if (args->ldb_credit_quantum >=
+ args->ldb_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_QUANTUM;
+ return -1;
+ }
+
+ if (args->ldb_credit_quantum > DLB_MAX_PORT_CREDIT_QUANTUM) {
+ resp->status = DLB_ST_INVALID_LDB_CREDIT_QUANTUM;
+ return -1;
+ }
+ }
+
+ pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+
+ if (!pool || !pool->configured ||
+ pool->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_POOL_ID;
+ return -1;
+ }
+
+ if (args->dir_credit_high_watermark > pool->avail_credits) {
+ resp->status = DLB_ST_DIR_CREDITS_UNAVAILABLE;
+ return -1;
+ }
+
+ if (args->dir_credit_low_watermark >= args->dir_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK;
+ return -1;
+ }
+
+ if (args->dir_credit_quantum >= args->dir_credit_high_watermark) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_QUANTUM;
+ return -1;
+ }
+
+ if (args->dir_credit_quantum > DLB_MAX_PORT_CREDIT_QUANTUM) {
+ resp->status = DLB_ST_INVALID_DIR_CREDIT_QUANTUM;
+ return -1;
+ }
+
+ /* Check cache-line alignment */
+ if ((pop_count_dma_base & 0x3F) != 0) {
+ resp->status = DLB_ST_INVALID_POP_COUNT_VIRT_ADDR;
+ return -1;
+ }
+
+ if ((cq_dma_base & 0x3F) != 0) {
+ resp->status = DLB_ST_INVALID_CQ_VIRT_ADDR;
+ return -1;
+ }
+
+ if (args->cq_depth != 8 &&
+ args->cq_depth != 16 &&
+ args->cq_depth != 32 &&
+ args->cq_depth != 64 &&
+ args->cq_depth != 128 &&
+ args->cq_depth != 256 &&
+ args->cq_depth != 512 &&
+ args->cq_depth != 1024) {
+ resp->status = DLB_ST_INVALID_CQ_DEPTH;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int dlb_verify_start_domain_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ if (domain->started) {
+ resp->status = DLB_ST_DOMAIN_STARTED;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int dlb_verify_map_qid_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_map_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_ldb_port *port;
+ struct dlb_ldb_queue *queue;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+
+ if (!port || !port->configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ if (args->priority >= DLB_QID_PRIORITIES) {
+ resp->status = DLB_ST_INVALID_PRIORITY;
+ return -1;
+ }
+
+ queue = dlb_get_domain_ldb_queue(args->qid, vf_request, domain);
+
+ if (!queue || !queue->configured) {
+ resp->status = DLB_ST_INVALID_QID;
+ return -1;
+ }
+
+ if (queue->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_QID;
+ return -1;
+ }
+
+ if (port->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ return 0;
+}
+
+static bool dlb_port_find_slot(struct dlb_ldb_port *port,
+ enum dlb_qid_map_state state,
+ int *slot)
+{
+ int i;
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+ if (port->qid_map[i].state == state)
+ break;
+ }
+
+ *slot = i;
+
+ return (i < DLB_MAX_NUM_QIDS_PER_LDB_CQ);
+}
+
+static bool dlb_port_find_slot_queue(struct dlb_ldb_port *port,
+ enum dlb_qid_map_state state,
+ struct dlb_ldb_queue *queue,
+ int *slot)
+{
+ int i;
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+ if (port->qid_map[i].state == state &&
+ port->qid_map[i].qid == queue->id.phys_id)
+ break;
+ }
+
+ *slot = i;
+
+ return (i < DLB_MAX_NUM_QIDS_PER_LDB_CQ);
+}
+
+static bool
+dlb_port_find_slot_with_pending_map_queue(struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ int *slot)
+{
+ int i;
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+ struct dlb_ldb_port_qid_map *map = &port->qid_map[i];
+
+ if (map->state == DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP &&
+ map->pending_qid == queue->id.phys_id)
+ break;
+ }
+
+ *slot = i;
+
+ return (i < DLB_MAX_NUM_QIDS_PER_LDB_CQ);
+}
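The three finders above share one idiom: a linear scan of the port's fixed-size QID map, with the loop index written out through `*slot` and the boolean return indicating whether the index is in range. Callers must only use `*slot` on a true return, since it equals the array size on failure. A reduced version of the idiom (toy names and sizes):

```c
#include <stdbool.h>

#define MAP_SLOTS 8

struct slot_map { int state[MAP_SLOTS]; };

/* Find the first slot in 'state'; *slot is valid only on a true return. */
static bool find_slot(const struct slot_map *m, int state, int *slot)
{
	int i;

	for (i = 0; i < MAP_SLOTS; i++)
		if (m->state[i] == state)
			break;

	*slot = i;

	return i < MAP_SLOTS;
}
```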
+
+static int dlb_port_slot_state_transition(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ int slot,
+ enum dlb_qid_map_state new_state)
+{
+ enum dlb_qid_map_state curr_state = port->qid_map[slot].state;
+ struct dlb_domain *domain;
+
+ domain = dlb_get_domain_from_id(hw, port->domain_id.phys_id, false, 0);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: unable to find domain %d\n",
+ __func__, port->domain_id.phys_id);
+ return -EFAULT;
+ }
+
+ switch (curr_state) {
+ case DLB_QUEUE_UNMAPPED:
+ switch (new_state) {
+ case DLB_QUEUE_MAPPED:
+ queue->num_mappings++;
+ port->num_mappings++;
+ break;
+ case DLB_QUEUE_MAP_IN_PROGRESS:
+ queue->num_pending_additions++;
+ domain->num_pending_additions++;
+ break;
+ default:
+ goto error;
+ }
+ break;
+ case DLB_QUEUE_MAPPED:
+ switch (new_state) {
+ case DLB_QUEUE_UNMAPPED:
+ queue->num_mappings--;
+ port->num_mappings--;
+ break;
+ case DLB_QUEUE_UNMAP_IN_PROGRESS:
+ port->num_pending_removals++;
+ domain->num_pending_removals++;
+ break;
+ case DLB_QUEUE_MAPPED:
+ /* Priority change, nothing to update */
+ break;
+ default:
+ goto error;
+ }
+ break;
+ case DLB_QUEUE_MAP_IN_PROGRESS:
+ switch (new_state) {
+ case DLB_QUEUE_UNMAPPED:
+ queue->num_pending_additions--;
+ domain->num_pending_additions--;
+ break;
+ case DLB_QUEUE_MAPPED:
+ queue->num_mappings++;
+ port->num_mappings++;
+ queue->num_pending_additions--;
+ domain->num_pending_additions--;
+ break;
+ default:
+ goto error;
+ }
+ break;
+ case DLB_QUEUE_UNMAP_IN_PROGRESS:
+ switch (new_state) {
+ case DLB_QUEUE_UNMAPPED:
+ port->num_pending_removals--;
+ domain->num_pending_removals--;
+ queue->num_mappings--;
+ port->num_mappings--;
+ break;
+ case DLB_QUEUE_MAPPED:
+ port->num_pending_removals--;
+ domain->num_pending_removals--;
+ break;
+ case DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP:
+ /* Nothing to update */
+ break;
+ default:
+ goto error;
+ }
+ break;
+ case DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP:
+ switch (new_state) {
+ case DLB_QUEUE_UNMAP_IN_PROGRESS:
+ /* Nothing to update */
+ break;
+ case DLB_QUEUE_UNMAPPED:
+ /* An UNMAP_IN_PROGRESS_PENDING_MAP slot briefly
+ * becomes UNMAPPED before it transitions to
+ * MAP_IN_PROGRESS.
+ */
+ queue->num_mappings--;
+ port->num_mappings--;
+ port->num_pending_removals--;
+ domain->num_pending_removals--;
+ break;
+ default:
+ goto error;
+ }
+ break;
+ default:
+ goto error;
+ }
+
+ port->qid_map[slot].state = new_state;
+
+ DLB_HW_INFO(hw,
+ "[%s()] queue %d -> port %d state transition (%d -> %d)\n",
+ __func__, queue->id.phys_id, port->id.phys_id, curr_state,
+ new_state);
+ return 0;
+
+error:
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: invalid queue %d -> port %d state transition (%d -> %d)\n",
+ __func__, queue->id.phys_id, port->id.phys_id, curr_state,
+ new_state);
+ return -EFAULT;
+}
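The function above is effectively a state-transition table with side effects: each legal (current, new) pair adjusts the per-queue, per-port, and per-domain mapping counters, and anything outside the table is an internal error. The legality check alone can be modeled like this (counter bookkeeping elided; state names shortened from the `DLB_QUEUE_*` enum):

```c
#include <stdbool.h>

enum map_state {
	UNMAPPED,
	MAPPED,
	MAP_IN_PROGRESS,
	UNMAP_IN_PROGRESS,
	UNMAP_IN_PROGRESS_PENDING_MAP,
};

/* Mirror of the nested switch in dlb_port_slot_state_transition():
 * returns whether 'from' -> 'to' is a legal slot transition.
 */
static bool transition_ok(enum map_state from, enum map_state to)
{
	switch (from) {
	case UNMAPPED:
		return to == MAPPED || to == MAP_IN_PROGRESS;
	case MAPPED:
		/* MAPPED -> MAPPED is a priority change */
		return to == UNMAPPED || to == UNMAP_IN_PROGRESS ||
		       to == MAPPED;
	case MAP_IN_PROGRESS:
		return to == UNMAPPED || to == MAPPED;
	case UNMAP_IN_PROGRESS:
		return to == UNMAPPED || to == MAPPED ||
		       to == UNMAP_IN_PROGRESS_PENDING_MAP;
	case UNMAP_IN_PROGRESS_PENDING_MAP:
		return to == UNMAP_IN_PROGRESS || to == UNMAPPED;
	default:
		return false;
	}
}
```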
+
+static int dlb_verify_map_qid_slot_available(struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ struct dlb_cmd_response *resp)
+{
+ enum dlb_qid_map_state state;
+ int i;
+
+ /* Unused slot available? */
+ if (port->num_mappings < DLB_MAX_NUM_QIDS_PER_LDB_CQ)
+ return 0;
+
+ /* If the queue is already mapped (from the application's perspective),
+ * this is simply a priority update.
+ */
+ state = DLB_QUEUE_MAPPED;
+ if (dlb_port_find_slot_queue(port, state, queue, &i))
+ return 0;
+
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ if (dlb_port_find_slot_queue(port, state, queue, &i))
+ return 0;
+
+ if (dlb_port_find_slot_with_pending_map_queue(port, queue, &i))
+ return 0;
+
+ /* If the slot contains an unmap in progress, it's considered
+ * available.
+ */
+ state = DLB_QUEUE_UNMAP_IN_PROGRESS;
+ if (dlb_port_find_slot(port, state, &i))
+ return 0;
+
+ state = DLB_QUEUE_UNMAPPED;
+ if (dlb_port_find_slot(port, state, &i))
+ return 0;
+
+ resp->status = DLB_ST_NO_QID_SLOTS_AVAILABLE;
+ return -EINVAL;
+}
+
+static int dlb_verify_unmap_qid_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_unmap_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ enum dlb_qid_map_state state;
+ struct dlb_domain *domain;
+ struct dlb_ldb_port *port;
+ struct dlb_ldb_queue *queue;
+ int slot;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+
+ if (!port || !port->configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ if (port->domain_id.phys_id != domain->id.phys_id) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ queue = dlb_get_domain_ldb_queue(args->qid, vf_request, domain);
+
+ if (!queue || !queue->configured) {
+ DLB_HW_ERR(hw, "[%s()] Can't unmap unconfigured queue %d\n",
+ __func__, args->qid);
+ resp->status = DLB_ST_INVALID_QID;
+ return -1;
+ }
+
+ /* Verify that the port has the queue mapped. From the application's
+ * perspective a queue is mapped if it is actually mapped, the map is
+ * in progress, or the map is blocked pending an unmap.
+ */
+ state = DLB_QUEUE_MAPPED;
+ if (dlb_port_find_slot_queue(port, state, queue, &slot))
+ return 0;
+
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ if (dlb_port_find_slot_queue(port, state, queue, &slot))
+ return 0;
+
+ if (dlb_port_find_slot_with_pending_map_queue(port, queue, &slot))
+ return 0;
+
+ resp->status = DLB_ST_INVALID_QID;
+ return -1;
+}
+
+static int
+dlb_verify_enable_ldb_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_ldb_port *port;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+
+ if (!port || !port->configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+dlb_verify_enable_dir_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_dir_pq_pair *port;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_dir_pq(id, vf_request, domain);
+
+ if (!port || !port->port_configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+dlb_verify_disable_ldb_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_ldb_port *port;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+
+ if (!port || !port->configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+dlb_verify_disable_dir_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_dir_pq_pair *port;
+ int id;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -1;
+ }
+
+ if (!domain->configured) {
+ resp->status = DLB_ST_DOMAIN_NOT_CONFIGURED;
+ return -1;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_dir_pq(id, vf_request, domain);
+
+ if (!port || !port->port_configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -1;
+ }
+
+ return 0;
+}
+
+static int
+dlb_domain_attach_resources(struct dlb_hw *hw,
+ struct dlb_function_resources *rsrcs,
+ struct dlb_domain *domain,
+ struct dlb_create_sched_domain_args *args,
+ struct dlb_cmd_response *resp)
+{
+ int ret;
+
+ ret = dlb_attach_ldb_queues(hw,
+ rsrcs,
+ domain,
+ args->num_ldb_queues,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_ldb_ports(hw,
+ rsrcs,
+ domain,
+ args->num_ldb_ports,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_dir_ports(hw,
+ rsrcs,
+ domain,
+ args->num_dir_ports,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_ldb_credits(rsrcs,
+ domain,
+ args->num_ldb_credits,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_dir_credits(rsrcs,
+ domain,
+ args->num_dir_credits,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_ldb_credit_pools(hw,
+ rsrcs,
+ domain,
+ args->num_ldb_credit_pools,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_dir_credit_pools(hw,
+ rsrcs,
+ domain,
+ args->num_dir_credit_pools,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_domain_hist_list_entries(rsrcs,
+ domain,
+ args->num_hist_list_entries,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_attach_atomic_inflights(rsrcs,
+ domain,
+ args->num_atomic_inflights,
+ resp);
+ if (ret < 0)
+ return ret;
+
+ domain->configured = true;
+
+ domain->started = false;
+
+ rsrcs->num_avail_domains--;
+
+ return 0;
+}
+
+static int
+dlb_ldb_queue_attach_to_sn_group(struct dlb_hw *hw,
+ struct dlb_ldb_queue *queue,
+ struct dlb_create_ldb_queue_args *args)
+{
+ int slot = -1;
+ int i;
+
+ queue->sn_cfg_valid = false;
+
+ if (args->num_sequence_numbers == 0)
+ return 0;
+
+ for (i = 0; i < DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS; i++) {
+ struct dlb_sn_group *group = &hw->rsrcs.sn_groups[i];
+
+ if (group->sequence_numbers_per_queue ==
+ args->num_sequence_numbers &&
+ !dlb_sn_group_full(group)) {
+ slot = dlb_sn_group_alloc_slot(group);
+ if (slot >= 0)
+ break;
+ }
+ }
+
+ if (slot == -1) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no sequence number slots available\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ queue->sn_cfg_valid = true;
+ queue->sn_group = i;
+ queue->sn_slot = slot;
+ return 0;
+}
+
+static int
+dlb_ldb_queue_attach_resources(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_queue *queue,
+ struct dlb_create_ldb_queue_args *args)
+{
+ int ret;
+
+ ret = dlb_ldb_queue_attach_to_sn_group(hw, queue, args);
+ if (ret)
+ return ret;
+
+ /* Attach QID inflights */
+ queue->num_qid_inflights = args->num_qid_inflights;
+
+ /* Attach atomic inflights */
+ queue->aqed_freelist.base = domain->aqed_freelist.base +
+ domain->aqed_freelist.offset;
+ queue->aqed_freelist.bound = queue->aqed_freelist.base +
+ args->num_atomic_inflights;
+ domain->aqed_freelist.offset += args->num_atomic_inflights;
+
+ return 0;
+}
+
+static void dlb_ldb_port_cq_enable(struct dlb_hw *hw,
+ struct dlb_ldb_port *port)
+{
+ union dlb_lsp_cq_ldb_dsbl reg;
+
+ /* Don't re-enable the port if a removal is pending. The caller should
+ * mark this port as enabled (if it isn't already), and when the
+ * removal completes the port will be enabled.
+ */
+ if (port->num_pending_removals)
+ return;
+
+ reg.field.disabled = 0;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ_LDB_DSBL(port->id.phys_id), reg.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_ldb_port_cq_disable(struct dlb_hw *hw,
+ struct dlb_ldb_port *port)
+{
+ union dlb_lsp_cq_ldb_dsbl reg;
+
+ reg.field.disabled = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ_LDB_DSBL(port->id.phys_id), reg.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_dir_port_cq_enable(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *port)
+{
+ union dlb_lsp_cq_dir_dsbl reg;
+
+ reg.field.disabled = 0;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ_DIR_DSBL(port->id.phys_id), reg.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_dir_port_cq_disable(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *port)
+{
+ union dlb_lsp_cq_dir_dsbl reg;
+
+ reg.field.disabled = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ_DIR_DSBL(port->id.phys_id), reg.val);
+
+ dlb_flush_csr(hw);
+}
+
+static int dlb_ldb_port_configure_pp(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port,
+ struct dlb_create_ldb_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ union dlb_sys_ldb_pp2ldbpool r0 = { {0} };
+ union dlb_sys_ldb_pp2dirpool r1 = { {0} };
+ union dlb_sys_ldb_pp2vf_pf r2 = { {0} };
+ union dlb_sys_ldb_pp2vas r3 = { {0} };
+ union dlb_sys_ldb_pp_v r4 = { {0} };
+ union dlb_sys_ldb_pp2vpp r5 = { {0} };
+ union dlb_chp_ldb_pp_ldb_crd_hwm r6 = { {0} };
+ union dlb_chp_ldb_pp_dir_crd_hwm r7 = { {0} };
+ union dlb_chp_ldb_pp_ldb_crd_lwm r8 = { {0} };
+ union dlb_chp_ldb_pp_dir_crd_lwm r9 = { {0} };
+ union dlb_chp_ldb_pp_ldb_min_crd_qnt r10 = { {0} };
+ union dlb_chp_ldb_pp_dir_min_crd_qnt r11 = { {0} };
+ union dlb_chp_ldb_pp_ldb_crd_cnt r12 = { {0} };
+ union dlb_chp_ldb_pp_dir_crd_cnt r13 = { {0} };
+ union dlb_chp_ldb_ldb_pp2pool r14 = { {0} };
+ union dlb_chp_ldb_dir_pp2pool r15 = { {0} };
+ union dlb_chp_ldb_pp_crd_req_state r16 = { {0} };
+ union dlb_chp_ldb_pp_ldb_push_ptr r17 = { {0} };
+ union dlb_chp_ldb_pp_dir_push_ptr r18 = { {0} };
+
+ struct dlb_credit_pool *ldb_pool = NULL;
+ struct dlb_credit_pool *dir_pool = NULL;
+ unsigned int offs;
+
+ if (port->ldb_pool_used) {
+ ldb_pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+ if (!ldb_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+ }
+
+ if (port->dir_pool_used) {
+ dir_pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+ if (!dir_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+ }
+
+ r0.field.ldbpool = (port->ldb_pool_used) ? ldb_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP2LDBPOOL(port->id.phys_id), r0.val);
+
+ r1.field.dirpool = (port->dir_pool_used) ? dir_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP2DIRPOOL(port->id.phys_id), r1.val);
+
+ r2.field.vf = vf_id;
+ r2.field.is_pf = !vf_request;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP2VF_PF(port->id.phys_id), r2.val);
+
+ r3.field.vas = domain->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP2VAS(port->id.phys_id), r3.val);
+
+ r5.field.vpp = port->id.virt_id;
+
+ offs = (vf_id * DLB_MAX_NUM_LDB_PORTS) + port->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP2VPP(offs), r5.val);
+
+ r6.field.hwm = args->ldb_credit_high_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_LDB_CRD_HWM(port->id.phys_id), r6.val);
+
+ r7.field.hwm = args->dir_credit_high_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_DIR_CRD_HWM(port->id.phys_id), r7.val);
+
+ r8.field.lwm = args->ldb_credit_low_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_LDB_CRD_LWM(port->id.phys_id), r8.val);
+
+ r9.field.lwm = args->dir_credit_low_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_DIR_CRD_LWM(port->id.phys_id), r9.val);
+
+ r10.field.quanta = args->ldb_credit_quantum;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_LDB_MIN_CRD_QNT(port->id.phys_id),
+ r10.val);
+
+ r11.field.quanta = args->dir_credit_quantum;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_DIR_MIN_CRD_QNT(port->id.phys_id),
+ r11.val);
+
+ r12.field.count = args->ldb_credit_high_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_LDB_CRD_CNT(port->id.phys_id), r12.val);
+
+ r13.field.count = args->dir_credit_high_watermark;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_DIR_CRD_CNT(port->id.phys_id), r13.val);
+
+ r14.field.pool = (port->ldb_pool_used) ? ldb_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_LDB_PP2POOL(port->id.phys_id), r14.val);
+
+ r15.field.pool = (port->dir_pool_used) ? dir_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_DIR_PP2POOL(port->id.phys_id), r15.val);
+
+ r16.field.no_pp_credit_update = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_CRD_REQ_STATE(port->id.phys_id), r16.val);
+
+ r17.field.push_pointer = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_LDB_PUSH_PTR(port->id.phys_id), r17.val);
+
+ r18.field.push_pointer = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_PP_DIR_PUSH_PTR(port->id.phys_id), r18.val);
+
+ if (vf_request) {
+ union dlb_sys_vf_ldb_vpp2pp r19 = { {0} };
+ union dlb_sys_vf_ldb_vpp_v r20 = { {0} };
+
+ r19.field.pp = port->id.phys_id;
+
+ offs = vf_id * DLB_MAX_NUM_LDB_PORTS + port->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_LDB_VPP2PP(offs), r19.val);
+
+ r20.field.vpp_v = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_LDB_VPP_V(offs), r20.val);
+ }
+
+ r4.field.pp_v = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP_V(port->id.phys_id),
+ r4.val);
+
+ return 0;
+}
+
+static int dlb_ldb_port_configure_cq(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_ldb_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ int i;
+
+ union dlb_sys_ldb_cq_addr_l r0 = { {0} };
+ union dlb_sys_ldb_cq_addr_u r1 = { {0} };
+ union dlb_sys_ldb_cq2vf_pf r2 = { {0} };
+ union dlb_chp_ldb_cq_tkn_depth_sel r3 = { {0} };
+ union dlb_chp_hist_list_lim r4 = { {0} };
+ union dlb_chp_hist_list_base r5 = { {0} };
+ union dlb_lsp_cq_ldb_infl_lim r6 = { {0} };
+ union dlb_lsp_cq2priov r7 = { {0} };
+ union dlb_chp_hist_list_push_ptr r8 = { {0} };
+ union dlb_chp_hist_list_pop_ptr r9 = { {0} };
+ union dlb_lsp_cq_ldb_tkn_depth_sel r10 = { {0} };
+ union dlb_sys_ldb_pp_addr_l r11 = { {0} };
+ union dlb_sys_ldb_pp_addr_u r12 = { {0} };
+
+ /* The CQ address is 64B-aligned, and the DLB only wants bits [63:6] */
+ r0.field.addr_l = cq_dma_base >> 6;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_ADDR_L(port->id.phys_id),
+ r0.val);
+
+ r1.field.addr_u = cq_dma_base >> 32;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_ADDR_U(port->id.phys_id),
+ r1.val);
+
+ r2.field.vf = vf_id;
+ r2.field.is_pf = !vf_request;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ2VF_PF(port->id.phys_id),
+ r2.val);
+
+ if (args->cq_depth <= 8) {
+ r3.field.token_depth_select = 1;
+ } else if (args->cq_depth == 16) {
+ r3.field.token_depth_select = 2;
+ } else if (args->cq_depth == 32) {
+ r3.field.token_depth_select = 3;
+ } else if (args->cq_depth == 64) {
+ r3.field.token_depth_select = 4;
+ } else if (args->cq_depth == 128) {
+ r3.field.token_depth_select = 5;
+ } else if (args->cq_depth == 256) {
+ r3.field.token_depth_select = 6;
+ } else if (args->cq_depth == 512) {
+ r3.field.token_depth_select = 7;
+ } else if (args->cq_depth == 1024) {
+ r3.field.token_depth_select = 8;
+ } else {
+ DLB_HW_ERR(hw, "[%s():%d] Internal error: invalid CQ depth\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+ r3.val);
+
+ r10.field.token_depth_select = r3.field.token_depth_select;
+ r10.field.ignore_depth = 0;
+ /* TDT algorithm: DLB must be able to write CQs with depth < 4 */
+ r10.field.enab_shallow_cq = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_TKN_DEPTH_SEL(port->id.phys_id),
+ r10.val);
+
+ /* To support CQs with depth less than 8, program the token count
+ * register with a non-zero initial value. Operations such as domain
+ * reset must take this initial value into account when quiescing the
+ * CQ.
+ */
+ port->init_tkn_cnt = 0;
+
+ if (args->cq_depth < 8) {
+ union dlb_lsp_cq_ldb_tkn_cnt r13 = { {0} };
+
+ port->init_tkn_cnt = 8 - args->cq_depth;
+
+ r13.field.token_count = port->init_tkn_cnt;
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_TKN_CNT(port->id.phys_id),
+ r13.val);
+ }
+
+ r4.field.limit = port->hist_list_entry_limit - 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_HIST_LIST_LIM(port->id.phys_id), r4.val);
+
+ r5.field.base = port->hist_list_entry_base;
+
+ DLB_CSR_WR(hw, DLB_CHP_HIST_LIST_BASE(port->id.phys_id), r5.val);
+
+ r8.field.push_ptr = r5.field.base;
+ r8.field.generation = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_HIST_LIST_PUSH_PTR(port->id.phys_id), r8.val);
+
+ r9.field.pop_ptr = r5.field.base;
+ r9.field.generation = 0;
+
+ DLB_CSR_WR(hw, DLB_CHP_HIST_LIST_POP_PTR(port->id.phys_id), r9.val);
+
+ /* The inflight limit sets a cap on the number of QEs for which this CQ
+ * can owe completions at one time.
+ */
+ r6.field.limit = args->cq_history_list_size;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ_LDB_INFL_LIM(port->id.phys_id), r6.val);
+
+ /* Disable the port's QID mappings */
+ r7.field.v = 0;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ2PRIOV(port->id.phys_id), r7.val);
+
+ /* Two cache lines (128B) are dedicated for the port's pop counts */
+ r11.field.addr_l = pop_count_dma_base >> 7;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP_ADDR_L(port->id.phys_id), r11.val);
+
+ r12.field.addr_u = pop_count_dma_base >> 32;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_PP_ADDR_U(port->id.phys_id), r12.val);
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++)
+ port->qid_map[i].state = DLB_QUEUE_UNMAPPED;
+
+ return 0;
+}
+
+static void dlb_update_ldb_arb_threshold(struct dlb_hw *hw)
+{
+ union dlb_lsp_ctrl_config_0 r0 = { {0} };
+
+ /* From the hardware spec:
+ * "The optimal value for ldb_arb_threshold is in the region of {8 *
+ * #CQs}. It is expected therefore that the PF will change this value
+ * dynamically as the number of active ports changes."
+ */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CTRL_CONFIG_0);
+
+ r0.field.ldb_arb_threshold = hw->pf.num_enabled_ldb_ports * 8;
+ r0.field.ldb_arb_ignore_empty = 1;
+ r0.field.ldb_arb_mode = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_CTRL_CONFIG_0, r0.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_ldb_pool_update_credit_count(struct dlb_hw *hw,
+ u32 pool_id,
+ u32 count)
+{
+ hw->rsrcs.ldb_credit_pools[pool_id].avail_credits -= count;
+}
+
+static void dlb_dir_pool_update_credit_count(struct dlb_hw *hw,
+ u32 pool_id,
+ u32 count)
+{
+ hw->rsrcs.dir_credit_pools[pool_id].avail_credits -= count;
+}
+
+static void dlb_ldb_pool_write_credit_count_reg(struct dlb_hw *hw,
+ u32 pool_id)
+{
+ union dlb_chp_ldb_pool_crd_cnt r0 = { {0} };
+ struct dlb_credit_pool *pool;
+
+ pool = &hw->rsrcs.ldb_credit_pools[pool_id];
+
+ r0.field.count = pool->avail_credits;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_POOL_CRD_CNT(pool->id.phys_id),
+ r0.val);
+}
+
+static void dlb_dir_pool_write_credit_count_reg(struct dlb_hw *hw,
+ u32 pool_id)
+{
+ union dlb_chp_dir_pool_crd_cnt r0 = { {0} };
+ struct dlb_credit_pool *pool;
+
+ pool = &hw->rsrcs.dir_credit_pools[pool_id];
+
+ r0.field.count = pool->avail_credits;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_POOL_CRD_CNT(pool->id.phys_id),
+ r0.val);
+}
+
+static int dlb_configure_ldb_port(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_ldb_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_credit_pool *ldb_pool, *dir_pool;
+ int ret;
+
+ port->hist_list_entry_base = domain->hist_list_entry_base +
+ domain->hist_list_entry_offset;
+ port->hist_list_entry_limit = port->hist_list_entry_base +
+ args->cq_history_list_size;
+
+ domain->hist_list_entry_offset += args->cq_history_list_size;
+ domain->avail_hist_list_entries -= args->cq_history_list_size;
+
+ port->ldb_pool_used = !dlb_list_empty(&domain->used_ldb_queues) ||
+ !dlb_list_empty(&domain->avail_ldb_queues);
+ port->dir_pool_used = !dlb_list_empty(&domain->used_dir_pq_pairs) ||
+ !dlb_list_empty(&domain->avail_dir_pq_pairs);
+
+ if (port->ldb_pool_used) {
+ u32 cnt = args->ldb_credit_high_watermark;
+
+ ldb_pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+ if (!ldb_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ dlb_ldb_pool_update_credit_count(hw, ldb_pool->id.phys_id, cnt);
+ } else {
+ args->ldb_credit_high_watermark = 0;
+ args->ldb_credit_low_watermark = 0;
+ args->ldb_credit_quantum = 0;
+ }
+
+ if (port->dir_pool_used) {
+ u32 cnt = args->dir_credit_high_watermark;
+
+ dir_pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+ if (!dir_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ dlb_dir_pool_update_credit_count(hw, dir_pool->id.phys_id, cnt);
+ } else {
+ args->dir_credit_high_watermark = 0;
+ args->dir_credit_low_watermark = 0;
+ args->dir_credit_quantum = 0;
+ }
+
+ ret = dlb_ldb_port_configure_cq(hw,
+ port,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_ldb_port_configure_pp(hw,
+ domain,
+ port,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ dlb_ldb_port_cq_enable(hw, port);
+
+ port->num_mappings = 0;
+
+ port->enabled = true;
+
+ hw->pf.num_enabled_ldb_ports++;
+
+ dlb_update_ldb_arb_threshold(hw);
+
+ port->configured = true;
+
+ return 0;
+}
+
+static int dlb_dir_port_configure_pp(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_dir_pq_pair *port,
+ struct dlb_create_dir_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ union dlb_sys_dir_pp2ldbpool r0 = { {0} };
+ union dlb_sys_dir_pp2dirpool r1 = { {0} };
+ union dlb_sys_dir_pp2vf_pf r2 = { {0} };
+ union dlb_sys_dir_pp2vas r3 = { {0} };
+ union dlb_sys_dir_pp_v r4 = { {0} };
+ union dlb_sys_dir_pp2vpp r5 = { {0} };
+ union dlb_chp_dir_pp_ldb_crd_hwm r6 = { {0} };
+ union dlb_chp_dir_pp_dir_crd_hwm r7 = { {0} };
+ union dlb_chp_dir_pp_ldb_crd_lwm r8 = { {0} };
+ union dlb_chp_dir_pp_dir_crd_lwm r9 = { {0} };
+ union dlb_chp_dir_pp_ldb_min_crd_qnt r10 = { {0} };
+ union dlb_chp_dir_pp_dir_min_crd_qnt r11 = { {0} };
+ union dlb_chp_dir_pp_ldb_crd_cnt r12 = { {0} };
+ union dlb_chp_dir_pp_dir_crd_cnt r13 = { {0} };
+ union dlb_chp_dir_ldb_pp2pool r14 = { {0} };
+ union dlb_chp_dir_dir_pp2pool r15 = { {0} };
+ union dlb_chp_dir_pp_crd_req_state r16 = { {0} };
+ union dlb_chp_dir_pp_ldb_push_ptr r17 = { {0} };
+ union dlb_chp_dir_pp_dir_push_ptr r18 = { {0} };
+
+ struct dlb_credit_pool *ldb_pool = NULL;
+ struct dlb_credit_pool *dir_pool = NULL;
+
+ if (port->ldb_pool_used) {
+ ldb_pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+ if (!ldb_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+ }
+
+ if (port->dir_pool_used) {
+ dir_pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+ if (!dir_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+ }
+
+ r0.field.ldbpool = (port->ldb_pool_used) ? ldb_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2LDBPOOL(port->id.phys_id),
+ r0.val);
+
+ r1.field.dirpool = (port->dir_pool_used) ? dir_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2DIRPOOL(port->id.phys_id),
+ r1.val);
+
+ r2.field.vf = vf_id;
+ r2.field.is_pf = !vf_request;
+ r2.field.is_hw_dsi = 0;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2VF_PF(port->id.phys_id),
+ r2.val);
+
+ r3.field.vas = domain->id.phys_id;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2VAS(port->id.phys_id),
+ r3.val);
+
+ r5.field.vpp = port->id.virt_id;
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2VPP((vf_id * DLB_MAX_NUM_DIR_PORTS) +
+ port->id.phys_id),
+ r5.val);
+
+ r6.field.hwm = args->ldb_credit_high_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_CRD_HWM(port->id.phys_id),
+ r6.val);
+
+ r7.field.hwm = args->dir_credit_high_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_CRD_HWM(port->id.phys_id),
+ r7.val);
+
+ r8.field.lwm = args->ldb_credit_low_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_CRD_LWM(port->id.phys_id),
+ r8.val);
+
+ r9.field.lwm = args->dir_credit_low_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_CRD_LWM(port->id.phys_id),
+ r9.val);
+
+ r10.field.quanta = args->ldb_credit_quantum;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_MIN_CRD_QNT(port->id.phys_id),
+ r10.val);
+
+ r11.field.quanta = args->dir_credit_quantum;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_MIN_CRD_QNT(port->id.phys_id),
+ r11.val);
+
+ r12.field.count = args->ldb_credit_high_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_CRD_CNT(port->id.phys_id),
+ r12.val);
+
+ r13.field.count = args->dir_credit_high_watermark;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_CRD_CNT(port->id.phys_id),
+ r13.val);
+
+ r14.field.pool = (port->ldb_pool_used) ? ldb_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_LDB_PP2POOL(port->id.phys_id),
+ r14.val);
+
+ r15.field.pool = (port->dir_pool_used) ? dir_pool->id.phys_id : 0;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_DIR_PP2POOL(port->id.phys_id),
+ r15.val);
+
+ r16.field.no_pp_credit_update = 0;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_CRD_REQ_STATE(port->id.phys_id),
+ r16.val);
+
+ r17.field.push_pointer = 0;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_PUSH_PTR(port->id.phys_id),
+ r17.val);
+
+ r18.field.push_pointer = 0;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_PUSH_PTR(port->id.phys_id),
+ r18.val);
+
+ if (vf_request) {
+ union dlb_sys_vf_dir_vpp2pp r19 = { {0} };
+ union dlb_sys_vf_dir_vpp_v r20 = { {0} };
+ unsigned int offs;
+
+ r19.field.pp = port->id.phys_id;
+
+ offs = vf_id * DLB_MAX_NUM_DIR_PORTS + port->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_DIR_VPP2PP(offs), r19.val);
+
+ r20.field.vpp_v = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_DIR_VPP_V(offs), r20.val);
+ }
+
+ r4.field.pp_v = 1;
+ r4.field.mb_dm = 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_PP_V(port->id.phys_id), r4.val);
+
+ return 0;
+}
+
+static int dlb_dir_port_configure_cq(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *port,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_dir_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ union dlb_sys_dir_cq_addr_l r0 = { {0} };
+ union dlb_sys_dir_cq_addr_u r1 = { {0} };
+ union dlb_sys_dir_cq2vf_pf r2 = { {0} };
+ union dlb_chp_dir_cq_tkn_depth_sel r3 = { {0} };
+ union dlb_lsp_cq_dir_tkn_depth_sel_dsi r4 = { {0} };
+ union dlb_sys_dir_pp_addr_l r5 = { {0} };
+ union dlb_sys_dir_pp_addr_u r6 = { {0} };
+
+ /* The CQ address is 64B-aligned, and the DLB only wants bits [63:6] */
+ r0.field.addr_l = cq_dma_base >> 6;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_CQ_ADDR_L(port->id.phys_id), r0.val);
+
+ r1.field.addr_u = cq_dma_base >> 32;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_CQ_ADDR_U(port->id.phys_id), r1.val);
+
+ r2.field.vf = vf_id;
+ r2.field.is_pf = !vf_request;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_CQ2VF_PF(port->id.phys_id), r2.val);
+
+ if (args->cq_depth == 8) {
+ r3.field.token_depth_select = 1;
+ } else if (args->cq_depth == 16) {
+ r3.field.token_depth_select = 2;
+ } else if (args->cq_depth == 32) {
+ r3.field.token_depth_select = 3;
+ } else if (args->cq_depth == 64) {
+ r3.field.token_depth_select = 4;
+ } else if (args->cq_depth == 128) {
+ r3.field.token_depth_select = 5;
+ } else if (args->cq_depth == 256) {
+ r3.field.token_depth_select = 6;
+ } else if (args->cq_depth == 512) {
+ r3.field.token_depth_select = 7;
+ } else if (args->cq_depth == 1024) {
+ r3.field.token_depth_select = 8;
+ } else {
+ DLB_HW_ERR(hw, "[%s():%d] Internal error: invalid CQ depth\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+ r3.val);
+
+ r4.field.token_depth_select = r3.field.token_depth_select;
+ r4.field.disable_wb_opt = 0;
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(port->id.phys_id),
+ r4.val);
+
+ /* Two cache lines (128B) are dedicated for the port's pop counts */
+ r5.field.addr_l = pop_count_dma_base >> 7;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_PP_ADDR_L(port->id.phys_id), r5.val);
+
+ r6.field.addr_u = pop_count_dma_base >> 32;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_PP_ADDR_U(port->id.phys_id), r6.val);
+
+ return 0;
+}
+
+static int dlb_configure_dir_port(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_dir_pq_pair *port,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_dir_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_credit_pool *ldb_pool, *dir_pool;
+ int ret;
+
+ port->ldb_pool_used = !dlb_list_empty(&domain->used_ldb_queues) ||
+ !dlb_list_empty(&domain->avail_ldb_queues);
+
+ /* Each directed port has a directed queue, hence this port requires
+ * directed credits.
+ */
+ port->dir_pool_used = true;
+
+ if (port->ldb_pool_used) {
+ u32 cnt = args->ldb_credit_high_watermark;
+
+ ldb_pool = dlb_get_domain_ldb_pool(args->ldb_credit_pool_id,
+ vf_request,
+ domain);
+ if (!ldb_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ dlb_ldb_pool_update_credit_count(hw, ldb_pool->id.phys_id, cnt);
+ } else {
+ args->ldb_credit_high_watermark = 0;
+ args->ldb_credit_low_watermark = 0;
+ args->ldb_credit_quantum = 0;
+ }
+
+ dir_pool = dlb_get_domain_dir_pool(args->dir_credit_pool_id,
+ vf_request,
+ domain);
+ if (!dir_pool) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: port validation failed\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ dlb_dir_pool_update_credit_count(hw,
+ dir_pool->id.phys_id,
+ args->dir_credit_high_watermark);
+
+ ret = dlb_dir_port_configure_cq(hw,
+ port,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_dir_port_configure_pp(hw,
+ domain,
+ port,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ dlb_dir_port_cq_enable(hw, port);
+
+ port->enabled = true;
+
+ port->port_configured = true;
+
+ return 0;
+}
+
+static int dlb_ldb_port_map_qid_static(struct dlb_hw *hw,
+ struct dlb_ldb_port *p,
+ struct dlb_ldb_queue *q,
+ u8 priority)
+{
+ union dlb_lsp_cq2priov r0;
+ union dlb_lsp_cq2qid r1;
+ union dlb_atm_pipe_qid_ldb_qid2cqidx r2;
+ union dlb_lsp_qid_ldb_qid2cqidx r3;
+ union dlb_lsp_qid_ldb_qid2cqidx2 r4;
+ enum dlb_qid_map_state state;
+ int i;
+
+ /* Look for a pending or already mapped slot, else an unused slot */
+ if (!dlb_port_find_slot_queue(p, DLB_QUEUE_MAP_IN_PROGRESS, q, &i) &&
+ !dlb_port_find_slot_queue(p, DLB_QUEUE_MAPPED, q, &i) &&
+ !dlb_port_find_slot(p, DLB_QUEUE_UNMAPPED, &i)) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: CQ has no available QID mapping slots\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Read-modify-write the priority and valid bit register */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ2PRIOV(p->id.phys_id));
+
+ r0.field.v |= 1 << i;
+ r0.field.prio |= (priority & 0x7) << i * 3;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ2PRIOV(p->id.phys_id), r0.val);
+
+ /* Read-modify-write the QID map register */
+ r1.val = DLB_CSR_RD(hw, DLB_LSP_CQ2QID(p->id.phys_id, i / 4));
+
+ if (i == 0 || i == 4)
+ r1.field.qid_p0 = q->id.phys_id;
+ if (i == 1 || i == 5)
+ r1.field.qid_p1 = q->id.phys_id;
+ if (i == 2 || i == 6)
+ r1.field.qid_p2 = q->id.phys_id;
+ if (i == 3 || i == 7)
+ r1.field.qid_p3 = q->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ2QID(p->id.phys_id, i / 4), r1.val);
+
+ r2.val = DLB_CSR_RD(hw,
+ DLB_ATM_PIPE_QID_LDB_QID2CQIDX(q->id.phys_id,
+ p->id.phys_id / 4));
+
+ r3.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX(q->id.phys_id,
+ p->id.phys_id / 4));
+
+ r4.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX2(q->id.phys_id,
+ p->id.phys_id / 4));
+
+ switch (p->id.phys_id % 4) {
+ case 0:
+ r2.field.cq_p0 |= 1 << i;
+ r3.field.cq_p0 |= 1 << i;
+ r4.field.cq_p0 |= 1 << i;
+ break;
+
+ case 1:
+ r2.field.cq_p1 |= 1 << i;
+ r3.field.cq_p1 |= 1 << i;
+ r4.field.cq_p1 |= 1 << i;
+ break;
+
+ case 2:
+ r2.field.cq_p2 |= 1 << i;
+ r3.field.cq_p2 |= 1 << i;
+ r4.field.cq_p2 |= 1 << i;
+ break;
+
+ case 3:
+ r2.field.cq_p3 |= 1 << i;
+ r3.field.cq_p3 |= 1 << i;
+ r4.field.cq_p3 |= 1 << i;
+ break;
+ }
+
+ DLB_CSR_WR(hw,
+ DLB_ATM_PIPE_QID_LDB_QID2CQIDX(q->id.phys_id,
+ p->id.phys_id / 4),
+ r2.val);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX(q->id.phys_id,
+ p->id.phys_id / 4),
+ r3.val);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX2(q->id.phys_id,
+ p->id.phys_id / 4),
+ r4.val);
+
+ dlb_flush_csr(hw);
+
+ p->qid_map[i].qid = q->id.phys_id;
+ p->qid_map[i].priority = priority;
+
+ state = DLB_QUEUE_MAPPED;
+
+ return dlb_port_slot_state_transition(hw, p, q, i, state);
+}
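The bit packing used above for the CQ2PRIOV register can be illustrated with a minimal standalone model: one valid bit per QID slot and a 3-bit priority lane per slot, shifted by `slot * 3`. This is a sketch only, not the driver's actual register layout, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the CQ2PRIOV fields written by
 * dlb_ldb_port_map_qid_static(): 'v' holds one valid bit per slot,
 * 'prio' holds a 3-bit priority lane per slot. */
struct cq2priov_model {
	uint32_t v;    /* slot valid bits, one per QID slot */
	uint32_t prio; /* 3 bits of priority per slot */
};

static void model_map_slot(struct cq2priov_model *r, int slot, uint8_t priority)
{
	r->v |= 1u << slot;                                  /* mark slot valid */
	r->prio |= (uint32_t)(priority & 0x7) << (slot * 3); /* slot's 3-bit lane */
}
```

The same read-modify-write pattern (OR in the new slot's bits without disturbing the others) is what the `r0.field.v |=` and `r0.field.prio |=` lines perform against the live register value.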
+
+static void dlb_ldb_port_change_qid_priority(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ int slot,
+ struct dlb_map_qid_args *args)
+{
+ union dlb_lsp_cq2priov r0;
+
+ /* Read-modify-write the priority and valid bit register */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ2PRIOV(port->id.phys_id));
+
+ r0.field.v |= 1 << slot;
+ r0.field.prio |= (args->priority & 0x7) << slot * 3;
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ2PRIOV(port->id.phys_id), r0.val);
+
+ dlb_flush_csr(hw);
+
+ port->qid_map[slot].priority = args->priority;
+}
+
+static int dlb_ldb_port_set_has_work_bits(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ int slot)
+{
+ union dlb_lsp_qid_aqed_active_cnt r0;
+ union dlb_lsp_qid_ldb_enqueue_cnt r1;
+ union dlb_lsp_ldb_sched_ctrl r2 = { {0} };
+
+ /* Set the atomic scheduling haswork bit */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_AQED_ACTIVE_CNT(queue->id.phys_id));
+
+ r2.field.cq = port->id.phys_id;
+ r2.field.qidix = slot;
+ r2.field.value = 1;
+ r2.field.rlist_haswork_v = r0.field.count > 0;
+
+ /* Set the non-atomic scheduling haswork bit */
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r2.val);
+
+ r1.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_ENQUEUE_CNT(queue->id.phys_id));
+
+ memset(&r2, 0, sizeof(r2));
+
+ r2.field.cq = port->id.phys_id;
+ r2.field.qidix = slot;
+ r2.field.value = 1;
+ r2.field.nalb_haswork_v = (r1.field.count > 0);
+
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r2.val);
+
+ dlb_flush_csr(hw);
+
+ return 0;
+}
+
+static void dlb_ldb_port_clear_has_work_bits(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ u8 slot)
+{
+ union dlb_lsp_ldb_sched_ctrl r2 = { {0} };
+
+ r2.field.cq = port->id.phys_id;
+ r2.field.qidix = slot;
+ r2.field.value = 0;
+ r2.field.rlist_haswork_v = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r2.val);
+
+ memset(&r2, 0, sizeof(r2));
+
+ r2.field.cq = port->id.phys_id;
+ r2.field.qidix = slot;
+ r2.field.value = 0;
+ r2.field.nalb_haswork_v = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r2.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_ldb_port_clear_queue_if_status(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ int slot)
+{
+ union dlb_lsp_ldb_sched_ctrl r0 = { {0} };
+
+ r0.field.cq = port->id.phys_id;
+ r0.field.qidix = slot;
+ r0.field.value = 0;
+ r0.field.inflight_ok_v = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r0.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_ldb_port_set_queue_if_status(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ int slot)
+{
+ union dlb_lsp_ldb_sched_ctrl r0 = { {0} };
+
+ r0.field.cq = port->id.phys_id;
+ r0.field.qidix = slot;
+ r0.field.value = 1;
+ r0.field.inflight_ok_v = 1;
+
+ DLB_CSR_WR(hw, DLB_LSP_LDB_SCHED_CTRL, r0.val);
+
+ dlb_flush_csr(hw);
+}
+
+static void dlb_ldb_queue_set_inflight_limit(struct dlb_hw *hw,
+ struct dlb_ldb_queue *queue)
+{
+ union dlb_lsp_qid_ldb_infl_lim r0 = { {0} };
+
+ r0.field.limit = queue->num_qid_inflights;
+
+ DLB_CSR_WR(hw, DLB_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), r0.val);
+}
+
+static void dlb_ldb_queue_clear_inflight_limit(struct dlb_hw *hw,
+ struct dlb_ldb_queue *queue)
+{
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_INFL_LIM(queue->id.phys_id),
+ DLB_LSP_QID_LDB_INFL_LIM_RST);
+}
+
+/* dlb_ldb_queue_{enable, disable}_mapped_cqs() don't operate exactly as their
+ * function names imply, and should only be called by the dynamic CQ mapping
+ * code.
+ */
+static void dlb_ldb_queue_disable_mapped_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_queue *queue)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+ int slot;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ enum dlb_qid_map_state state = DLB_QUEUE_MAPPED;
+
+ if (!dlb_port_find_slot_queue(port, state, queue, &slot))
+ continue;
+
+ if (port->enabled)
+ dlb_ldb_port_cq_disable(hw, port);
+ }
+}
+
+static void dlb_ldb_queue_enable_mapped_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_queue *queue)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+ int slot;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ enum dlb_qid_map_state state = DLB_QUEUE_MAPPED;
+
+ if (!dlb_port_find_slot_queue(port, state, queue, &slot))
+ continue;
+
+ if (port->enabled)
+ dlb_ldb_port_cq_enable(hw, port);
+ }
+}
+
+static int dlb_ldb_port_finish_map_qid_dynamic(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_lsp_qid_ldb_infl_cnt r0;
+ enum dlb_qid_map_state state;
+ int slot, ret;
+ u8 prio;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+ if (r0.field.count) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: non-zero QID inflight count\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ /* Perform the static mapping for this port's pending slot and set the
+ * corresponding has_work bits.
+ */
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ if (!dlb_port_find_slot_queue(port, state, queue, &slot))
+ return -EINVAL;
+
+ if (slot >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ prio = port->qid_map[slot].priority;
+
+ /* Update the CQ2QID, CQ2PRIOV, and QID2CQIDX registers, and
+ * the port's qid_map state.
+ */
+ ret = dlb_ldb_port_map_qid_static(hw, port, queue, prio);
+ if (ret)
+ return ret;
+
+ ret = dlb_ldb_port_set_has_work_bits(hw, port, queue, slot);
+ if (ret)
+ return ret;
+
+ /* Ensure IF_status(cq,qid) is 0 before enabling the port, to prevent
+ * spurious schedules from increasing the queue's inflight count.
+ */
+ dlb_ldb_port_clear_queue_if_status(hw, port, slot);
+
+ /* Reset the queue's inflight status */
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ state = DLB_QUEUE_MAPPED;
+ if (!dlb_port_find_slot_queue(port, state, queue, &slot))
+ continue;
+
+ dlb_ldb_port_set_queue_if_status(hw, port, slot);
+ }
+
+ dlb_ldb_queue_set_inflight_limit(hw, queue);
+
+ /* Re-enable CQs mapped to this queue */
+ dlb_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+ /* If this queue has other mappings pending, clear its inflight limit */
+ if (queue->num_pending_additions > 0)
+ dlb_ldb_queue_clear_inflight_limit(hw, queue);
+
+ return 0;
+}
+
+/**
+ * dlb_ldb_port_map_qid_dynamic() - perform a "dynamic" QID->CQ mapping
+ * @hw: dlb_hw handle for a particular device.
+ * @port: load-balanced port
+ * @queue: load-balanced queue
+ * @priority: queue servicing priority
+ *
+ * Returns 0 if the queue was mapped, 1 if the mapping is scheduled to occur
+ * at a later point, and <0 if an error occurred.
+ */
+static int dlb_ldb_port_map_qid_dynamic(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ u8 priority)
+{
+ union dlb_lsp_qid_ldb_infl_cnt r0 = { {0} };
+ enum dlb_qid_map_state state;
+ struct dlb_domain *domain;
+ int slot, ret;
+
+ domain = dlb_get_domain_from_id(hw, port->domain_id.phys_id, false, 0);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: unable to find domain %d\n",
+ __func__, port->domain_id.phys_id);
+ return -EFAULT;
+ }
+
+ /* Set the QID inflight limit to 0 to prevent further scheduling of the
+ * queue.
+ */
+ DLB_CSR_WR(hw, DLB_LSP_QID_LDB_INFL_LIM(queue->id.phys_id), 0);
+
+ if (!dlb_port_find_slot(port, DLB_QUEUE_UNMAPPED, &slot)) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available unmapped slots\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (slot >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port->qid_map[slot].qid = queue->id.phys_id;
+ port->qid_map[slot].priority = priority;
+
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ ret = dlb_port_slot_state_transition(hw, port, queue, slot, state);
+ if (ret)
+ return ret;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+ if (r0.field.count) {
+ /* The queue is owed completions so it's not safe to map it
+ * yet. Schedule a kernel thread to complete the mapping later,
+ * once software has completed all the queue's inflight events.
+ */
+ if (!os_worker_active(hw))
+ os_schedule_work(hw);
+
+ return 1;
+ }
+
+ /* Disable the affected CQ, and the CQs already mapped to the QID,
+ * before reading the QID's inflight count a second time. There is an
+ * unlikely race in which the QID may schedule one more QE after we
+ * read an inflight count of 0, and disabling the CQs guarantees that
+ * the race will not occur after a re-read of the inflight count
+ * register.
+ */
+ if (port->enabled)
+ dlb_ldb_port_cq_disable(hw, port);
+
+ dlb_ldb_queue_disable_mapped_cqs(hw, domain, queue);
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+
+ if (r0.field.count) {
+ if (port->enabled)
+ dlb_ldb_port_cq_enable(hw, port);
+
+ dlb_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+ /* The queue is owed completions so it's not safe to map it
+ * yet. Schedule a kernel thread to complete the mapping later,
+ * once software has completed all the queue's inflight events.
+ */
+ if (!os_worker_active(hw))
+ os_schedule_work(hw);
+
+ return 1;
+ }
+
+ return dlb_ldb_port_finish_map_qid_dynamic(hw, domain, port, queue);
+}
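The control flow of the dynamic mapping above (zero the inflight limit, check the inflight count, disable the CQs, re-check to close the scheduling race) can be condensed into a small decision sketch. This is a hypothetical condensation for illustration; the two parameters model the two `DLB_LSP_QID_LDB_INFL_CNT` reads, and the function name is invented.

```c
#include <assert.h>

/* Returns 0 when the map can complete immediately, 1 when it must be
 * deferred to the background worker because the queue still has
 * inflight events (mirroring dlb_ldb_port_map_qid_dynamic()'s returns). */
static int map_qid_dynamic_outcome(unsigned int inflight_before_disable,
				   unsigned int inflight_after_disable)
{
	if (inflight_before_disable > 0)
		return 1; /* queue owed completions: schedule worker */
	/* in the real code the CQs are disabled here, closing the race */
	if (inflight_after_disable > 0)
		return 1; /* race hit: re-enable CQs and defer */
	return 0; /* safe to finish the static mapping now */
}
```

Only when both reads see zero inflights does the driver fall through to `dlb_ldb_port_finish_map_qid_dynamic()`.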
+
+static int dlb_ldb_port_map_qid(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue,
+ u8 prio)
+{
+ if (domain->started)
+ return dlb_ldb_port_map_qid_dynamic(hw, port, queue, prio);
+ else
+ return dlb_ldb_port_map_qid_static(hw, port, queue, prio);
+}
+
+static int dlb_ldb_port_unmap_qid(struct dlb_hw *hw,
+ struct dlb_ldb_port *port,
+ struct dlb_ldb_queue *queue)
+{
+ enum dlb_qid_map_state mapped, in_progress, pending_map, unmapped;
+ union dlb_lsp_cq2priov r0;
+ union dlb_atm_pipe_qid_ldb_qid2cqidx r1;
+ union dlb_lsp_qid_ldb_qid2cqidx r2;
+ union dlb_lsp_qid_ldb_qid2cqidx2 r3;
+ u32 queue_id;
+ u32 port_id;
+ int i;
+
+ /* Find the queue's slot */
+ mapped = DLB_QUEUE_MAPPED;
+ in_progress = DLB_QUEUE_UNMAP_IN_PROGRESS;
+ pending_map = DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP;
+
+ if (!dlb_port_find_slot_queue(port, mapped, queue, &i) &&
+ !dlb_port_find_slot_queue(port, in_progress, queue, &i) &&
+ !dlb_port_find_slot_queue(port, pending_map, queue, &i)) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: QID %d isn't mapped\n",
+ __func__, __LINE__, queue->id.phys_id);
+ return -EFAULT;
+ }
+
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port_id = port->id.phys_id;
+ queue_id = queue->id.phys_id;
+
+ /* Read-modify-write the priority and valid bit register */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ2PRIOV(port_id));
+
+ r0.field.v &= ~(1 << i);
+
+ DLB_CSR_WR(hw, DLB_LSP_CQ2PRIOV(port_id), r0.val);
+
+ r1.val = DLB_CSR_RD(hw,
+ DLB_ATM_PIPE_QID_LDB_QID2CQIDX(queue_id,
+ port_id / 4));
+
+ r2.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX(queue_id,
+ port_id / 4));
+
+ r3.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX2(queue_id,
+ port_id / 4));
+
+ switch (port_id % 4) {
+ case 0:
+ r1.field.cq_p0 &= ~(1 << i);
+ r2.field.cq_p0 &= ~(1 << i);
+ r3.field.cq_p0 &= ~(1 << i);
+ break;
+
+ case 1:
+ r1.field.cq_p1 &= ~(1 << i);
+ r2.field.cq_p1 &= ~(1 << i);
+ r3.field.cq_p1 &= ~(1 << i);
+ break;
+
+ case 2:
+ r1.field.cq_p2 &= ~(1 << i);
+ r2.field.cq_p2 &= ~(1 << i);
+ r3.field.cq_p2 &= ~(1 << i);
+ break;
+
+ case 3:
+ r1.field.cq_p3 &= ~(1 << i);
+ r2.field.cq_p3 &= ~(1 << i);
+ r3.field.cq_p3 &= ~(1 << i);
+ break;
+ }
+
+ DLB_CSR_WR(hw,
+ DLB_ATM_PIPE_QID_LDB_QID2CQIDX(queue_id, port_id / 4),
+ r1.val);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX(queue_id, port_id / 4),
+ r2.val);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_QID2CQIDX2(queue_id, port_id / 4),
+ r3.val);
+
+ dlb_flush_csr(hw);
+
+ unmapped = DLB_QUEUE_UNMAPPED;
+
+ return dlb_port_slot_state_transition(hw, port, queue, i, unmapped);
+}
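The QID2CQIDX addressing shared by the map and unmap paths follows one rule: `port_id / 4` selects the register instance, `port_id % 4` selects the `cq_p0`..`cq_p3` field within it, and bit `slot` inside that field marks the CQ slot. The sketch below models this as a flat bit index; the 8-bit lane width is illustrative, not the documented register geometry.

```c
#include <assert.h>

/* Hypothetical flattening of the QID2CQIDX bitmap indexing used by
 * dlb_ldb_port_map_qid_static()/dlb_ldb_port_unmap_qid(). */
static unsigned int qid2cqidx_bit(unsigned int port_id, unsigned int slot)
{
	unsigned int reg   = port_id / 4; /* which QID2CQIDX register */
	unsigned int field = port_id % 4; /* cq_p0..cq_p3 within it */

	return reg * 32 + field * 8 + slot; /* 8 slots per field, illustrative */
}
```

Mapping sets this bit in all three parallel register files (ATM_PIPE and the two LSP copies); unmapping clears the same bit in all three, which is why the `switch (port_id % 4)` blocks in both functions touch `r1`, `r2`, and `r3` identically.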
+
+static void
+dlb_log_create_sched_domain_args(struct dlb_hw *hw,
+ struct dlb_create_sched_domain_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create sched domain arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tNumber of LDB queues: %d\n",
+ args->num_ldb_queues);
+ DLB_HW_INFO(hw, "\tNumber of LDB ports: %d\n",
+ args->num_ldb_ports);
+ DLB_HW_INFO(hw, "\tNumber of DIR ports: %d\n",
+ args->num_dir_ports);
+ DLB_HW_INFO(hw, "\tNumber of ATM inflights: %d\n",
+ args->num_atomic_inflights);
+ DLB_HW_INFO(hw, "\tNumber of hist list entries: %d\n",
+ args->num_hist_list_entries);
+ DLB_HW_INFO(hw, "\tNumber of LDB credits: %d\n",
+ args->num_ldb_credits);
+ DLB_HW_INFO(hw, "\tNumber of DIR credits: %d\n",
+ args->num_dir_credits);
+ DLB_HW_INFO(hw, "\tNumber of LDB credit pools: %d\n",
+ args->num_ldb_credit_pools);
+ DLB_HW_INFO(hw, "\tNumber of DIR credit pools: %d\n",
+ args->num_dir_credit_pools);
+}
+
+/**
+ * dlb_hw_create_sched_domain() - Allocate and initialize a DLB scheduling
+ * domain and its resources.
+ * @hw: Contains the current state of the DLB hardware.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_sched_domain(struct dlb_hw *hw,
+ struct dlb_create_sched_domain_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_function_resources *rsrcs;
+ int ret;
+
+ rsrcs = (vf_request) ? &hw->vf[vf_id] : &hw->pf;
+
+ dlb_log_create_sched_domain_args(hw, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_create_sched_domain_args(hw, rsrcs, args, resp))
+ return -EINVAL;
+
+ domain = DLB_FUNC_LIST_HEAD(rsrcs->avail_domains, typeof(*domain));
+
+ /* Verification should catch this. */
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available domains\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (domain->configured) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: avail_domains contains configured domains.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ dlb_init_domain_rsrc_lists(domain);
+
+ /* Verification should catch this too. */
+ ret = dlb_domain_attach_resources(hw, rsrcs, domain, args, resp);
+ if (ret < 0) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to verify args.\n",
+ __func__);
+
+ return -EFAULT;
+ }
+
+ dlb_list_del(&rsrcs->avail_domains, &domain->func_list);
+
+ dlb_list_add(&rsrcs->used_domains, &domain->func_list);
+
+ resp->id = (vf_request) ? domain->id.virt_id : domain->id.phys_id;
+ resp->status = 0;
+
+ return 0;
+}
+
+static void
+dlb_log_create_ldb_pool_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_pool_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create load-balanced credit pool arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+ DLB_HW_INFO(hw, "\tNumber of LDB credits: %d\n",
+ args->num_ldb_credits);
+}
+
+/**
+ * dlb_hw_create_ldb_pool() - Allocate and initialize a DLB credit pool.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_ldb_pool(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_credit_pool *pool;
+ struct dlb_domain *domain;
+
+ dlb_log_create_ldb_pool_args(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_create_ldb_pool_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ pool = DLB_DOM_LIST_HEAD(domain->avail_ldb_credit_pools, typeof(*pool));
+
+ /* Verification should catch this. */
+ if (!pool) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available ldb credit pools\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ dlb_configure_ldb_credit_pool(hw, domain, args, pool);
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list.
+ */
+ dlb_list_del(&domain->avail_ldb_credit_pools, &pool->domain_list);
+
+ dlb_list_add(&domain->used_ldb_credit_pools, &pool->domain_list);
+
+ resp->status = 0;
+ resp->id = (vf_request) ? pool->id.virt_id : pool->id.phys_id;
+
+ return 0;
+}
+
+static void
+dlb_log_create_dir_pool_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_pool_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create directed credit pool arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+ DLB_HW_INFO(hw, "\tNumber of DIR credits: %d\n",
+ args->num_dir_credits);
+}
+
+/**
+ * dlb_hw_create_dir_pool() - Allocate and initialize a DLB credit pool.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_dir_pool(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_credit_pool *pool;
+ struct dlb_domain *domain;
+
+ dlb_log_create_dir_pool_args(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ /* At least one available pool */
+ if (dlb_verify_create_dir_pool_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ pool = DLB_DOM_LIST_HEAD(domain->avail_dir_credit_pools, typeof(*pool));
+
+ /* Verification should catch this. */
+ if (!pool) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available dir credit pools\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ dlb_configure_dir_credit_pool(hw, domain, args, pool);
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list.
+ */
+ dlb_list_del(&domain->avail_dir_credit_pools, &pool->domain_list);
+
+ dlb_list_add(&domain->used_dir_credit_pools, &pool->domain_list);
+
+ resp->status = 0;
+ resp->id = (vf_request) ? pool->id.virt_id : pool->id.phys_id;
+
+ return 0;
+}
+
+static void
+dlb_log_create_ldb_queue_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_queue_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create load-balanced queue arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tNumber of sequence numbers: %d\n",
+ args->num_sequence_numbers);
+ DLB_HW_INFO(hw, "\tNumber of QID inflights: %d\n",
+ args->num_qid_inflights);
+ DLB_HW_INFO(hw, "\tNumber of ATM inflights: %d\n",
+ args->num_atomic_inflights);
+}
+
+/**
+ * dlb_hw_create_ldb_queue() - Allocate and initialize a DLB LDB queue.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_ldb_queue(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_ldb_queue *queue;
+ struct dlb_domain *domain;
+ int ret;
+
+ dlb_log_create_ldb_queue_args(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ /* At least one available queue */
+ if (dlb_verify_create_ldb_queue_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ queue = DLB_DOM_LIST_HEAD(domain->avail_ldb_queues, typeof(*queue));
+
+ /* Verification should catch this. */
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available ldb queues\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ ret = dlb_ldb_queue_attach_resources(hw, domain, queue, args);
+ if (ret < 0) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: failed to attach the ldb queue resources\n",
+ __func__, __LINE__);
+ return ret;
+ }
+
+ dlb_configure_ldb_queue(hw, domain, queue, args, vf_request, vf_id);
+
+ queue->num_mappings = 0;
+
+ queue->configured = true;
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list.
+ */
+ dlb_list_del(&domain->avail_ldb_queues, &queue->domain_list);
+
+ dlb_list_add(&domain->used_ldb_queues, &queue->domain_list);
+
+ resp->status = 0;
+ resp->id = (vf_request) ? queue->id.virt_id : queue->id.phys_id;
+
+ return 0;
+}
+
+static void
+dlb_log_create_dir_queue_args(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_queue_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create directed queue arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n", args->port_id);
+}
+
+/**
+ * dlb_hw_create_dir_queue() - Allocate and initialize a DLB DIR queue.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_dir_queue(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *queue;
+ struct dlb_domain *domain;
+
+ dlb_log_create_dir_queue_args(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_create_dir_queue_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (args->port_id != -1)
+ queue = dlb_get_domain_used_dir_pq(args->port_id,
+ vf_request,
+ domain);
+ else
+ queue = DLB_DOM_LIST_HEAD(domain->avail_dir_pq_pairs,
+ typeof(*queue));
+
+ /* Verification should catch this. */
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available dir queues\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ dlb_configure_dir_queue(hw, domain, queue, vf_request, vf_id);
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list (if it's not already there).
+ */
+ if (args->port_id == -1) {
+ dlb_list_del(&domain->avail_dir_pq_pairs, &queue->domain_list);
+
+ dlb_list_add(&domain->used_dir_pq_pairs, &queue->domain_list);
+ }
+
+ resp->status = 0;
+ resp->id = (vf_request) ? queue->id.virt_id : queue->id.phys_id;
+
+ return 0;
+}
+
+static void dlb_log_create_ldb_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_ldb_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create load-balanced port arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tLDB credit pool ID: %d\n",
+ args->ldb_credit_pool_id);
+ DLB_HW_INFO(hw, "\tLDB credit high watermark: %d\n",
+ args->ldb_credit_high_watermark);
+ DLB_HW_INFO(hw, "\tLDB credit low watermark: %d\n",
+ args->ldb_credit_low_watermark);
+ DLB_HW_INFO(hw, "\tLDB credit quantum: %d\n",
+ args->ldb_credit_quantum);
+ DLB_HW_INFO(hw, "\tDIR credit pool ID: %d\n",
+ args->dir_credit_pool_id);
+ DLB_HW_INFO(hw, "\tDIR credit high watermark: %d\n",
+ args->dir_credit_high_watermark);
+ DLB_HW_INFO(hw, "\tDIR credit low watermark: %d\n",
+ args->dir_credit_low_watermark);
+ DLB_HW_INFO(hw, "\tDIR credit quantum: %d\n",
+ args->dir_credit_quantum);
+ DLB_HW_INFO(hw, "\tpop_count_address: 0x%"PRIx64"\n",
+ pop_count_dma_base);
+ DLB_HW_INFO(hw, "\tCQ depth: %d\n",
+ args->cq_depth);
+ DLB_HW_INFO(hw, "\tCQ hist list size: %d\n",
+ args->cq_history_list_size);
+ DLB_HW_INFO(hw, "\tCQ base address: 0x%"PRIx64"\n",
+ cq_dma_base);
+}
+
+/**
+ * dlb_hw_create_ldb_port() - Allocate and initialize a load-balanced port and
+ * its resources.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @pop_count_dma_base: Base address of the pop count memory.
+ * @cq_dma_base: Base address of the CQ memory.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_port_args *args,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+ int ret;
+
+ dlb_log_create_ldb_port_args(hw,
+ domain_id,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_create_ldb_port_args(hw,
+ domain_id,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port = DLB_DOM_LIST_HEAD(domain->avail_ldb_ports, typeof(*port));
+
+ /* Verification should catch this. */
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available ldb ports\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (port->configured) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: avail_ldb_ports contains configured ports.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ ret = dlb_configure_ldb_port(hw,
+ domain,
+ port,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list.
+ */
+ dlb_list_del(&domain->avail_ldb_ports, &port->domain_list);
+
+ dlb_list_add(&domain->used_ldb_ports, &port->domain_list);
+
+ resp->status = 0;
+ resp->id = (vf_request) ? port->id.virt_id : port->id.phys_id;
+
+ return 0;
+}
+
+static void dlb_log_create_dir_port_args(struct dlb_hw *hw,
+ u32 domain_id,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_create_dir_port_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB create directed port arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tLDB credit pool ID: %d\n",
+ args->ldb_credit_pool_id);
+ DLB_HW_INFO(hw, "\tLDB credit high watermark: %d\n",
+ args->ldb_credit_high_watermark);
+ DLB_HW_INFO(hw, "\tLDB credit low watermark: %d\n",
+ args->ldb_credit_low_watermark);
+ DLB_HW_INFO(hw, "\tLDB credit quantum: %d\n",
+ args->ldb_credit_quantum);
+ DLB_HW_INFO(hw, "\tDIR credit pool ID: %d\n",
+ args->dir_credit_pool_id);
+ DLB_HW_INFO(hw, "\tDIR credit high watermark: %d\n",
+ args->dir_credit_high_watermark);
+ DLB_HW_INFO(hw, "\tDIR credit low watermark: %d\n",
+ args->dir_credit_low_watermark);
+ DLB_HW_INFO(hw, "\tDIR credit quantum: %d\n",
+ args->dir_credit_quantum);
+ DLB_HW_INFO(hw, "\tpop_count_address: 0x%"PRIx64"\n",
+ pop_count_dma_base);
+ DLB_HW_INFO(hw, "\tCQ depth: %d\n",
+ args->cq_depth);
+ DLB_HW_INFO(hw, "\tCQ base address: 0x%"PRIx64"\n",
+ cq_dma_base);
+}
+
+/**
+ * dlb_hw_create_dir_port() - Allocate and initialize a DLB directed port and
+ * queue. The port/queue pair have the same ID and name.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @args: User-provided arguments.
+ * @pop_count_dma_base: Base address of the pop count memory.
+ * @cq_dma_base: Base address of the CQ memory.
+ * @resp: Response to user.
+ * @vf_request: Request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_create_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_port_args *args,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *port;
+ struct dlb_domain *domain;
+ int ret;
+
+ dlb_log_create_dir_port_args(hw,
+ domain_id,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_create_dir_port_args(hw,
+ domain_id,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (args->queue_id != -1)
+ port = dlb_get_domain_used_dir_pq(args->queue_id,
+ vf_request,
+ domain);
+ else
+ port = DLB_DOM_LIST_HEAD(domain->avail_dir_pq_pairs,
+ typeof(*port));
+
+ /* Verification should catch this. */
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: no available dir ports\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ ret = dlb_configure_dir_port(hw,
+ domain,
+ port,
+ pop_count_dma_base,
+ cq_dma_base,
+ args,
+ vf_request,
+ vf_id);
+ if (ret < 0)
+ return ret;
+
+ /* Configuration succeeded, so move the resource from the 'avail' to
+ * the 'used' list (if it's not already there).
+ */
+ if (args->queue_id == -1) {
+ dlb_list_del(&domain->avail_dir_pq_pairs, &port->domain_list);
+
+ dlb_list_add(&domain->used_dir_pq_pairs, &port->domain_list);
+ }
+
+ resp->status = 0;
+ resp->id = (vf_request) ? port->id.virt_id : port->id.phys_id;
+
+ return 0;
+}
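The avail-to-used move at the end of port creation uses the driver's intrusive `dlb_list_del()`/`dlb_list_add()` primitives. A minimal standalone sketch of that pattern, assuming a hypothetical `list_ent` type and helper names (not the driver's actual dlb_list API):

```c
#include <assert.h>

/* Intrusive doubly-linked list node, embedded in the resource struct
 * (here standing in for dlb_list_entry inside dlb_dir_pq_pair). */
struct list_ent {
	struct list_ent *prev, *next;
};

static void list_init(struct list_ent *head)
{
	head->prev = head->next = head;
}

/* Unlink a node from whatever list it is currently on. */
static void list_del(struct list_ent *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Append a node at the tail of a list. */
static void list_add_tail(struct list_ent *head, struct list_ent *e)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

static int list_len(struct list_ent *head)
{
	int n = 0;
	struct list_ent *e;

	for (e = head->next; e != head; e = e->next)
		n++;
	return n;
}
```

Moving a configured resource is then `list_del()` followed by `list_add_tail()` on the destination list, exactly the shape of the `avail_dir_pq_pairs` to `used_dir_pq_pairs` move above.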
+
+static void dlb_log_start_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB start domain arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+}
+
+/**
+ * dlb_hw_start_domain() - Lock the domain configuration
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @arg: User-provided arguments (unused).
+ * @resp: Response to user.
+ * @vf_request: Indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return: returns < 0 on error, 0 otherwise. If the driver is unable to
+ * satisfy a request, resp->status will be set accordingly.
+ */
+int dlb_hw_start_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ __attribute__((unused)) struct dlb_start_domain_args *arg,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *dir_queue;
+ struct dlb_ldb_queue *ldb_queue;
+ struct dlb_credit_pool *pool;
+ struct dlb_domain *domain;
+
+ dlb_log_start_domain(hw, domain_id, vf_request, vf_id);
+
+ if (dlb_verify_start_domain_args(hw,
+ domain_id,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Write the domain's pool credit counts, which have been updated
+ * during port configuration. The sum of the pool credit count plus
+ * each producer port's credit count must equal the pool's credit
+ * allocation *before* traffic is sent.
+ */
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter)
+ dlb_ldb_pool_write_credit_count_reg(hw, pool->id.phys_id);
+
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter)
+ dlb_dir_pool_write_credit_count_reg(hw, pool->id.phys_id);
+
+ /* Enable load-balanced and directed queue write permissions for the
+ * queues this domain owns. Without this, the DLB will drop all
+ * incoming traffic to those queues.
+ */
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, ldb_queue, iter) {
+ union dlb_sys_ldb_vasqid_v r0 = { {0} };
+ unsigned int offs;
+
+ r0.field.vasqid_v = 1;
+
+ offs = domain->id.phys_id * DLB_MAX_NUM_LDB_QUEUES +
+ ldb_queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_VASQID_V(offs), r0.val);
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_queue, iter) {
+ union dlb_sys_dir_vasqid_v r0 = { {0} };
+ unsigned int offs;
+
+ r0.field.vasqid_v = 1;
+
+ offs = domain->id.phys_id * DLB_MAX_NUM_DIR_PORTS +
+ dir_queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_VASQID_V(offs), r0.val);
+ }
+
+ dlb_flush_csr(hw);
+
+ domain->started = true;
+
+ resp->status = 0;
+
+ return 0;
+}
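The VASQID write-permission registers are indexed by flattening the (domain, queue) pair into a single offset, as in the two loops above. A small sketch of that indexing, with a hypothetical `MAX_LDB_QUEUES` standing in for `DLB_MAX_NUM_LDB_QUEUES`:

```c
#include <assert.h>

/* Illustrative stride; the real value is DLB_MAX_NUM_LDB_QUEUES from
 * the driver's headers. */
#define MAX_LDB_QUEUES 128

/* The permission table is a flat array of per-(domain, queue) entries,
 * so the register offset is row-major 2D indexing. */
static unsigned int vasqid_offset(unsigned int domain_phys_id,
				  unsigned int queue_phys_id)
{
	return domain_phys_id * MAX_LDB_QUEUES + queue_phys_id;
}
```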
+
+static void dlb_domain_finish_unmap_port_slot(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port,
+ int slot)
+{
+ enum dlb_qid_map_state state;
+ struct dlb_ldb_queue *queue;
+
+ queue = &hw->rsrcs.ldb_queues[port->qid_map[slot].qid];
+
+ state = port->qid_map[slot].state;
+
+ /* Update the QID2CQIDX and CQ2QID vectors */
+ dlb_ldb_port_unmap_qid(hw, port, queue);
+
+ /* Ensure the QID will not be serviced by this {CQ, slot} by clearing
+ * the has_work bits
+ */
+ dlb_ldb_port_clear_has_work_bits(hw, port, slot);
+
+ /* Reset the {CQ, slot} to its default state */
+ dlb_ldb_port_set_queue_if_status(hw, port, slot);
+
+ /* Re-enable the CQ if it wasn't manually disabled by the user */
+ if (port->enabled)
+ dlb_ldb_port_cq_enable(hw, port);
+
+ /* If there is a mapping that is pending this slot's removal, perform
+ * the mapping now.
+ */
+ if (state == DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP) {
+ struct dlb_ldb_port_qid_map *map;
+ struct dlb_ldb_queue *map_queue;
+ u8 prio;
+
+ map = &port->qid_map[slot];
+
+ map->qid = map->pending_qid;
+ map->priority = map->pending_priority;
+
+ map_queue = &hw->rsrcs.ldb_queues[map->qid];
+ prio = map->priority;
+
+ dlb_ldb_port_map_qid(hw, domain, port, map_queue, prio);
+ }
+}
+
+static bool dlb_domain_finish_unmap_port(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port)
+{
+ union dlb_lsp_cq_ldb_infl_cnt r0;
+ int i;
+
+ if (port->num_pending_removals == 0)
+ return false;
+
+ /* The unmap requires all the CQ's outstanding inflights to be
+ * completed.
+ */
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ_LDB_INFL_CNT(port->id.phys_id));
+ if (r0.field.count > 0)
+ return false;
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+ struct dlb_ldb_port_qid_map *map;
+
+ map = &port->qid_map[i];
+
+ if (map->state != DLB_QUEUE_UNMAP_IN_PROGRESS &&
+ map->state != DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP)
+ continue;
+
+ dlb_domain_finish_unmap_port_slot(hw, domain, port, i);
+ }
+
+ return true;
+}
+
+static unsigned int
+dlb_domain_finish_unmap_qid_procedures(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ if (!domain->configured || domain->num_pending_removals == 0)
+ return 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ dlb_domain_finish_unmap_port(hw, domain, port);
+
+ return domain->num_pending_removals;
+}
+
+unsigned int dlb_finish_unmap_qid_procedures(struct dlb_hw *hw)
+{
+ int i, num = 0;
+
+ /* Finish queue unmap jobs for any domain that needs it */
+ for (i = 0; i < DLB_MAX_NUM_DOMAINS; i++) {
+ struct dlb_domain *domain = &hw->domains[i];
+
+ num += dlb_domain_finish_unmap_qid_procedures(hw, domain);
+ }
+
+ return num;
+}
+
+static void dlb_domain_finish_map_port(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_port *port)
+{
+ int i;
+
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++) {
+ union dlb_lsp_qid_ldb_infl_cnt r0;
+ struct dlb_ldb_queue *queue;
+ int qid;
+
+ if (port->qid_map[i].state != DLB_QUEUE_MAP_IN_PROGRESS)
+ continue;
+
+ qid = port->qid_map[i].qid;
+
+ queue = dlb_get_ldb_queue_from_id(hw, qid, false, 0);
+
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: unable to find queue %d\n",
+ __func__, qid);
+ continue;
+ }
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_INFL_CNT(qid));
+
+ if (r0.field.count)
+ continue;
+
+ /* Disable the affected CQ, and the CQs already mapped to the
+ * QID, before reading the QID's inflight count a second time.
+ * There is an unlikely race in which the QID may schedule one
+ * more QE after we read an inflight count of 0, and disabling
+ * the CQs guarantees that the race will not occur after a
+ * re-read of the inflight count register.
+ */
+ if (port->enabled)
+ dlb_ldb_port_cq_disable(hw, port);
+
+ dlb_ldb_queue_disable_mapped_cqs(hw, domain, queue);
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_LDB_INFL_CNT(qid));
+
+ if (r0.field.count) {
+ if (port->enabled)
+ dlb_ldb_port_cq_enable(hw, port);
+
+ dlb_ldb_queue_enable_mapped_cqs(hw, domain, queue);
+
+ continue;
+ }
+
+ dlb_ldb_port_finish_map_qid_dynamic(hw, domain, port, queue);
+ }
+}
+
+static unsigned int
+dlb_domain_finish_map_qid_procedures(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ if (!domain->configured || domain->num_pending_additions == 0)
+ return 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ dlb_domain_finish_map_port(hw, domain, port);
+
+ return domain->num_pending_additions;
+}
+
+unsigned int dlb_finish_map_qid_procedures(struct dlb_hw *hw)
+{
+ int i, num = 0;
+
+ /* Finish queue map jobs for any domain that needs it */
+ for (i = 0; i < DLB_MAX_NUM_DOMAINS; i++) {
+ struct dlb_domain *domain = &hw->domains[i];
+
+ num += dlb_domain_finish_map_qid_procedures(hw, domain);
+ }
+
+ return num;
+}
+
+static void dlb_log_map_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_map_qid_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB map QID arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n",
+ args->port_id);
+ DLB_HW_INFO(hw, "\tQueue ID: %d\n",
+ args->qid);
+ DLB_HW_INFO(hw, "\tPriority: %d\n",
+ args->priority);
+}
+
+int dlb_hw_map_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_map_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ enum dlb_qid_map_state state;
+ struct dlb_ldb_queue *queue;
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+ int ret, i, id;
+ u8 prio;
+
+ dlb_log_map_qid(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_map_qid_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ prio = args->priority;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ queue = dlb_get_domain_ldb_queue(args->qid, vf_request, domain);
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: queue not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* If there are any outstanding detach operations for this port,
+ * attempt to complete them. This may be necessary to free up a QID
+ * slot for this requested mapping.
+ */
+ if (port->num_pending_removals)
+ dlb_domain_finish_unmap_port(hw, domain, port);
+
+ ret = dlb_verify_map_qid_slot_available(port, queue, resp);
+ if (ret)
+ return ret;
+
+ /* Hardware requires disabling the CQ before mapping QIDs. */
+ if (port->enabled)
+ dlb_ldb_port_cq_disable(hw, port);
+
+ /* If this is only a priority change, don't perform the full QID->CQ
+ * mapping procedure
+ */
+ state = DLB_QUEUE_MAPPED;
+ if (dlb_port_find_slot_queue(port, state, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (prio != port->qid_map[i].priority) {
+ dlb_ldb_port_change_qid_priority(hw, port, i, args);
+ DLB_HW_INFO(hw, "DLB map: priority change only\n");
+ }
+
+ state = DLB_QUEUE_MAPPED;
+ ret = dlb_port_slot_state_transition(hw, port, queue, i, state);
+ if (ret)
+ return ret;
+
+ goto map_qid_done;
+ }
+
+ state = DLB_QUEUE_UNMAP_IN_PROGRESS;
+ if (dlb_port_find_slot_queue(port, state, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ if (prio != port->qid_map[i].priority) {
+ dlb_ldb_port_change_qid_priority(hw, port, i, args);
+ DLB_HW_INFO(hw, "DLB map: priority change only\n");
+ }
+
+ state = DLB_QUEUE_MAPPED;
+ ret = dlb_port_slot_state_transition(hw, port, queue, i, state);
+ if (ret)
+ return ret;
+
+ goto map_qid_done;
+ }
+
+ /* If this is a priority change on an in-progress mapping, don't
+ * perform the full QID->CQ mapping procedure.
+ */
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ if (dlb_port_find_slot_queue(port, state, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port->qid_map[i].priority = prio;
+
+ DLB_HW_INFO(hw, "DLB map: priority change only\n");
+
+ goto map_qid_done;
+ }
+
+ /* If this is a priority change on a pending mapping, update the
+ * pending priority
+ */
+ if (dlb_port_find_slot_with_pending_map_queue(port, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port->qid_map[i].pending_priority = prio;
+
+ DLB_HW_INFO(hw, "DLB map: priority change only\n");
+
+ goto map_qid_done;
+ }
+
+ /* If all the CQ's slots are in use, then there's an unmap in progress
+ * (guaranteed by dlb_verify_map_qid_slot_available()), so add this
+ * mapping to pending_map and return. When the removal is completed for
+ * the slot's current occupant, this mapping will be performed.
+ */
+ if (!dlb_port_find_slot(port, DLB_QUEUE_UNMAPPED, &i)) {
+ if (dlb_port_find_slot(port, DLB_QUEUE_UNMAP_IN_PROGRESS, &i)) {
+ enum dlb_qid_map_state state;
+
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ port->qid_map[i].pending_qid = queue->id.phys_id;
+ port->qid_map[i].pending_priority = prio;
+
+ state = DLB_QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP;
+
+ ret = dlb_port_slot_state_transition(hw, port, queue,
+ i, state);
+ if (ret)
+ return ret;
+
+ DLB_HW_INFO(hw, "DLB map: map pending removal\n");
+
+ goto map_qid_done;
+ }
+ }
+
+ /* If the domain has started, a special "dynamic" CQ->queue mapping
+ * procedure is required in order to safely update the CQ<->QID tables.
+ * The "static" procedure cannot be used when traffic is flowing,
+ * because the CQ<->QID tables cannot be updated atomically and the
+ * scheduler won't see the new mapping unless the queue's if_status
+ * changes, which isn't guaranteed.
+ */
+ ret = dlb_ldb_port_map_qid(hw, domain, port, queue, prio);
+
+ /* If ret is less than zero, it's due to an internal error */
+ if (ret < 0)
+ return ret;
+
+map_qid_done:
+ if (port->enabled)
+ dlb_ldb_port_cq_enable(hw, port);
+
+ resp->status = 0;
+
+ return 0;
+}
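dlb_hw_map_qid() above dispatches on the slot's current map state: an already-mapped or in-progress slot for the same queue only needs a priority update, a slot being unmapped defers the new mapping until the unmap completes, and only otherwise is the full QID->CQ procedure run. A condensed, hypothetical sketch of that decision logic (the enum mirrors `dlb_qid_map_state`; the action names are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdbool.h>

enum qid_map_state {
	QUEUE_UNMAPPED,
	QUEUE_MAPPED,
	QUEUE_MAP_IN_PROGRESS,
	QUEUE_UNMAP_IN_PROGRESS,
	QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP,
};

enum map_action {
	ACTION_PRIORITY_CHANGE_ONLY,
	ACTION_DEFER_UNTIL_UNMAP_DONE,
	ACTION_FULL_MAP_PROCEDURE,
};

/* same_queue: the slot already tracks the queue being mapped. */
static enum map_action map_action_for(enum qid_map_state slot_state,
				      bool same_queue)
{
	if (same_queue) {
		switch (slot_state) {
		case QUEUE_MAPPED:
		case QUEUE_MAP_IN_PROGRESS:
		case QUEUE_UNMAP_IN_PROGRESS:
		case QUEUE_UNMAP_IN_PROGRESS_PENDING_MAP:
			/* Only the priority (or pending priority) changes. */
			return ACTION_PRIORITY_CHANGE_ONLY;
		default:
			break;
		}
	} else if (slot_state == QUEUE_UNMAP_IN_PROGRESS) {
		/* Record as pending_qid; performed once the unmap finishes. */
		return ACTION_DEFER_UNTIL_UNMAP_DONE;
	}

	return ACTION_FULL_MAP_PROCEDURE;
}
```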
+
+static void dlb_log_unmap_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_unmap_qid_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB unmap QID arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n",
+ args->port_id);
+ DLB_HW_INFO(hw, "\tQueue ID: %d\n",
+ args->qid);
+ if (args->qid < DLB_MAX_NUM_LDB_QUEUES)
+ DLB_HW_INFO(hw, "\tQueue's num mappings: %d\n",
+ hw->rsrcs.ldb_queues[args->qid].num_mappings);
+}
+
+int dlb_hw_unmap_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_unmap_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ enum dlb_qid_map_state state;
+ struct dlb_ldb_queue *queue;
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+ bool unmap_complete;
+ int i, ret, id;
+
+ dlb_log_unmap_qid(hw, domain_id, args, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_unmap_qid_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ queue = dlb_get_domain_ldb_queue(args->qid, vf_request, domain);
+ if (!queue) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: queue not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* If the queue hasn't been mapped yet, we need to update the slot's
+ * state and re-enable the queue's inflights.
+ */
+ state = DLB_QUEUE_MAP_IN_PROGRESS;
+ if (dlb_port_find_slot_queue(port, state, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Since the in-progress map was aborted, re-enable the QID's
+ * inflights.
+ */
+ if (queue->num_pending_additions == 0)
+ dlb_ldb_queue_set_inflight_limit(hw, queue);
+
+ state = DLB_QUEUE_UNMAPPED;
+ ret = dlb_port_slot_state_transition(hw, port, queue, i, state);
+ if (ret)
+ return ret;
+
+ goto unmap_qid_done;
+ }
+
+ /* If the queue mapping is on hold pending an unmap, we simply need to
+ * update the slot's state.
+ */
+ if (dlb_port_find_slot_with_pending_map_queue(port, queue, &i)) {
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ state = DLB_QUEUE_UNMAP_IN_PROGRESS;
+ ret = dlb_port_slot_state_transition(hw, port, queue, i, state);
+ if (ret)
+ return ret;
+
+ goto unmap_qid_done;
+ }
+
+ state = DLB_QUEUE_MAPPED;
+ if (!dlb_port_find_slot_queue(port, state, queue, &i)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: no available CQ slots\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ if (i >= DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port slot tracking failed\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* QID->CQ mapping removal is an asynchronous procedure. It requires
+ * stopping the DLB from scheduling this CQ, draining all inflights
+ * from the CQ, then unmapping the queue from the CQ. This function
+ * simply marks the port as needing the queue unmapped, and (if
+ * necessary) starts the unmapping worker thread.
+ */
+ dlb_ldb_port_cq_disable(hw, port);
+
+ state = DLB_QUEUE_UNMAP_IN_PROGRESS;
+ ret = dlb_port_slot_state_transition(hw, port, queue, i, state);
+ if (ret)
+ return ret;
+
+ /* Attempt to finish the unmapping now, in case the port has no
+ * outstanding inflights. If that's not the case, this will fail and
+ * the unmapping will be completed at a later time.
+ */
+ unmap_complete = dlb_domain_finish_unmap_port(hw, domain, port);
+
+ /* If the unmapping couldn't complete immediately, launch the worker
+ * thread (if it isn't already launched) to finish it later.
+ */
+ if (!unmap_complete && !os_worker_active(hw))
+ os_schedule_work(hw);
+
+unmap_qid_done:
+ resp->status = 0;
+
+ return 0;
+}
+
+static void dlb_log_enable_port(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB enable port arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n",
+ port_id);
+}
+
+int dlb_hw_enable_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+ int id;
+
+ dlb_log_enable_port(hw, domain_id, args->port_id, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_enable_ldb_port_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Enable the port's CQ if it isn't already enabled */
+ if (!port->enabled) {
+ dlb_ldb_port_cq_enable(hw, port);
+ port->enabled = true;
+
+ hw->pf.num_enabled_ldb_ports++;
+ dlb_update_ldb_arb_threshold(hw);
+ }
+
+ resp->status = 0;
+
+ return 0;
+}
+
+static void dlb_log_disable_port(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB disable port arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n",
+ domain_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n",
+ port_id);
+}
+
+int dlb_hw_disable_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+ int id;
+
+ dlb_log_disable_port(hw, domain_id, args->port_id, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_disable_ldb_port_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_ldb_port(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Disable the port's CQ if it isn't already disabled */
+ if (port->enabled) {
+ dlb_ldb_port_cq_disable(hw, port);
+ port->enabled = false;
+
+ hw->pf.num_enabled_ldb_ports--;
+ dlb_update_ldb_arb_threshold(hw);
+ }
+
+ resp->status = 0;
+
+ return 0;
+}
+
+int dlb_hw_enable_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *port;
+ struct dlb_domain *domain;
+ int id;
+
+ dlb_log_enable_port(hw, domain_id, args->port_id, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_enable_dir_port_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_dir_pq(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Enable the port's CQ if it isn't already enabled */
+ if (!port->enabled) {
+ dlb_dir_port_cq_enable(hw, port);
+ port->enabled = true;
+ }
+
+ resp->status = 0;
+
+ return 0;
+}
+
+int dlb_hw_disable_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *port;
+ struct dlb_domain *domain;
+ int id;
+
+ dlb_log_disable_port(hw, domain_id, args->port_id, vf_request, vf_id);
+
+ /* Verify that hardware resources are available before attempting to
+ * satisfy the request. This simplifies the error unwinding code.
+ */
+ if (dlb_verify_disable_dir_port_args(hw,
+ domain_id,
+ args,
+ resp,
+ vf_request,
+ vf_id))
+ return -EINVAL;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+ if (!domain) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: domain not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ id = args->port_id;
+
+ port = dlb_get_domain_used_dir_pq(id, vf_request, domain);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s():%d] Internal error: port not found\n",
+ __func__, __LINE__);
+ return -EFAULT;
+ }
+
+ /* Disable the port's CQ if it isn't already disabled */
+ if (port->enabled) {
+ dlb_dir_port_cq_disable(hw, port);
+ port->enabled = false;
+ }
+
+ resp->status = 0;
+
+ return 0;
+}
+
+int dlb_notify_vf(struct dlb_hw *hw,
+ unsigned int vf_id,
+ enum dlb_mbox_vf_notification_type notification)
+{
+ struct dlb_mbox_vf_notification_cmd_req req;
+ int retry_cnt;
+
+ req.hdr.type = DLB_MBOX_VF_CMD_NOTIFICATION;
+ req.notification = notification;
+
+ if (dlb_pf_write_vf_mbox_req(hw, vf_id, &req, sizeof(req)))
+ return -1;
+
+ dlb_send_async_pf_to_vf_msg(hw, vf_id);
+
+ /* Timeout after 1 second of inactivity */
+ retry_cnt = 0;
+ while (!dlb_pf_to_vf_complete(hw, vf_id)) {
+ os_msleep(1);
+ if (++retry_cnt >= 1000) {
+ DLB_HW_ERR(hw,
+ "PF driver timed out waiting for mbox response\n");
+ return -1;
+ }
+ }
+
+ /* No response data expected for notifications. */
+
+ return 0;
+}
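The mailbox functions above all share the same completion-wait shape: poll `dlb_pf_to_vf_complete()` once per millisecond and give up after 1000 attempts (a 1 s timeout). A standalone sketch of that loop, with a simulated completion source replacing the hardware poll and `os_msleep()` (both names and the countdown are hypothetical, for testability):

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated mailbox: "completes" after a countdown of poll attempts,
 * standing in for the VF acking the PF->VF message. */
struct mbox_sim {
	int polls_until_done;
};

static bool mbox_complete(struct mbox_sim *m)
{
	if (m->polls_until_done > 0) {
		m->polls_until_done--;
		return false;
	}
	return true;
}

/* Mirror of the driver's retry loop: one poll per (simulated) msec,
 * timeout after 1000 retries. Returns 0 on success, -1 on timeout. */
static int wait_for_mbox(struct mbox_sim *m)
{
	int retry_cnt = 0;

	while (!mbox_complete(m)) {
		/* os_msleep(1) would go here in the real driver. */
		if (++retry_cnt >= 1000)
			return -1;
	}
	return 0;
}
```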
+
+int dlb_vf_in_use(struct dlb_hw *hw, unsigned int vf_id)
+{
+ struct dlb_mbox_vf_in_use_cmd_resp resp;
+ struct dlb_mbox_vf_in_use_cmd_req req;
+ int retry_cnt;
+
+ req.hdr.type = DLB_MBOX_VF_CMD_IN_USE;
+
+ if (dlb_pf_write_vf_mbox_req(hw, vf_id, &req, sizeof(req)))
+ return -1;
+
+ dlb_send_async_pf_to_vf_msg(hw, vf_id);
+
+ /* Timeout after 1 second of inactivity */
+ retry_cnt = 0;
+ while (!dlb_pf_to_vf_complete(hw, vf_id)) {
+ os_msleep(1);
+ if (++retry_cnt >= 1000) {
+ DLB_HW_ERR(hw,
+ "PF driver timed out waiting for mbox response\n");
+ return -1;
+ }
+ }
+
+ if (dlb_pf_read_vf_mbox_resp(hw, vf_id, &resp, sizeof(resp)))
+ return -1;
+
+ if (resp.hdr.status != DLB_MBOX_ST_SUCCESS) {
+ DLB_HW_ERR(hw,
+ "[%s()]: failed with mailbox error: %s\n",
+ __func__,
+ DLB_MBOX_ST_STRING(&resp));
+
+ return -1;
+ }
+
+ return resp.in_use;
+}
+
+static int dlb_vf_domain_alert(struct dlb_hw *hw,
+ unsigned int vf_id,
+ u32 domain_id,
+ u32 alert_id,
+ u32 aux_alert_data)
+{
+ struct dlb_mbox_vf_alert_cmd_req req;
+ int retry_cnt;
+
+ req.hdr.type = DLB_MBOX_VF_CMD_DOMAIN_ALERT;
+ req.domain_id = domain_id;
+ req.alert_id = alert_id;
+ req.aux_alert_data = aux_alert_data;
+
+ if (dlb_pf_write_vf_mbox_req(hw, vf_id, &req, sizeof(req)))
+ return -1;
+
+ dlb_send_async_pf_to_vf_msg(hw, vf_id);
+
+ /* Timeout after 1 second of inactivity */
+ retry_cnt = 0;
+ while (!dlb_pf_to_vf_complete(hw, vf_id)) {
+ os_msleep(1);
+ if (++retry_cnt >= 1000) {
+ DLB_HW_ERR(hw,
+ "PF driver timed out waiting for mbox response\n");
+ return -1;
+ }
+ }
+
+ /* No response data expected for alarm notifications. */
+
+ return 0;
+}
+
+void dlb_set_msix_mode(struct dlb_hw *hw, int mode)
+{
+ union dlb_sys_msix_mode r0 = { {0} };
+
+ r0.field.mode = mode;
+
+ DLB_CSR_WR(hw, DLB_SYS_MSIX_MODE, r0.val);
+}
+
+int dlb_configure_ldb_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ int vector,
+ int mode,
+ unsigned int vf,
+ unsigned int owner_vf,
+ u16 threshold)
+{
+ union dlb_chp_ldb_cq_int_depth_thrsh r0 = { {0} };
+ union dlb_chp_ldb_cq_int_enb r1 = { {0} };
+ union dlb_sys_ldb_cq_isr r2 = { {0} };
+ struct dlb_ldb_port *port;
+ bool vf_request;
+
+ vf_request = (mode == DLB_CQ_ISR_MODE_MSI);
+
+ port = dlb_get_ldb_port_from_id(hw, port_id, vf_request, vf);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()]: Internal error: failed to enable LDB CQ int\n\tport_id: %u, vf_req: %u, vf: %u\n",
+ __func__, port_id, vf_request, vf);
+ return -EINVAL;
+ }
+
+ /* Trigger the interrupt when threshold or more QEs arrive in the CQ */
+ r0.field.depth_threshold = threshold - 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+ r0.val);
+
+ r1.field.en_depth = 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_LDB_CQ_INT_ENB(port->id.phys_id), r1.val);
+
+ r2.field.vector = vector;
+ r2.field.vf = owner_vf;
+ r2.field.en_code = mode;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_CQ_ISR(port->id.phys_id), r2.val);
+
+ return 0;
+}
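Note the off-by-one in the threshold programming above: the comment says the interrupt should fire when `threshold` or more QEs are in the CQ, and the code writes `threshold - 1` to the depth-threshold field. A tiny sketch of the implied semantics, assuming (not confirmed by this patch) that the hardware compares occupancy strictly greater-than the programmed field, and that `threshold >= 1`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed hardware rule: interrupt fires when occupancy > field.
 * Programming field = threshold - 1 therefore gives "fire when
 * occupancy >= threshold". threshold must be >= 1 to avoid wrap. */
static bool cq_depth_int_fires(uint16_t cq_occupancy, uint16_t threshold)
{
	uint16_t thrsh_field = threshold - 1;

	return cq_occupancy > thrsh_field;
}
```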
+
+int dlb_configure_dir_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ int vector,
+ int mode,
+ unsigned int vf,
+ unsigned int owner_vf,
+ u16 threshold)
+{
+ union dlb_chp_dir_cq_int_depth_thrsh r0 = { {0} };
+ union dlb_chp_dir_cq_int_enb r1 = { {0} };
+ union dlb_sys_dir_cq_isr r2 = { {0} };
+ struct dlb_dir_pq_pair *port;
+ bool vf_request;
+
+ vf_request = (mode == DLB_CQ_ISR_MODE_MSI);
+
+ port = dlb_get_dir_pq_from_id(hw, port_id, vf_request, vf);
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()]: Internal error: failed to enable DIR CQ int\n\tport_id: %u, vf_req: %u, vf: %u\n",
+ __func__, port_id, vf_request, vf);
+ return -EINVAL;
+ }
+
+ /* Trigger the interrupt when threshold or more QEs arrive in the CQ */
+ r0.field.depth_threshold = threshold - 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+ r0.val);
+
+ r1.field.en_depth = 1;
+
+ DLB_CSR_WR(hw, DLB_CHP_DIR_CQ_INT_ENB(port->id.phys_id), r1.val);
+
+ r2.field.vector = vector;
+ r2.field.vf = owner_vf;
+ r2.field.en_code = mode;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_CQ_ISR(port->id.phys_id), r2.val);
+
+ return 0;
+}
+
+int dlb_arm_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ bool is_ldb,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ u32 val;
+ u32 reg;
+
+ if (vf_request && is_ldb) {
+ struct dlb_ldb_port *ldb_port;
+
+ ldb_port = dlb_get_ldb_port_from_id(hw, port_id, true, vf_id);
+
+ if (!ldb_port || !ldb_port->configured)
+ return -EINVAL;
+
+ port_id = ldb_port->id.phys_id;
+ } else if (vf_request && !is_ldb) {
+ struct dlb_dir_pq_pair *dir_port;
+
+ dir_port = dlb_get_dir_pq_from_id(hw, port_id, true, vf_id);
+
+ if (!dir_port || !dir_port->port_configured)
+ return -EINVAL;
+
+ port_id = dir_port->id.phys_id;
+ }
+
+ val = 1 << (port_id % 32);
+
+ if (is_ldb && port_id < 32)
+ reg = DLB_CHP_LDB_CQ_INTR_ARMED0;
+ else if (is_ldb && port_id < 64)
+ reg = DLB_CHP_LDB_CQ_INTR_ARMED1;
+ else if (!is_ldb && port_id < 32)
+ reg = DLB_CHP_DIR_CQ_INTR_ARMED0;
+ else if (!is_ldb && port_id < 64)
+ reg = DLB_CHP_DIR_CQ_INTR_ARMED1;
+ else if (!is_ldb && port_id < 96)
+ reg = DLB_CHP_DIR_CQ_INTR_ARMED2;
+ else
+ reg = DLB_CHP_DIR_CQ_INTR_ARMED3;
+
+ DLB_CSR_WR(hw, reg, val);
+
+ dlb_flush_csr(hw);
+
+ return 0;
+}
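The arm path above picks one of several ARMED registers because each register covers a window of 32 ports: the bit is `port_id % 32` and the register is chosen by the `port_id / 32` bucket. A standalone sketch of that selection, with hypothetical enum values standing in for the `DLB_CHP_*_CQ_INTR_ARMED*` CSR addresses:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative register IDs; the real addresses come from the
 * driver's register headers. */
enum armed_reg {
	LDB_ARMED0, LDB_ARMED1,
	DIR_ARMED0, DIR_ARMED1, DIR_ARMED2, DIR_ARMED3,
};

/* LDB CQs span two 32-port registers, DIR CQs span four. */
static enum armed_reg armed_reg_for_port(bool is_ldb, int port_id)
{
	if (is_ldb)
		return (port_id < 32) ? LDB_ARMED0 : LDB_ARMED1;

	switch (port_id / 32) {
	case 0:
		return DIR_ARMED0;
	case 1:
		return DIR_ARMED1;
	case 2:
		return DIR_ARMED2;
	default:
		return DIR_ARMED3;
	}
}

/* Bit position within the selected 32-bit ARMED register. */
static uint32_t armed_bit_for_port(int port_id)
{
	return (uint32_t)1 << (port_id % 32);
}
```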
+
+void dlb_read_compressed_cq_intr_status(struct dlb_hw *hw,
+ u32 *ldb_interrupts,
+ u32 *dir_interrupts)
+{
+ /* Read every CQ's interrupt status */
+
+ ldb_interrupts[0] = DLB_CSR_RD(hw, DLB_SYS_LDB_CQ_31_0_OCC_INT_STS);
+ ldb_interrupts[1] = DLB_CSR_RD(hw, DLB_SYS_LDB_CQ_63_32_OCC_INT_STS);
+
+ dir_interrupts[0] = DLB_CSR_RD(hw, DLB_SYS_DIR_CQ_31_0_OCC_INT_STS);
+ dir_interrupts[1] = DLB_CSR_RD(hw, DLB_SYS_DIR_CQ_63_32_OCC_INT_STS);
+ dir_interrupts[2] = DLB_CSR_RD(hw, DLB_SYS_DIR_CQ_95_64_OCC_INT_STS);
+ dir_interrupts[3] = DLB_CSR_RD(hw, DLB_SYS_DIR_CQ_127_96_OCC_INT_STS);
+}
+
+static void dlb_ack_msix_interrupt(struct dlb_hw *hw, int vector)
+{
+ union dlb_sys_msix_ack r0 = { {0} };
+
+ switch (vector) {
+ case 0:
+ r0.field.msix_0_ack = 1;
+ break;
+ case 1:
+ r0.field.msix_1_ack = 1;
+ break;
+ case 2:
+ r0.field.msix_2_ack = 1;
+ break;
+ case 3:
+ r0.field.msix_3_ack = 1;
+ break;
+ case 4:
+ r0.field.msix_4_ack = 1;
+ break;
+ case 5:
+ r0.field.msix_5_ack = 1;
+ break;
+ case 6:
+ r0.field.msix_6_ack = 1;
+ break;
+ case 7:
+ r0.field.msix_7_ack = 1;
+ break;
+ case 8:
+ r0.field.msix_8_ack = 1;
+ /*
+ * CSSY-1650
+ * workaround h/w bug for lost MSI-X interrupts
+ *
+ * The recommended workaround for acknowledging
+ * vector 8 interrupts is:
+ * 1: set MSI-X mask
+ * 2: set MSIX_PASSTHROUGH
+ * 3: clear MSIX_ACK
+ * 4: clear MSIX_PASSTHROUGH
+ * 5: clear MSI-X mask
+ *
+ * The MSIX-ACK (step 3) is cleared for all vectors
+ * below. We handle steps 1 & 2 for vector 8 here.
+ *
+ * The bitfields for MSIX_ACK and MSIX_PASSTHRU are
+ * defined the same, so we just use the MSIX_ACK
+ * value when writing to PASSTHRU.
+ */
+
+ /* set MSI-X mask and passthrough for vector 8 */
+ DLB_FUNC_WR(hw, DLB_MSIX_MEM_VECTOR_CTRL(8), 1);
+ DLB_CSR_WR(hw, DLB_SYS_MSIX_PASSTHRU, r0.val);
+ break;
+ }
+
+ /* clear MSIX_ACK (write one to clear) */
+ DLB_CSR_WR(hw, DLB_SYS_MSIX_ACK, r0.val);
+
+ if (vector == 8) {
+ /*
+ * finish up steps 4 & 5 of the workaround -
+ * clear passthrough and mask
+ */
+ DLB_CSR_WR(hw, DLB_SYS_MSIX_PASSTHRU, 0);
+ DLB_FUNC_WR(hw, DLB_MSIX_MEM_VECTOR_CTRL(8), 0);
+ }
+
+ dlb_flush_csr(hw);
+}
+
+void dlb_ack_compressed_cq_intr(struct dlb_hw *hw,
+ u32 *ldb_interrupts,
+ u32 *dir_interrupts)
+{
+ /* Write back the status regs to ack the interrupts */
+ if (ldb_interrupts[0])
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_31_0_OCC_INT_STS,
+ ldb_interrupts[0]);
+ if (ldb_interrupts[1])
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_63_32_OCC_INT_STS,
+ ldb_interrupts[1]);
+
+ if (dir_interrupts[0])
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_31_0_OCC_INT_STS,
+ dir_interrupts[0]);
+ if (dir_interrupts[1])
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_63_32_OCC_INT_STS,
+ dir_interrupts[1]);
+ if (dir_interrupts[2])
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_95_64_OCC_INT_STS,
+ dir_interrupts[2]);
+ if (dir_interrupts[3])
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_127_96_OCC_INT_STS,
+ dir_interrupts[3]);
+
+ dlb_ack_msix_interrupt(hw, DLB_PF_COMPRESSED_MODE_CQ_VECTOR_ID);
+}
+
+u32 dlb_read_vf_intr_status(struct dlb_hw *hw)
+{
+ return DLB_FUNC_RD(hw, DLB_FUNC_VF_VF_MSI_ISR);
+}
+
+void dlb_ack_vf_intr_status(struct dlb_hw *hw, u32 interrupts)
+{
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_VF_MSI_ISR, interrupts);
+}
+
+void dlb_ack_vf_msi_intr(struct dlb_hw *hw, u32 interrupts)
+{
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_VF_MSI_ISR_PEND, interrupts);
+}
+
+void dlb_ack_pf_mbox_int(struct dlb_hw *hw)
+{
+ union dlb_func_vf_pf2vf_mailbox_isr r0;
+
+ r0.field.pf_isr = 1;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_PF2VF_MAILBOX_ISR, r0.val);
+}
+
+u32 dlb_read_vf_to_pf_int_bitvec(struct dlb_hw *hw)
+{
+ /* The PF has one VF->PF MBOX ISR register per VF space, but they all
+ * alias to the same physical register.
+ */
+ return DLB_FUNC_RD(hw, DLB_FUNC_PF_VF2PF_MAILBOX_ISR(0));
+}
+
+void dlb_ack_vf_mbox_int(struct dlb_hw *hw, u32 bitvec)
+{
+ /* The PF has one VF->PF MBOX ISR register per VF space, but they all
+ * alias to the same physical register.
+ */
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF2PF_MAILBOX_ISR(0), bitvec);
+}
+
+u32 dlb_read_vf_flr_int_bitvec(struct dlb_hw *hw)
+{
+ /* The PF has one VF->PF FLR ISR register per VF space, but they all
+ * alias to the same physical register.
+ */
+ return DLB_FUNC_RD(hw, DLB_FUNC_PF_VF2PF_FLR_ISR(0));
+}
+
+void dlb_set_vf_reset_in_progress(struct dlb_hw *hw, int vf)
+{
+ u32 bitvec = DLB_FUNC_RD(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0));
+
+ bitvec |= (1 << vf);
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0), bitvec);
+}
+
+void dlb_clr_vf_reset_in_progress(struct dlb_hw *hw, int vf)
+{
+ u32 bitvec = DLB_FUNC_RD(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0));
+
+ bitvec &= ~(1 << vf);
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0), bitvec);
+}
+
+void dlb_ack_vf_flr_int(struct dlb_hw *hw, u32 bitvec, bool a_stepping)
+{
+ union dlb_sys_func_vf_bar_dsbl r0 = { {0} };
+ u32 clear;
+ int i;
+
+ if (!bitvec)
+ return;
+
+ /* Re-enable access to the VF BAR */
+ r0.field.func_vf_bar_dis = 0;
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ if (!(bitvec & (1 << i)))
+ continue;
+
+ DLB_CSR_WR(hw, DLB_SYS_FUNC_VF_BAR_DSBL(i), r0.val);
+ }
+
+ /* Notify the VF driver that the reset has completed. This register is
+ * RW in A-stepping devices, WOCLR otherwise.
+ */
+ if (a_stepping) {
+ clear = DLB_FUNC_RD(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0));
+ clear &= ~bitvec;
+ } else {
+ clear = bitvec;
+ }
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF_RESET_IN_PROGRESS(0), clear);
+
+ /* Mark the FLR ISR as complete */
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF2PF_FLR_ISR(0), bitvec);
+}
+
+void dlb_ack_vf_to_pf_int(struct dlb_hw *hw,
+ u32 mbox_bitvec,
+ u32 flr_bitvec)
+{
+ int i;
+
+ dlb_ack_msix_interrupt(hw, DLB_INT_VF_TO_PF_MBOX);
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ union dlb_func_pf_vf2pf_isr_pend r0 = { {0} };
+
+ if (!((mbox_bitvec & (1 << i)) || (flr_bitvec & (1 << i))))
+ continue;
+
+ /* Unset the VF's ISR pending bit */
+ r0.field.isr_pend = 1;
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_VF2PF_ISR_PEND(i), r0.val);
+ }
+}
+
+void dlb_enable_alarm_interrupts(struct dlb_hw *hw)
+{
+ union dlb_sys_ingress_alarm_enbl r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_INGRESS_ALARM_ENBL);
+
+ r0.field.illegal_hcw = 1;
+ r0.field.illegal_pp = 1;
+ r0.field.disabled_pp = 1;
+ r0.field.illegal_qid = 1;
+ r0.field.disabled_qid = 1;
+ r0.field.illegal_ldb_qid_cfg = 1;
+ r0.field.illegal_cqid = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_INGRESS_ALARM_ENBL, r0.val);
+}
+
+void dlb_disable_alarm_interrupts(struct dlb_hw *hw)
+{
+ union dlb_sys_ingress_alarm_enbl r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_INGRESS_ALARM_ENBL);
+
+ r0.field.illegal_hcw = 0;
+ r0.field.illegal_pp = 0;
+ r0.field.disabled_pp = 0;
+ r0.field.illegal_qid = 0;
+ r0.field.disabled_qid = 0;
+ r0.field.illegal_ldb_qid_cfg = 0;
+ r0.field.illegal_cqid = 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_INGRESS_ALARM_ENBL, r0.val);
+}
+
+static void dlb_log_alarm_syndrome(struct dlb_hw *hw,
+ const char *str,
+ union dlb_sys_alarm_hw_synd r0)
+{
+ DLB_HW_ERR(hw, "%s:\n", str);
+ DLB_HW_ERR(hw, "\tsyndrome: 0x%x\n", r0.field.syndrome);
+ DLB_HW_ERR(hw, "\trtype: 0x%x\n", r0.field.rtype);
+ DLB_HW_ERR(hw, "\tfrom_dmv: 0x%x\n", r0.field.from_dmv);
+ DLB_HW_ERR(hw, "\tis_ldb: 0x%x\n", r0.field.is_ldb);
+ DLB_HW_ERR(hw, "\tcls: 0x%x\n", r0.field.cls);
+ DLB_HW_ERR(hw, "\taid: 0x%x\n", r0.field.aid);
+ DLB_HW_ERR(hw, "\tunit: 0x%x\n", r0.field.unit);
+ DLB_HW_ERR(hw, "\tsource: 0x%x\n", r0.field.source);
+ DLB_HW_ERR(hw, "\tmore: 0x%x\n", r0.field.more);
+ DLB_HW_ERR(hw, "\tvalid: 0x%x\n", r0.field.valid);
+}
+
+/* Note: this array's contents must match dlb_alert_id() */
+static const char dlb_alert_strings[NUM_DLB_DOMAIN_ALERTS][128] = {
+ [DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS] = "Insufficient credits",
+ [DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ] = "Illegal enqueue",
+ [DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS] = "Excess token pops",
+ [DLB_DOMAIN_ALERT_ILLEGAL_HCW] = "Illegal HCW",
+ [DLB_DOMAIN_ALERT_ILLEGAL_QID] = "Illegal QID",
+ [DLB_DOMAIN_ALERT_DISABLED_QID] = "Disabled QID",
+};
+
+static void dlb_log_pf_vf_syndrome(struct dlb_hw *hw,
+ const char *str,
+ union dlb_sys_alarm_pf_synd0 r0,
+ union dlb_sys_alarm_pf_synd1 r1,
+ union dlb_sys_alarm_pf_synd2 r2,
+ u32 alert_id)
+{
+ DLB_HW_ERR(hw, "%s:\n", str);
+ if (alert_id < NUM_DLB_DOMAIN_ALERTS)
+ DLB_HW_ERR(hw, "Alert: %s\n", dlb_alert_strings[alert_id]);
+ DLB_HW_ERR(hw, "\tsyndrome: 0x%x\n", r0.field.syndrome);
+ DLB_HW_ERR(hw, "\trtype: 0x%x\n", r0.field.rtype);
+ DLB_HW_ERR(hw, "\tfrom_dmv: 0x%x\n", r0.field.from_dmv);
+ DLB_HW_ERR(hw, "\tis_ldb: 0x%x\n", r0.field.is_ldb);
+ DLB_HW_ERR(hw, "\tcls: 0x%x\n", r0.field.cls);
+ DLB_HW_ERR(hw, "\taid: 0x%x\n", r0.field.aid);
+ DLB_HW_ERR(hw, "\tunit: 0x%x\n", r0.field.unit);
+ DLB_HW_ERR(hw, "\tsource: 0x%x\n", r0.field.source);
+ DLB_HW_ERR(hw, "\tmore: 0x%x\n", r0.field.more);
+ DLB_HW_ERR(hw, "\tvalid: 0x%x\n", r0.field.valid);
+ DLB_HW_ERR(hw, "\tdsi: 0x%x\n", r1.field.dsi);
+ DLB_HW_ERR(hw, "\tqid: 0x%x\n", r1.field.qid);
+ DLB_HW_ERR(hw, "\tqtype: 0x%x\n", r1.field.qtype);
+ DLB_HW_ERR(hw, "\tqpri: 0x%x\n", r1.field.qpri);
+ DLB_HW_ERR(hw, "\tmsg_type: 0x%x\n", r1.field.msg_type);
+ DLB_HW_ERR(hw, "\tlock_id: 0x%x\n", r2.field.lock_id);
+ DLB_HW_ERR(hw, "\tmeas: 0x%x\n", r2.field.meas);
+ DLB_HW_ERR(hw, "\tdebug: 0x%x\n", r2.field.debug);
+ DLB_HW_ERR(hw, "\tcq_pop: 0x%x\n", r2.field.cq_pop);
+ DLB_HW_ERR(hw, "\tqe_uhl: 0x%x\n", r2.field.qe_uhl);
+ DLB_HW_ERR(hw, "\tqe_orsp: 0x%x\n", r2.field.qe_orsp);
+ DLB_HW_ERR(hw, "\tqe_valid: 0x%x\n", r2.field.qe_valid);
+ DLB_HW_ERR(hw, "\tcq_int_rearm: 0x%x\n", r2.field.cq_int_rearm);
+ DLB_HW_ERR(hw, "\tdsi_error: 0x%x\n", r2.field.dsi_error);
+}
+
+static void dlb_clear_syndrome_register(struct dlb_hw *hw, u32 offset)
+{
+ union dlb_sys_alarm_hw_synd r0 = { {0} };
+
+ r0.field.valid = 1;
+ r0.field.more = 1;
+
+ DLB_CSR_WR(hw, offset, r0.val);
+}
+
+void dlb_process_alarm_interrupt(struct dlb_hw *hw)
+{
+ union dlb_sys_alarm_hw_synd r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_HW_SYND);
+
+ dlb_log_alarm_syndrome(hw, "HW alarm syndrome", r0);
+
+ dlb_clear_syndrome_register(hw, DLB_SYS_ALARM_HW_SYND);
+
+ dlb_ack_msix_interrupt(hw, DLB_INT_ALARM);
+}
+
+static void dlb_process_ingress_error(struct dlb_hw *hw,
+ union dlb_sys_alarm_pf_synd0 r0,
+ u32 alert_id,
+ bool vf_error,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ bool is_ldb;
+ u8 port_id;
+ int ret;
+
+ port_id = r0.field.syndrome & 0x7F;
+ if (r0.field.source == DLB_ALARM_HW_SOURCE_SYS)
+ is_ldb = r0.field.is_ldb;
+ else
+ is_ldb = (r0.field.syndrome & 0x80) != 0;
+
+ /* Get the domain ID and, if it's a VF domain, the virtual port ID */
+ if (is_ldb) {
+ struct dlb_ldb_port *port;
+
+ port = dlb_get_ldb_port_from_id(hw, port_id, vf_error, vf_id);
+
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()]: Internal error: unable to find LDB port\n\tport: %u, vf_error: %u, vf_id: %u\n",
+ __func__, port_id, vf_error, vf_id);
+ return;
+ }
+
+ domain = &hw->domains[port->domain_id.phys_id];
+ } else {
+ struct dlb_dir_pq_pair *port;
+
+ port = dlb_get_dir_pq_from_id(hw, port_id, vf_error, vf_id);
+
+ if (!port) {
+ DLB_HW_ERR(hw,
+ "[%s()]: Internal error: unable to find DIR port\n\tport: %u, vf_error: %u, vf_id: %u\n",
+ __func__, port_id, vf_error, vf_id);
+ return;
+ }
+
+ domain = &hw->domains[port->domain_id.phys_id];
+ }
+
+ if (vf_error)
+ ret = dlb_vf_domain_alert(hw,
+ vf_id,
+ domain->id.virt_id,
+ alert_id,
+ (is_ldb << 8) | port_id);
+ else
+ ret = os_notify_user_space(hw,
+ domain->id.phys_id,
+ alert_id,
+ (is_ldb << 8) | port_id);
+
+ if (ret)
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to notify\n",
+ __func__);
+}
+
+static u32 dlb_alert_id(union dlb_sys_alarm_pf_synd0 r0)
+{
+ if (r0.field.unit == DLB_ALARM_HW_UNIT_CHP &&
+ r0.field.aid == DLB_ALARM_HW_CHP_AID_OUT_OF_CREDITS)
+ return DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS;
+ else if (r0.field.unit == DLB_ALARM_HW_UNIT_CHP &&
+ r0.field.aid == DLB_ALARM_HW_CHP_AID_ILLEGAL_ENQ)
+ return DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ;
+ else if (r0.field.unit == DLB_ALARM_HW_UNIT_LSP &&
+ r0.field.aid == DLB_ALARM_HW_LSP_AID_EXCESS_TOKEN_POPS)
+ return DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS;
+ else if (r0.field.source == DLB_ALARM_HW_SOURCE_SYS &&
+ r0.field.aid == DLB_ALARM_SYS_AID_ILLEGAL_HCW)
+ return DLB_DOMAIN_ALERT_ILLEGAL_HCW;
+ else if (r0.field.source == DLB_ALARM_HW_SOURCE_SYS &&
+ r0.field.aid == DLB_ALARM_SYS_AID_ILLEGAL_QID)
+ return DLB_DOMAIN_ALERT_ILLEGAL_QID;
+ else if (r0.field.source == DLB_ALARM_HW_SOURCE_SYS &&
+ r0.field.aid == DLB_ALARM_SYS_AID_DISABLED_QID)
+ return DLB_DOMAIN_ALERT_DISABLED_QID;
+ else
+ return NUM_DLB_DOMAIN_ALERTS;
+}
+
+void dlb_process_ingress_error_interrupt(struct dlb_hw *hw)
+{
+ union dlb_sys_alarm_pf_synd0 r0;
+ union dlb_sys_alarm_pf_synd1 r1;
+ union dlb_sys_alarm_pf_synd2 r2;
+ u32 alert_id;
+ int i;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_PF_SYND0);
+
+ if (r0.field.valid) {
+ r1.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_PF_SYND1);
+ r2.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_PF_SYND2);
+
+ alert_id = dlb_alert_id(r0);
+
+ dlb_log_pf_vf_syndrome(hw,
+ "PF Ingress error alarm",
+ r0, r1, r2, alert_id);
+
+ dlb_clear_syndrome_register(hw, DLB_SYS_ALARM_PF_SYND0);
+
+ dlb_process_ingress_error(hw, r0, alert_id, false, 0);
+ }
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_VF_SYND0(i));
+
+ if (!r0.field.valid)
+ continue;
+
+ r1.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_VF_SYND1(i));
+ r2.val = DLB_CSR_RD(hw, DLB_SYS_ALARM_VF_SYND2(i));
+
+ alert_id = dlb_alert_id(r0);
+
+ dlb_log_pf_vf_syndrome(hw,
+ "VF Ingress error alarm",
+ r0, r1, r2, alert_id);
+
+ dlb_clear_syndrome_register(hw,
+ DLB_SYS_ALARM_VF_SYND0(i));
+
+ dlb_process_ingress_error(hw, r0, alert_id, true, i);
+ }
+
+ dlb_ack_msix_interrupt(hw, DLB_INT_INGRESS_ERROR);
+}
+
+int dlb_get_group_sequence_numbers(struct dlb_hw *hw, unsigned int group_id)
+{
+ if (group_id >= DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+ return -EINVAL;
+
+ return hw->rsrcs.sn_groups[group_id].sequence_numbers_per_queue;
+}
+
+int dlb_get_group_sequence_number_occupancy(struct dlb_hw *hw,
+ unsigned int group_id)
+{
+ if (group_id >= DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+ return -EINVAL;
+
+ return dlb_sn_group_used_slots(&hw->rsrcs.sn_groups[group_id]);
+}
+
+static void dlb_log_set_group_sequence_numbers(struct dlb_hw *hw,
+ unsigned int group_id,
+ unsigned long val)
+{
+ DLB_HW_INFO(hw, "DLB set group sequence numbers:\n");
+ DLB_HW_INFO(hw, "\tGroup ID: %u\n", group_id);
+ DLB_HW_INFO(hw, "\tValue: %lu\n", val);
+}
+
+int dlb_set_group_sequence_numbers(struct dlb_hw *hw,
+ unsigned int group_id,
+ unsigned long val)
+{
+ u32 valid_allocations[6] = {32, 64, 128, 256, 512, 1024};
+ union dlb_ro_pipe_grp_sn_mode r0 = { {0} };
+ struct dlb_sn_group *group;
+ int mode;
+
+ if (group_id >= DLB_MAX_NUM_SEQUENCE_NUMBER_GROUPS)
+ return -EINVAL;
+
+ group = &hw->rsrcs.sn_groups[group_id];
+
+ /* Once the first load-balanced queue using an SN group is configured,
+ * the group cannot be changed.
+ */
+ if (group->slot_use_bitmap != 0)
+ return -EPERM;
+
+ for (mode = 0; mode < DLB_MAX_NUM_SEQUENCE_NUMBER_MODES; mode++)
+ if (val == valid_allocations[mode])
+ break;
+
+ if (mode == DLB_MAX_NUM_SEQUENCE_NUMBER_MODES)
+ return -EINVAL;
+
+ group->mode = mode;
+ group->sequence_numbers_per_queue = val;
+
+ r0.field.sn_mode_0 = hw->rsrcs.sn_groups[0].mode;
+ r0.field.sn_mode_1 = hw->rsrcs.sn_groups[1].mode;
+ r0.field.sn_mode_2 = hw->rsrcs.sn_groups[2].mode;
+ r0.field.sn_mode_3 = hw->rsrcs.sn_groups[3].mode;
+
+ DLB_CSR_WR(hw, DLB_RO_PIPE_GRP_SN_MODE, r0.val);
+
+ dlb_log_set_group_sequence_numbers(hw, group_id, val);
+
+ return 0;
+}
+
+void dlb_disable_dp_vasr_feature(struct dlb_hw *hw)
+{
+ union dlb_dp_dir_csr_ctrl r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_DP_DIR_CSR_CTRL);
+
+ r0.field.cfg_vasr_dis = 1;
+
+ DLB_CSR_WR(hw, DLB_DP_DIR_CSR_CTRL, r0.val);
+}
+
+void dlb_enable_excess_tokens_alarm(struct dlb_hw *hw)
+{
+ union dlb_chp_cfg_chp_csr_ctrl r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_CHP_CFG_CHP_CSR_CTRL);
+
+ r0.val |= 1 << DLB_CHP_CFG_EXCESS_TOKENS_SHIFT;
+
+ DLB_CSR_WR(hw, DLB_CHP_CFG_CHP_CSR_CTRL, r0.val);
+}
+
+void dlb_disable_excess_tokens_alarm(struct dlb_hw *hw)
+{
+ union dlb_chp_cfg_chp_csr_ctrl r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_CHP_CFG_CHP_CSR_CTRL);
+
+ r0.val &= ~(1 << DLB_CHP_CFG_EXCESS_TOKENS_SHIFT);
+
+ DLB_CSR_WR(hw, DLB_CHP_CFG_CHP_CSR_CTRL, r0.val);
+}
+
+static int dlb_reset_hw_resource(struct dlb_hw *hw, int type, int id)
+{
+ union dlb_cfg_mstr_diag_reset_sts r0 = { {0} };
+ union dlb_cfg_mstr_bcast_reset_vf_start r1 = { {0} };
+ int i;
+
+ r1.field.vf_reset_start = 1;
+
+ r1.field.vf_reset_type = type;
+ r1.field.vf_reset_id = id;
+
+ DLB_CSR_WR(hw, DLB_CFG_MSTR_BCAST_RESET_VF_START, r1.val);
+
+ /* Wait for hardware to complete. This is a finite-time operation,
+ * but set a loop bound just in case.
+ */
+ for (i = 0; i < 1024 * 1024; i++) {
+ r0.val = DLB_CSR_RD(hw, DLB_CFG_MSTR_DIAG_RESET_STS);
+
+ if (r0.field.chp_vf_reset_done &&
+ r0.field.rop_vf_reset_done &&
+ r0.field.lsp_vf_reset_done &&
+ r0.field.nalb_vf_reset_done &&
+ r0.field.ap_vf_reset_done &&
+ r0.field.dp_vf_reset_done &&
+ r0.field.qed_vf_reset_done &&
+ r0.field.dqed_vf_reset_done &&
+ r0.field.aqed_vf_reset_done)
+ return 0;
+
+ os_udelay(1);
+ }
+
+ return -ETIMEDOUT;
+}
+
+static int dlb_domain_reset_hw_resources(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *dir_port;
+ struct dlb_ldb_queue *ldb_queue;
+ struct dlb_ldb_port *ldb_port;
+ struct dlb_credit_pool *pool;
+ int ret;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_POOL_LDB,
+ pool->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_POOL_DIR,
+ pool->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, ldb_queue, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_QID_LDB,
+ ldb_queue->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_port, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_QID_DIR,
+ dir_port->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, ldb_port, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_CQ_LDB,
+ ldb_port->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_port, iter) {
+ ret = dlb_reset_hw_resource(hw,
+ VF_RST_TYPE_CQ_DIR,
+ dir_port->id.phys_id);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static u32 dlb_ldb_cq_inflight_count(struct dlb_hw *hw,
+ struct dlb_ldb_port *port)
+{
+ union dlb_lsp_cq_ldb_infl_cnt r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ_LDB_INFL_CNT(port->id.phys_id));
+
+ return r0.field.count;
+}
+
+static u32 dlb_ldb_cq_token_count(struct dlb_hw *hw,
+ struct dlb_ldb_port *port)
+{
+ union dlb_lsp_cq_ldb_tkn_cnt r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ_LDB_TKN_CNT(port->id.phys_id));
+
+ return r0.field.token_count;
+}
+
+static int dlb_drain_ldb_cq(struct dlb_hw *hw, struct dlb_ldb_port *port)
+{
+ u32 infl_cnt, tkn_cnt;
+ unsigned int i;
+
+ infl_cnt = dlb_ldb_cq_inflight_count(hw, port);
+
+ /* Account for the initial token count, which is used in order to
+ * provide a CQ with depth less than 8.
+ */
+ tkn_cnt = dlb_ldb_cq_token_count(hw, port) - port->init_tkn_cnt;
+
+ if (infl_cnt || tkn_cnt) {
+ struct dlb_hcw hcw_mem[8], *hcw;
+ void *pp_addr;
+
+ pp_addr = os_map_producer_port(hw, port->id.phys_id, true);
+
+ /* Point hcw to a 64B-aligned location */
+ hcw = (struct dlb_hcw *)((uintptr_t)&hcw_mem[4] & ~0x3F);
+
+ /* Program the first HCW for a completion and token return and
+ * the other HCWs as NOOPS
+ */
+
+ memset(hcw, 0, 4 * sizeof(*hcw));
+ hcw->qe_comp = (infl_cnt > 0);
+ hcw->cq_token = (tkn_cnt > 0);
+ hcw->lock_id = tkn_cnt - 1;
+
+ /* Return tokens in the first HCW */
+ os_enqueue_four_hcws(hw, hcw, pp_addr);
+
+ hcw->cq_token = 0;
+
+ /* Issue remaining completions (if any) */
+ for (i = 1; i < infl_cnt; i++)
+ os_enqueue_four_hcws(hw, hcw, pp_addr);
+
+ os_fence_hcw(hw, pp_addr);
+
+ os_unmap_producer_port(hw, pp_addr);
+ }
+
+ return 0;
+}
+
+static int dlb_domain_wait_for_ldb_cqs_to_empty(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ int i;
+
+ for (i = 0; i < DLB_MAX_CQ_COMP_CHECK_LOOPS; i++) {
+ if (dlb_ldb_cq_inflight_count(hw, port) == 0)
+ break;
+ }
+
+ if (i == DLB_MAX_CQ_COMP_CHECK_LOOPS) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to flush load-balanced port %d's completions.\n",
+ __func__, port->id.phys_id);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int dlb_domain_reset_software_state(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_ldb_queue *tmp_ldb_queue __attribute__((unused));
+ struct dlb_dir_pq_pair *tmp_dir_port __attribute__((unused));
+ struct dlb_ldb_port *tmp_ldb_port __attribute__((unused));
+ struct dlb_credit_pool *tmp_pool __attribute__((unused));
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ struct dlb_ldb_queue *ldb_queue;
+ struct dlb_dir_pq_pair *dir_port;
+ struct dlb_ldb_port *ldb_port;
+ struct dlb_credit_pool *pool;
+
+ struct dlb_function_resources *rsrcs;
+ struct dlb_list_head *list;
+ int ret;
+
+ rsrcs = domain->parent_func;
+
+ /* Move the domain's ldb queues to the function's avail list */
+ list = &domain->used_ldb_queues;
+ DLB_DOM_LIST_FOR_SAFE(*list, ldb_queue, tmp_ldb_queue, iter1, iter2) {
+ if (ldb_queue->sn_cfg_valid) {
+ struct dlb_sn_group *grp;
+
+ grp = &hw->rsrcs.sn_groups[ldb_queue->sn_group];
+
+ dlb_sn_group_free_slot(grp, ldb_queue->sn_slot);
+ ldb_queue->sn_cfg_valid = false;
+ }
+
+ ldb_queue->owned = false;
+ ldb_queue->num_mappings = 0;
+ ldb_queue->num_pending_additions = 0;
+
+ dlb_list_del(&domain->used_ldb_queues, &ldb_queue->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_queues, &ldb_queue->func_list);
+ rsrcs->num_avail_ldb_queues++;
+ }
+
+ list = &domain->avail_ldb_queues;
+ DLB_DOM_LIST_FOR_SAFE(*list, ldb_queue, tmp_ldb_queue, iter1, iter2) {
+ ldb_queue->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_queues,
+ &ldb_queue->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_queues,
+ &ldb_queue->func_list);
+ rsrcs->num_avail_ldb_queues++;
+ }
+
+ /* Move the domain's ldb ports to the function's avail list */
+ list = &domain->used_ldb_ports;
+ DLB_DOM_LIST_FOR_SAFE(*list, ldb_port, tmp_ldb_port, iter1, iter2) {
+ int i;
+
+ ldb_port->owned = false;
+ ldb_port->configured = false;
+ ldb_port->num_pending_removals = 0;
+ ldb_port->num_mappings = 0;
+ for (i = 0; i < DLB_MAX_NUM_QIDS_PER_LDB_CQ; i++)
+ ldb_port->qid_map[i].state = DLB_QUEUE_UNMAPPED;
+
+ dlb_list_del(&domain->used_ldb_ports, &ldb_port->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_ports, &ldb_port->func_list);
+ rsrcs->num_avail_ldb_ports++;
+ }
+
+ list = &domain->avail_ldb_ports;
+ DLB_DOM_LIST_FOR_SAFE(*list, ldb_port, tmp_ldb_port, iter1, iter2) {
+ ldb_port->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_ports, &ldb_port->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_ports, &ldb_port->func_list);
+ rsrcs->num_avail_ldb_ports++;
+ }
+
+ /* Move the domain's dir ports to the function's avail list */
+ list = &domain->used_dir_pq_pairs;
+ DLB_DOM_LIST_FOR_SAFE(*list, dir_port, tmp_dir_port, iter1, iter2) {
+ dir_port->owned = false;
+ dir_port->port_configured = false;
+
+ dlb_list_del(&domain->used_dir_pq_pairs,
+ &dir_port->domain_list);
+
+ dlb_list_add(&rsrcs->avail_dir_pq_pairs,
+ &dir_port->func_list);
+ rsrcs->num_avail_dir_pq_pairs++;
+ }
+
+ list = &domain->avail_dir_pq_pairs;
+ DLB_DOM_LIST_FOR_SAFE(*list, dir_port, tmp_dir_port, iter1, iter2) {
+ dir_port->owned = false;
+
+ dlb_list_del(&domain->avail_dir_pq_pairs,
+ &dir_port->domain_list);
+
+ dlb_list_add(&rsrcs->avail_dir_pq_pairs,
+ &dir_port->func_list);
+ rsrcs->num_avail_dir_pq_pairs++;
+ }
+
+ /* Return hist list entries to the function */
+ ret = dlb_bitmap_set_range(rsrcs->avail_hist_list_entries,
+ domain->hist_list_entry_base,
+ domain->total_hist_list_entries);
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain hist list base doesn't match the function's bitmap.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ domain->total_hist_list_entries = 0;
+ domain->avail_hist_list_entries = 0;
+ domain->hist_list_entry_base = 0;
+ domain->hist_list_entry_offset = 0;
+
+ /* Return QED entries to the function */
+ ret = dlb_bitmap_set_range(rsrcs->avail_qed_freelist_entries,
+ domain->qed_freelist.base,
+ (domain->qed_freelist.bound -
+ domain->qed_freelist.base));
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain QED base doesn't match the function's bitmap.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ domain->qed_freelist.base = 0;
+ domain->qed_freelist.bound = 0;
+ domain->qed_freelist.offset = 0;
+
+ /* Return DQED entries back to the function */
+ ret = dlb_bitmap_set_range(rsrcs->avail_dqed_freelist_entries,
+ domain->dqed_freelist.base,
+ (domain->dqed_freelist.bound -
+ domain->dqed_freelist.base));
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain DQED base doesn't match the function's bitmap.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ domain->dqed_freelist.base = 0;
+ domain->dqed_freelist.bound = 0;
+ domain->dqed_freelist.offset = 0;
+
+ /* Return AQED entries back to the function */
+ ret = dlb_bitmap_set_range(rsrcs->avail_aqed_freelist_entries,
+ domain->aqed_freelist.base,
+ (domain->aqed_freelist.bound -
+ domain->aqed_freelist.base));
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: domain AQED base doesn't match the function's bitmap.\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ domain->aqed_freelist.base = 0;
+ domain->aqed_freelist.bound = 0;
+ domain->aqed_freelist.offset = 0;
+
+ /* Return ldb credit pools back to the function's avail list */
+ list = &domain->used_ldb_credit_pools;
+ DLB_DOM_LIST_FOR_SAFE(*list, pool, tmp_pool, iter1, iter2) {
+ pool->owned = false;
+ pool->configured = false;
+
+ dlb_list_del(&domain->used_ldb_credit_pools,
+ &pool->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_credit_pools,
+ &pool->func_list);
+ rsrcs->num_avail_ldb_credit_pools++;
+ }
+
+ list = &domain->avail_ldb_credit_pools;
+ DLB_DOM_LIST_FOR_SAFE(*list, pool, tmp_pool, iter1, iter2) {
+ pool->owned = false;
+
+ dlb_list_del(&domain->avail_ldb_credit_pools,
+ &pool->domain_list);
+ dlb_list_add(&rsrcs->avail_ldb_credit_pools,
+ &pool->func_list);
+ rsrcs->num_avail_ldb_credit_pools++;
+ }
+
+ /* Move dir credit pools back to the function */
+ list = &domain->used_dir_credit_pools;
+ DLB_DOM_LIST_FOR_SAFE(*list, pool, tmp_pool, iter1, iter2) {
+ pool->owned = false;
+ pool->configured = false;
+
+ dlb_list_del(&domain->used_dir_credit_pools,
+ &pool->domain_list);
+ dlb_list_add(&rsrcs->avail_dir_credit_pools,
+ &pool->func_list);
+ rsrcs->num_avail_dir_credit_pools++;
+ }
+
+ list = &domain->avail_dir_credit_pools;
+ DLB_DOM_LIST_FOR_SAFE(*list, pool, tmp_pool, iter1, iter2) {
+ pool->owned = false;
+
+ dlb_list_del(&domain->avail_dir_credit_pools,
+ &pool->domain_list);
+ dlb_list_add(&rsrcs->avail_dir_credit_pools,
+ &pool->func_list);
+ rsrcs->num_avail_dir_credit_pools++;
+ }
+
+ domain->num_pending_removals = 0;
+ domain->num_pending_additions = 0;
+ domain->configured = false;
+ domain->started = false;
+
+ /* Move the domain out of the used_domains list and back to the
+ * function's avail_domains list.
+ */
+ dlb_list_del(&rsrcs->used_domains, &domain->func_list);
+ dlb_list_add(&rsrcs->avail_domains, &domain->func_list);
+ rsrcs->num_avail_domains++;
+
+ return 0;
+}
+
+void dlb_resource_reset(struct dlb_hw *hw)
+{
+ struct dlb_domain *domain, *next __attribute__((unused));
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ int i;
+
+ for (i = 0; i < DLB_MAX_NUM_VFS; i++) {
+ DLB_FUNC_LIST_FOR_SAFE(hw->vf[i].used_domains, domain,
+ next, iter1, iter2)
+ dlb_domain_reset_software_state(hw, domain);
+ }
+
+ DLB_FUNC_LIST_FOR_SAFE(hw->pf.used_domains, domain, next, iter1, iter2)
+ dlb_domain_reset_software_state(hw, domain);
+}
+
+static u32 dlb_dir_queue_depth(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *queue)
+{
+ union dlb_lsp_qid_dir_enqueue_cnt r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_QID_DIR_ENQUEUE_CNT(queue->id.phys_id));
+
+ return r0.field.count;
+}
+
+static bool dlb_dir_queue_is_empty(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *queue)
+{
+ return dlb_dir_queue_depth(hw, queue) == 0;
+}
+
+static void dlb_log_get_dir_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 queue_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB get directed queue depth:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+ DLB_HW_INFO(hw, "\tQueue ID: %d\n", queue_id);
+}
+
+int dlb_hw_get_dir_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_get_dir_queue_depth_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *queue;
+ struct dlb_domain *domain;
+ int id;
+
+ id = domain_id;
+
+ dlb_log_get_dir_queue_depth(hw, domain_id, args->queue_id,
+ vf_request, vf_id);
+
+ domain = dlb_get_domain_from_id(hw, id, vf_request, vf_id);
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -EINVAL;
+ }
+
+ id = args->queue_id;
+
+ queue = dlb_get_domain_used_dir_pq(id, vf_request, domain);
+ if (!queue) {
+ resp->status = DLB_ST_INVALID_QID;
+ return -EINVAL;
+ }
+
+ resp->id = dlb_dir_queue_depth(hw, queue);
+
+ return 0;
+}
+
+static void
+dlb_log_pending_port_unmaps_args(struct dlb_hw *hw,
+ struct dlb_pending_port_unmaps_args *args,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB pending port unmaps arguments:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tPort ID: %d\n", args->port_id);
+}
+
+int dlb_hw_pending_port_unmaps(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_pending_port_unmaps_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ struct dlb_ldb_port *port;
+
+ dlb_log_pending_port_unmaps_args(hw, args, vf_request, vf_id);
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -EINVAL;
+ }
+
+ port = dlb_get_domain_used_ldb_port(args->port_id, vf_request, domain);
+ if (!port || !port->configured) {
+ resp->status = DLB_ST_INVALID_PORT_ID;
+ return -EINVAL;
+ }
+
+ resp->id = port->num_pending_removals;
+
+ return 0;
+}
+
+/* Returns whether the queue is empty, including its inflight and replay
+ * counts.
+ */
+static bool dlb_ldb_queue_is_empty(struct dlb_hw *hw,
+ struct dlb_ldb_queue *queue)
+{
+ union dlb_lsp_qid_ldb_replay_cnt r0;
+ union dlb_lsp_qid_aqed_active_cnt r1;
+ union dlb_lsp_qid_atq_enqueue_cnt r2;
+ union dlb_lsp_qid_ldb_enqueue_cnt r3;
+ union dlb_lsp_qid_ldb_infl_cnt r4;
+
+ r0.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_REPLAY_CNT(queue->id.phys_id));
+ if (r0.val)
+ return false;
+
+ r1.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_AQED_ACTIVE_CNT(queue->id.phys_id));
+ if (r1.val)
+ return false;
+
+ r2.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_ATQ_ENQUEUE_CNT(queue->id.phys_id));
+ if (r2.val)
+ return false;
+
+ r3.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_ENQUEUE_CNT(queue->id.phys_id));
+ if (r3.val)
+ return false;
+
+ r4.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_INFL_CNT(queue->id.phys_id));
+ if (r4.val)
+ return false;
+
+ return true;
+}
+
+static void dlb_log_get_ldb_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 queue_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB get load-balanced queue depth:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+ DLB_HW_INFO(hw, "\tQueue ID: %d\n", queue_id);
+}
+
+int dlb_hw_get_ldb_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_get_ldb_queue_depth_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_req,
+ unsigned int vf_id)
+{
+ union dlb_lsp_qid_aqed_active_cnt r0;
+ union dlb_lsp_qid_atq_enqueue_cnt r1;
+ union dlb_lsp_qid_ldb_enqueue_cnt r2;
+ struct dlb_ldb_queue *queue;
+ struct dlb_domain *domain;
+
+ dlb_log_get_ldb_queue_depth(hw, domain_id, args->queue_id,
+ vf_req, vf_id);
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_req, vf_id);
+ if (!domain) {
+ resp->status = DLB_ST_INVALID_DOMAIN_ID;
+ return -EINVAL;
+ }
+
+ queue = dlb_get_domain_ldb_queue(args->queue_id, vf_req, domain);
+ if (!queue) {
+ resp->status = DLB_ST_INVALID_QID;
+ return -EINVAL;
+ }
+
+ r0.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_AQED_ACTIVE_CNT(queue->id.phys_id));
+
+ r1.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_ATQ_ENQUEUE_CNT(queue->id.phys_id));
+
+ r2.val = DLB_CSR_RD(hw,
+ DLB_LSP_QID_LDB_ENQUEUE_CNT(queue->id.phys_id));
+
+ resp->id = r0.val + r1.val + r2.val;
+
+ return 0;
+}
+
+static u32 dlb_dir_cq_token_count(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *port)
+{
+ union dlb_lsp_cq_dir_tkn_cnt r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_LSP_CQ_DIR_TKN_CNT(port->id.phys_id));
+
+ return r0.field.count;
+}
+
+static int dlb_domain_verify_reset_success(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *dir_port;
+ struct dlb_ldb_port *ldb_port;
+ struct dlb_credit_pool *pool;
+ struct dlb_ldb_queue *queue;
+
+ /* Confirm that all credits are returned to the domain's credit pools */
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter) {
+ union dlb_chp_dqed_fl_pop_ptr r0;
+ union dlb_chp_dqed_fl_push_ptr r1;
+
+ r0.val = DLB_CSR_RD(hw,
+ DLB_CHP_DQED_FL_POP_PTR(pool->id.phys_id));
+
+ r1.val = DLB_CSR_RD(hw,
+ DLB_CHP_DQED_FL_PUSH_PTR(pool->id.phys_id));
+
+ if (r0.field.pop_ptr != r1.field.push_ptr ||
+ r0.field.generation == r1.field.generation) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to refill directed pool %d's credits.\n",
+ __func__, pool->id.phys_id);
+ return -EFAULT;
+ }
+ }
+
+ /* Confirm that all the domain's queues' inflight counts and AQED
+ * active counts are 0.
+ */
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter) {
+ if (!dlb_ldb_queue_is_empty(hw, queue)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty ldb queue %d\n",
+ __func__, queue->id.phys_id);
+ return -EFAULT;
+ }
+ }
+
+ /* Confirm that all the domain's CQs' inflight and token counts are 0. */
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, ldb_port, iter) {
+ if (dlb_ldb_cq_inflight_count(hw, ldb_port) ||
+ dlb_ldb_cq_token_count(hw, ldb_port)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty ldb port %d\n",
+ __func__, ldb_port->id.phys_id);
+ return -EFAULT;
+ }
+ }
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_port, iter) {
+ if (!dlb_dir_queue_is_empty(hw, dir_port)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty dir queue %d\n",
+ __func__, dir_port->id.phys_id);
+ return -EFAULT;
+ }
+
+ if (dlb_dir_cq_token_count(hw, dir_port)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty dir port %d\n",
+ __func__, dir_port->id.phys_id);
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static void __dlb_domain_reset_ldb_port_registers(struct dlb_hw *hw,
+ struct dlb_ldb_port *port)
+{
+ union dlb_chp_ldb_pp_state_reset r0 = { {0} };
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_CRD_REQ_STATE(port->id.phys_id),
+ DLB_CHP_LDB_PP_CRD_REQ_STATE_RST);
+
+ /* Reset the port's load-balanced and directed credit state */
+ r0.field.dir_type = 0;
+ r0.field.reset_pp_state = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_STATE_RESET(port->id.phys_id),
+ r0.val);
+
+ r0.field.dir_type = 1;
+ r0.field.reset_pp_state = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_STATE_RESET(port->id.phys_id),
+ r0.val);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_DIR_PUSH_PTR(port->id.phys_id),
+ DLB_CHP_LDB_PP_DIR_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_LDB_PUSH_PTR(port->id.phys_id),
+ DLB_CHP_LDB_PP_LDB_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_LDB_MIN_CRD_QNT(port->id.phys_id),
+ DLB_CHP_LDB_PP_LDB_MIN_CRD_QNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_LDB_CRD_LWM(port->id.phys_id),
+ DLB_CHP_LDB_PP_LDB_CRD_LWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_LDB_CRD_HWM(port->id.phys_id),
+ DLB_CHP_LDB_PP_LDB_CRD_HWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_LDB_PP2POOL(port->id.phys_id),
+ DLB_CHP_LDB_LDB_PP2POOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_DIR_MIN_CRD_QNT(port->id.phys_id),
+ DLB_CHP_LDB_PP_DIR_MIN_CRD_QNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_DIR_CRD_LWM(port->id.phys_id),
+ DLB_CHP_LDB_PP_DIR_CRD_LWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_DIR_CRD_HWM(port->id.phys_id),
+ DLB_CHP_LDB_PP_DIR_CRD_HWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_DIR_PP2POOL(port->id.phys_id),
+ DLB_CHP_LDB_DIR_PP2POOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP2LDBPOOL(port->id.phys_id),
+ DLB_SYS_LDB_PP2LDBPOOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP2DIRPOOL(port->id.phys_id),
+ DLB_SYS_LDB_PP2DIRPOOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_HIST_LIST_LIM(port->id.phys_id),
+ DLB_CHP_HIST_LIST_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_HIST_LIST_BASE(port->id.phys_id),
+ DLB_CHP_HIST_LIST_BASE_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_HIST_LIST_POP_PTR(port->id.phys_id),
+ DLB_CHP_HIST_LIST_POP_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_HIST_LIST_PUSH_PTR(port->id.phys_id),
+ DLB_CHP_HIST_LIST_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_WPTR(port->id.phys_id),
+ DLB_CHP_LDB_CQ_WPTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+ DLB_CHP_LDB_CQ_INT_DEPTH_THRSH_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_TMR_THRESHOLD(port->id.phys_id),
+ DLB_CHP_LDB_CQ_TMR_THRESHOLD_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_INT_ENB(port->id.phys_id),
+ DLB_CHP_LDB_CQ_INT_ENB_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_INFL_LIM(port->id.phys_id),
+ DLB_LSP_CQ_LDB_INFL_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ2PRIOV(port->id.phys_id),
+ DLB_LSP_CQ2PRIOV_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_TOT_SCH_CNT_CTRL(port->id.phys_id),
+ DLB_LSP_CQ_LDB_TOT_SCH_CNT_CTRL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_TKN_DEPTH_SEL(port->id.phys_id),
+ DLB_LSP_CQ_LDB_TKN_DEPTH_SEL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+ DLB_CHP_LDB_CQ_TKN_DEPTH_SEL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_LDB_DSBL(port->id.phys_id),
+ DLB_LSP_CQ_LDB_DSBL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ2VF_PF(port->id.phys_id),
+ DLB_SYS_LDB_CQ2VF_PF_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP2VF_PF(port->id.phys_id),
+ DLB_SYS_LDB_PP2VF_PF_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_ADDR_L(port->id.phys_id),
+ DLB_SYS_LDB_CQ_ADDR_L_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_ADDR_U(port->id.phys_id),
+ DLB_SYS_LDB_CQ_ADDR_U_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP_ADDR_L(port->id.phys_id),
+ DLB_SYS_LDB_PP_ADDR_L_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP_ADDR_U(port->id.phys_id),
+ DLB_SYS_LDB_PP_ADDR_U_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP_V(port->id.phys_id),
+ DLB_SYS_LDB_PP_V_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP2VAS(port->id.phys_id),
+ DLB_SYS_LDB_PP2VAS_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_CQ_ISR(port->id.phys_id),
+ DLB_SYS_LDB_CQ_ISR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_WBUF_LDB_FLAGS(port->id.phys_id),
+ DLB_SYS_WBUF_LDB_FLAGS_RST);
+}
+
+static void dlb_domain_reset_ldb_port_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ __dlb_domain_reset_ldb_port_registers(hw, port);
+}
+
+static void __dlb_domain_reset_dir_port_registers(struct dlb_hw *hw,
+ struct dlb_dir_pq_pair *port)
+{
+ union dlb_chp_dir_pp_state_reset r0 = { {0} };
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_CRD_REQ_STATE(port->id.phys_id),
+ DLB_CHP_DIR_PP_CRD_REQ_STATE_RST);
+
+ /* Reset the port's load-balanced and directed credit state */
+ r0.field.dir_type = 0;
+ r0.field.reset_pp_state = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_STATE_RESET(port->id.phys_id),
+ r0.val);
+
+ r0.field.dir_type = 1;
+ r0.field.reset_pp_state = 1;
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_STATE_RESET(port->id.phys_id),
+ r0.val);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_PUSH_PTR(port->id.phys_id),
+ DLB_CHP_DIR_PP_DIR_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_PUSH_PTR(port->id.phys_id),
+ DLB_CHP_DIR_PP_LDB_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_MIN_CRD_QNT(port->id.phys_id),
+ DLB_CHP_DIR_PP_LDB_MIN_CRD_QNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_CRD_LWM(port->id.phys_id),
+ DLB_CHP_DIR_PP_LDB_CRD_LWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_LDB_CRD_HWM(port->id.phys_id),
+ DLB_CHP_DIR_PP_LDB_CRD_HWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_LDB_PP2POOL(port->id.phys_id),
+ DLB_CHP_DIR_LDB_PP2POOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_MIN_CRD_QNT(port->id.phys_id),
+ DLB_CHP_DIR_PP_DIR_MIN_CRD_QNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_CRD_LWM(port->id.phys_id),
+ DLB_CHP_DIR_PP_DIR_CRD_LWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_DIR_CRD_HWM(port->id.phys_id),
+ DLB_CHP_DIR_PP_DIR_CRD_HWM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_DIR_PP2POOL(port->id.phys_id),
+ DLB_CHP_DIR_DIR_PP2POOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2LDBPOOL(port->id.phys_id),
+ DLB_SYS_DIR_PP2LDBPOOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2DIRPOOL(port->id.phys_id),
+ DLB_SYS_DIR_PP2DIRPOOL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_WPTR(port->id.phys_id),
+ DLB_CHP_DIR_CQ_WPTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI(port->id.phys_id),
+ DLB_LSP_CQ_DIR_TKN_DEPTH_SEL_DSI_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_TKN_DEPTH_SEL(port->id.phys_id),
+ DLB_CHP_DIR_CQ_TKN_DEPTH_SEL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_CQ_DIR_DSBL(port->id.phys_id),
+ DLB_LSP_CQ_DIR_DSBL_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_INT_DEPTH_THRSH(port->id.phys_id),
+ DLB_CHP_DIR_CQ_INT_DEPTH_THRSH_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_TMR_THRESHOLD(port->id.phys_id),
+ DLB_CHP_DIR_CQ_TMR_THRESHOLD_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_INT_ENB(port->id.phys_id),
+ DLB_CHP_DIR_CQ_INT_ENB_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ2VF_PF(port->id.phys_id),
+ DLB_SYS_DIR_CQ2VF_PF_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2VF_PF(port->id.phys_id),
+ DLB_SYS_DIR_PP2VF_PF_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_ADDR_L(port->id.phys_id),
+ DLB_SYS_DIR_CQ_ADDR_L_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_ADDR_U(port->id.phys_id),
+ DLB_SYS_DIR_CQ_ADDR_U_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP_ADDR_L(port->id.phys_id),
+ DLB_SYS_DIR_PP_ADDR_L_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP_ADDR_U(port->id.phys_id),
+ DLB_SYS_DIR_PP_ADDR_U_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP_V(port->id.phys_id),
+ DLB_SYS_DIR_PP_V_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP2VAS(port->id.phys_id),
+ DLB_SYS_DIR_PP2VAS_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_CQ_ISR(port->id.phys_id),
+ DLB_SYS_DIR_CQ_ISR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_WBUF_DIR_FLAGS(port->id.phys_id),
+ DLB_SYS_WBUF_DIR_FLAGS_RST);
+}
+
+static void dlb_domain_reset_dir_port_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter)
+ __dlb_domain_reset_dir_port_registers(hw, port);
+}
+
+static void dlb_domain_reset_ldb_queue_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_queue *queue;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter) {
+ DLB_CSR_WR(hw,
+ DLB_AQED_PIPE_FL_LIM(queue->id.phys_id),
+ DLB_AQED_PIPE_FL_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_AQED_PIPE_FL_BASE(queue->id.phys_id),
+ DLB_AQED_PIPE_FL_BASE_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_AQED_PIPE_FL_POP_PTR(queue->id.phys_id),
+ DLB_AQED_PIPE_FL_POP_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_AQED_PIPE_FL_PUSH_PTR(queue->id.phys_id),
+ DLB_AQED_PIPE_FL_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_AQED_PIPE_QID_FID_LIM(queue->id.phys_id),
+ DLB_AQED_PIPE_QID_FID_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_AQED_ACTIVE_LIM(queue->id.phys_id),
+ DLB_LSP_QID_AQED_ACTIVE_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_LSP_QID_LDB_INFL_LIM(queue->id.phys_id),
+ DLB_LSP_QID_LDB_INFL_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_QID_V(queue->id.phys_id),
+ DLB_SYS_LDB_QID_V_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_ORD_QID_SN(queue->id.phys_id),
+ DLB_CHP_ORD_QID_SN_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_ORD_QID_SN_MAP(queue->id.phys_id),
+ DLB_CHP_ORD_QID_SN_MAP_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_RO_PIPE_QID2GRPSLT(queue->id.phys_id),
+ DLB_RO_PIPE_QID2GRPSLT_RST);
+ }
+}
+
+static void dlb_domain_reset_dir_queue_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *queue;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, queue, iter) {
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_QID_V(queue->id.phys_id),
+ DLB_SYS_DIR_QID_V_RST);
+ }
+}
+
+static void dlb_domain_reset_ldb_pool_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter) {
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_POOL_CRD_LIM(pool->id.phys_id),
+ DLB_CHP_LDB_POOL_CRD_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_POOL_CRD_CNT(pool->id.phys_id),
+ DLB_CHP_LDB_POOL_CRD_CNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_QED_FL_BASE(pool->id.phys_id),
+ DLB_CHP_QED_FL_BASE_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_QED_FL_LIM(pool->id.phys_id),
+ DLB_CHP_QED_FL_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_QED_FL_PUSH_PTR(pool->id.phys_id),
+ DLB_CHP_QED_FL_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_QED_FL_POP_PTR(pool->id.phys_id),
+ DLB_CHP_QED_FL_POP_PTR_RST);
+ }
+}
+
+static void dlb_domain_reset_dir_pool_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter) {
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_POOL_CRD_LIM(pool->id.phys_id),
+ DLB_CHP_DIR_POOL_CRD_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_POOL_CRD_CNT(pool->id.phys_id),
+ DLB_CHP_DIR_POOL_CRD_CNT_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DQED_FL_BASE(pool->id.phys_id),
+ DLB_CHP_DQED_FL_BASE_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DQED_FL_LIM(pool->id.phys_id),
+ DLB_CHP_DQED_FL_LIM_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DQED_FL_PUSH_PTR(pool->id.phys_id),
+ DLB_CHP_DQED_FL_PUSH_PTR_RST);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DQED_FL_POP_PTR(pool->id.phys_id),
+ DLB_CHP_DQED_FL_POP_PTR_RST);
+ }
+}
+
+static void dlb_domain_reset_registers(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ dlb_domain_reset_ldb_port_registers(hw, domain);
+
+ dlb_domain_reset_dir_port_registers(hw, domain);
+
+ dlb_domain_reset_ldb_queue_registers(hw, domain);
+
+ dlb_domain_reset_dir_queue_registers(hw, domain);
+
+ dlb_domain_reset_ldb_pool_registers(hw, domain);
+
+ dlb_domain_reset_dir_pool_registers(hw, domain);
+}
+
+static int dlb_domain_drain_ldb_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ bool toggle_port)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+ int ret;
+
+ /* If the domain hasn't been started, there's no traffic to drain */
+ if (!domain->started)
+ return 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ if (toggle_port)
+ dlb_ldb_port_cq_disable(hw, port);
+
+ ret = dlb_drain_ldb_cq(hw, port);
+ if (ret < 0)
+ return ret;
+
+ if (toggle_port)
+ dlb_ldb_port_cq_enable(hw, port);
+ }
+
+ return 0;
+}
+
+static bool dlb_domain_mapped_queues_empty(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_queue *queue;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter) {
+ if (queue->num_mappings == 0)
+ continue;
+
+ if (!dlb_ldb_queue_is_empty(hw, queue))
+ return false;
+ }
+
+ return true;
+}
+
+static int dlb_domain_drain_mapped_queues(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ int i, ret;
+
+ /* If the domain hasn't been started, there's no traffic to drain */
+ if (!domain->started)
+ return 0;
+
+ if (domain->num_pending_removals > 0) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to unmap domain queues\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ for (i = 0; i < DLB_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+ ret = dlb_domain_drain_ldb_cqs(hw, domain, true);
+ if (ret < 0)
+ return ret;
+
+ if (dlb_domain_mapped_queues_empty(hw, domain))
+ break;
+ }
+
+ if (i == DLB_MAX_QID_EMPTY_CHECK_LOOPS) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty queues\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ /* Drain the CQs one more time. For the queues to have gone empty, the
+ * scheduler must have delivered their remaining QEs to the CQs.
+ */
+ ret = dlb_domain_drain_ldb_cqs(hw, domain, true);
+ if (ret < 0)
+ return ret;
+
+ return 0;
+}
+
+static int dlb_domain_drain_unmapped_queue(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ struct dlb_ldb_queue *queue)
+{
+ struct dlb_ldb_port *port;
+ int ret;
+
+ /* If a domain has LDB queues, it must have LDB ports */
+ if (dlb_list_empty(&domain->used_ldb_ports)) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: No configured LDB ports\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ port = DLB_DOM_LIST_HEAD(domain->used_ldb_ports, typeof(*port));
+
+ /* If necessary, free up a QID slot in this CQ */
+ if (port->num_mappings == DLB_MAX_NUM_QIDS_PER_LDB_CQ) {
+ struct dlb_ldb_queue *mapped_queue;
+
+ mapped_queue = &hw->rsrcs.ldb_queues[port->qid_map[0].qid];
+
+ ret = dlb_ldb_port_unmap_qid(hw, port, mapped_queue);
+ if (ret)
+ return ret;
+ }
+
+ ret = dlb_ldb_port_map_qid_dynamic(hw, port, queue, 0);
+ if (ret)
+ return ret;
+
+ return dlb_domain_drain_mapped_queues(hw, domain);
+}
+
+static int dlb_domain_drain_unmapped_queues(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_queue *queue;
+ int ret;
+
+ /* If the domain hasn't been started, there's no traffic to drain */
+ if (!domain->started)
+ return 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter) {
+ if (queue->num_mappings != 0 ||
+ dlb_ldb_queue_is_empty(hw, queue))
+ continue;
+
+ ret = dlb_domain_drain_unmapped_queue(hw, domain, queue);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static void dlb_drain_dir_cq(struct dlb_hw *hw, struct dlb_dir_pq_pair *port)
+{
+ unsigned int port_id = port->id.phys_id;
+ u32 cnt;
+
+ /* Return any outstanding tokens */
+ cnt = dlb_dir_cq_token_count(hw, port);
+
+ if (cnt != 0) {
+ struct dlb_hcw hcw_mem[8], *hcw;
+ void *pp_addr;
+
+ pp_addr = os_map_producer_port(hw, port_id, false);
+
+ /* Point hcw to a 64B-aligned location */
+ hcw = (struct dlb_hcw *)((uintptr_t)&hcw_mem[4] & ~0x3F);
+
+ /* Program the first HCW for a batch token return and
+ * the rest as NOOPS
+ */
+ memset(hcw, 0, 4 * sizeof(*hcw));
+ hcw->cq_token = 1;
+ hcw->lock_id = cnt - 1;
+
+ os_enqueue_four_hcws(hw, hcw, pp_addr);
+
+ os_fence_hcw(hw, pp_addr);
+
+ os_unmap_producer_port(hw, pp_addr);
+ }
+}
+
+static int dlb_domain_drain_dir_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ bool toggle_port)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter) {
+ /* Can't drain a port if it's not configured, and there's
+ * nothing to drain if its queue is unconfigured.
+ */
+ if (!port->port_configured || !port->queue_configured)
+ continue;
+
+ if (toggle_port)
+ dlb_dir_port_cq_disable(hw, port);
+
+ dlb_drain_dir_cq(hw, port);
+
+ if (toggle_port)
+ dlb_dir_port_cq_enable(hw, port);
+ }
+
+ return 0;
+}
+
+static bool dlb_domain_dir_queues_empty(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *queue;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, queue, iter) {
+ if (!dlb_dir_queue_is_empty(hw, queue))
+ return false;
+ }
+
+ return true;
+}
+
+static int dlb_domain_drain_dir_queues(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ int i;
+
+ /* If the domain hasn't been started, there's no traffic to drain */
+ if (!domain->started)
+ return 0;
+
+ for (i = 0; i < DLB_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+ dlb_domain_drain_dir_cqs(hw, domain, true);
+
+ if (dlb_domain_dir_queues_empty(hw, domain))
+ break;
+ }
+
+ if (i == DLB_MAX_QID_EMPTY_CHECK_LOOPS) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: failed to empty queues\n",
+ __func__);
+ return -EFAULT;
+ }
+
+ /* Drain the CQs one more time. For the queues to have gone empty, the
+ * scheduler must have delivered their remaining QEs to the CQs.
+ */
+ dlb_domain_drain_dir_cqs(hw, domain, true);
+
+ return 0;
+}
+
+static void dlb_domain_disable_dir_producer_ports(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+ union dlb_sys_dir_pp_v r1 = { {0} };
+
+ r1.field.pp_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter)
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_PP_V(port->id.phys_id),
+ r1.val);
+}
+
+static void dlb_domain_disable_ldb_producer_ports(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_ldb_pp_v r1 = { {0} };
+ struct dlb_ldb_port *port;
+
+ r1.field.pp_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_PP_V(port->id.phys_id),
+ r1.val);
+
+ hw->pf.num_enabled_ldb_ports--;
+ }
+}
+
+static void dlb_domain_disable_dir_vpps(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_vf_dir_vpp_v r1 = { {0} };
+ struct dlb_dir_pq_pair *port;
+
+ r1.field.vpp_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter) {
+ unsigned int offs;
+
+ offs = vf_id * DLB_MAX_NUM_DIR_PORTS + port->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_DIR_VPP_V(offs), r1.val);
+ }
+}
+
+static void dlb_domain_disable_ldb_vpps(struct dlb_hw *hw,
+ struct dlb_domain *domain,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_vf_ldb_vpp_v r1 = { {0} };
+ struct dlb_ldb_port *port;
+
+ r1.field.vpp_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ unsigned int offs;
+
+ offs = vf_id * DLB_MAX_NUM_LDB_PORTS + port->id.virt_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_VF_LDB_VPP_V(offs), r1.val);
+ }
+}
+
+static void dlb_domain_disable_dir_pools(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_dir_pool_enbld r0 = { {0} };
+ struct dlb_credit_pool *pool;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter)
+ DLB_CSR_WR(hw,
+ DLB_SYS_DIR_POOL_ENBLD(pool->id.phys_id),
+ r0.val);
+}
+
+static void dlb_domain_disable_ldb_pools(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_ldb_pool_enbld r0 = { {0} };
+ struct dlb_credit_pool *pool;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter)
+ DLB_CSR_WR(hw,
+ DLB_SYS_LDB_POOL_ENBLD(pool->id.phys_id),
+ r0.val);
+}
+
+static void dlb_domain_disable_ldb_seq_checks(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_chp_sn_chk_enbl r1 = { {0} };
+ struct dlb_ldb_port *port;
+
+ r1.field.en = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ DLB_CSR_WR(hw,
+ DLB_CHP_SN_CHK_ENBL(port->id.phys_id),
+ r1.val);
+}
+
+static void dlb_domain_disable_ldb_port_crd_updates(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_chp_ldb_pp_crd_req_state r0 = { {0} };
+ struct dlb_ldb_port *port;
+
+ r0.field.no_pp_credit_update = 1;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter)
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_PP_CRD_REQ_STATE(port->id.phys_id),
+ r0.val);
+}
+
+static void dlb_domain_disable_ldb_port_interrupts(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_chp_ldb_cq_int_enb r0 = { {0} };
+ union dlb_chp_ldb_cq_wd_enb r1 = { {0} };
+ struct dlb_ldb_port *port;
+
+ r0.field.en_tim = 0;
+ r0.field.en_depth = 0;
+
+ r1.field.wd_enable = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_INT_ENB(port->id.phys_id),
+ r0.val);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_LDB_CQ_WD_ENB(port->id.phys_id),
+ r1.val);
+ }
+}
+
+static void dlb_domain_disable_dir_port_interrupts(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_chp_dir_cq_int_enb r0 = { {0} };
+ union dlb_chp_dir_cq_wd_enb r1 = { {0} };
+ struct dlb_dir_pq_pair *port;
+
+ r0.field.en_tim = 0;
+ r0.field.en_depth = 0;
+
+ r1.field.wd_enable = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter) {
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_INT_ENB(port->id.phys_id),
+ r0.val);
+
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_CQ_WD_ENB(port->id.phys_id),
+ r1.val);
+ }
+}
+
+static void dlb_domain_disable_dir_port_crd_updates(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_chp_dir_pp_crd_req_state r0 = { {0} };
+ struct dlb_dir_pq_pair *port;
+
+ r0.field.no_pp_credit_update = 1;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter)
+ DLB_CSR_WR(hw,
+ DLB_CHP_DIR_PP_CRD_REQ_STATE(port->id.phys_id),
+ r0.val);
+}
+
+static void dlb_domain_disable_ldb_queue_write_perms(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ int domain_offset = domain->id.phys_id * DLB_MAX_NUM_LDB_QUEUES;
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_ldb_vasqid_v r0 = { {0} };
+ struct dlb_ldb_queue *queue;
+
+ r0.field.vasqid_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter) {
+ int idx = domain_offset + queue->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_LDB_VASQID_V(idx), r0.val);
+ }
+}
+
+static void dlb_domain_disable_dir_queue_write_perms(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ int domain_offset = domain->id.phys_id * DLB_MAX_NUM_DIR_PORTS;
+ struct dlb_list_entry *iter __attribute__((unused));
+ union dlb_sys_dir_vasqid_v r0 = { {0} };
+ struct dlb_dir_pq_pair *port;
+
+ r0.field.vasqid_v = 0;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter) {
+ int idx = domain_offset + port->id.phys_id;
+
+ DLB_CSR_WR(hw, DLB_SYS_DIR_VASQID_V(idx), r0.val);
+ }
+}
+
+static void dlb_domain_disable_dir_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *port;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, port, iter) {
+ port->enabled = false;
+
+ dlb_dir_port_cq_disable(hw, port);
+ }
+}
+
+static void dlb_domain_disable_ldb_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ port->enabled = false;
+
+ dlb_ldb_port_cq_disable(hw, port);
+ }
+}
+
+static void dlb_domain_enable_ldb_cqs(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_ldb_port *port;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, port, iter) {
+ port->enabled = true;
+
+ dlb_ldb_port_cq_enable(hw, port);
+ }
+}
+
+static int dlb_domain_wait_for_ldb_pool_refill(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ /* Confirm that all credits are returned to the domain's credit pools */
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter) {
+ union dlb_chp_qed_fl_push_ptr r0;
+ union dlb_chp_qed_fl_pop_ptr r1;
+ unsigned long pop_offs, push_offs;
+ int i;
+
+ push_offs = DLB_CHP_QED_FL_PUSH_PTR(pool->id.phys_id);
+ pop_offs = DLB_CHP_QED_FL_POP_PTR(pool->id.phys_id);
+
+ for (i = 0; i < DLB_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+ r0.val = DLB_CSR_RD(hw, push_offs);
+
+ r1.val = DLB_CSR_RD(hw, pop_offs);
+
+ /* Break early if the freelist is replenished */
+ if (r1.field.pop_ptr == r0.field.push_ptr &&
+ r1.field.generation != r0.field.generation) {
+ break;
+ }
+ }
+
+ /* Error if the freelist is not full */
+ if (r1.field.pop_ptr != r0.field.push_ptr ||
+ r1.field.generation == r0.field.generation) {
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int dlb_domain_wait_for_dir_pool_refill(struct dlb_hw *hw,
+ struct dlb_domain *domain)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_credit_pool *pool;
+
+ /* Confirm that all credits are returned to the domain's credit pools */
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter) {
+ union dlb_chp_dqed_fl_push_ptr r0;
+ union dlb_chp_dqed_fl_pop_ptr r1;
+ unsigned long pop_offs, push_offs;
+ int i;
+
+ push_offs = DLB_CHP_DQED_FL_PUSH_PTR(pool->id.phys_id);
+ pop_offs = DLB_CHP_DQED_FL_POP_PTR(pool->id.phys_id);
+
+ for (i = 0; i < DLB_MAX_QID_EMPTY_CHECK_LOOPS; i++) {
+ r0.val = DLB_CSR_RD(hw, push_offs);
+
+ r1.val = DLB_CSR_RD(hw, pop_offs);
+
+ /* Break early if the freelist is replenished */
+ if (r1.field.pop_ptr == r0.field.push_ptr &&
+ r1.field.generation != r0.field.generation) {
+ break;
+ }
+ }
+
+ /* Error if the freelist is not full */
+ if (r1.field.pop_ptr != r0.field.push_ptr ||
+ r1.field.generation == r0.field.generation) {
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static void dlb_log_reset_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ DLB_HW_INFO(hw, "DLB reset domain:\n");
+ if (vf_request)
+ DLB_HW_INFO(hw, "(Request from VF %d)\n", vf_id);
+ DLB_HW_INFO(hw, "\tDomain ID: %d\n", domain_id);
+}
+
+/**
+ * dlb_reset_domain() - Reset a DLB scheduling domain and its associated
+ * hardware resources.
+ * @hw: Contains the current state of the DLB hardware.
+ * @domain_id: Domain ID.
+ * @vf_request: Indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Note: User software *must* stop sending to this domain's producer ports
+ * before invoking this function, otherwise undefined behavior will result.
+ *
+ * Return: returns < 0 on error, 0 otherwise.
+ */
+int dlb_reset_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_domain *domain;
+ int ret;
+
+ dlb_log_reset_domain(hw, domain_id, vf_request, vf_id);
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain || !domain->configured)
+ return -EINVAL;
+
+ if (vf_request) {
+ dlb_domain_disable_dir_vpps(hw, domain, vf_id);
+
+ dlb_domain_disable_ldb_vpps(hw, domain, vf_id);
+ }
+
+ /* For each queue owned by this domain, disable its write permissions to
+ * cause any traffic sent to it to be dropped. Well-behaved software
+ * should not be sending QEs at this point.
+ */
+ dlb_domain_disable_dir_queue_write_perms(hw, domain);
+
+ dlb_domain_disable_ldb_queue_write_perms(hw, domain);
+
+ /* Disable credit updates and turn off completion tracking on all the
+ * domain's PPs.
+ */
+ dlb_domain_disable_dir_port_crd_updates(hw, domain);
+
+ dlb_domain_disable_ldb_port_crd_updates(hw, domain);
+
+ dlb_domain_disable_dir_port_interrupts(hw, domain);
+
+ dlb_domain_disable_ldb_port_interrupts(hw, domain);
+
+ dlb_domain_disable_ldb_seq_checks(hw, domain);
+
+ /* Disable the LDB CQs and drain them in order to complete the map and
+ * unmap procedures, which require zero CQ inflights and zero QID
+ * inflights respectively.
+ */
+ dlb_domain_disable_ldb_cqs(hw, domain);
+
+ ret = dlb_domain_drain_ldb_cqs(hw, domain, false);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_domain_wait_for_ldb_cqs_to_empty(hw, domain);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_domain_finish_unmap_qid_procedures(hw, domain);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_domain_finish_map_qid_procedures(hw, domain);
+ if (ret < 0)
+ return ret;
+
+ /* Re-enable the CQs in order to drain the mapped queues. */
+ dlb_domain_enable_ldb_cqs(hw, domain);
+
+ ret = dlb_domain_drain_mapped_queues(hw, domain);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_domain_drain_unmapped_queues(hw, domain);
+ if (ret < 0)
+ return ret;
+
+ ret = dlb_domain_wait_for_ldb_pool_refill(hw, domain);
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: LDB credits failed to refill\n",
+ __func__);
+ return ret;
+ }
+
+ /* Done draining LDB QEs, so disable the CQs. */
+ dlb_domain_disable_ldb_cqs(hw, domain);
+
+ /* Directed queues are reset in dlb_domain_reset_hw_resources(), but
+ * that process doesn't decrement the directed queue size counters used
+ * by SMON for its average DQED depth measurement. So, we manually drain
+ * the directed queues here.
+ */
+ dlb_domain_drain_dir_queues(hw, domain);
+
+ ret = dlb_domain_wait_for_dir_pool_refill(hw, domain);
+ if (ret) {
+ DLB_HW_ERR(hw,
+ "[%s()] Internal error: DIR credits failed to refill\n",
+ __func__);
+ return ret;
+ }
+
+ /* Done draining DIR QEs, so disable the CQs. */
+ dlb_domain_disable_dir_cqs(hw, domain);
+
+ dlb_domain_disable_dir_producer_ports(hw, domain);
+
+ dlb_domain_disable_ldb_producer_ports(hw, domain);
+
+ dlb_domain_disable_dir_pools(hw, domain);
+
+ dlb_domain_disable_ldb_pools(hw, domain);
+
+ /* Reset the QID, credit pool, and CQ hardware.
+ *
+ * Note: DLB 1.0 A0 h/w does not disarm CQ interrupts during VAS reset.
+ * A spurious interrupt can occur on subsequent use of a reset CQ.
+ */
+ ret = dlb_domain_reset_hw_resources(hw, domain);
+ if (ret)
+ return ret;
+
+ ret = dlb_domain_verify_reset_success(hw, domain);
+ if (ret)
+ return ret;
+
+ dlb_domain_reset_registers(hw, domain);
+
+ /* Hardware reset complete. Reset the domain's software state */
+ ret = dlb_domain_reset_software_state(hw, domain);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+int dlb_reset_vf(struct dlb_hw *hw, unsigned int vf_id)
+{
+ struct dlb_domain *domain, *next __attribute__((unused));
+ struct dlb_list_entry *it1 __attribute__((unused));
+ struct dlb_list_entry *it2 __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+
+ if (vf_id >= DLB_MAX_NUM_VFS) {
+ DLB_HW_ERR(hw, "[%s()] Internal error: invalid VF ID %d\n",
+ __func__, vf_id);
+ return -EFAULT;
+ }
+
+ rsrcs = &hw->vf[vf_id];
+
+ DLB_FUNC_LIST_FOR_SAFE(rsrcs->used_domains, domain, next, it1, it2) {
+ int ret = dlb_reset_domain(hw,
+ domain->id.virt_id,
+ true,
+ vf_id);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+int dlb_ldb_port_owned_by_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_ldb_port *port;
+ struct dlb_domain *domain;
+
+ if (vf_request && vf_id >= DLB_MAX_NUM_VFS)
+ return -1;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain || !domain->configured)
+ return -EINVAL;
+
+ port = dlb_get_domain_ldb_port(port_id, vf_request, domain);
+
+ if (!port)
+ return -EINVAL;
+
+ return port->domain_id.phys_id == domain->id.phys_id;
+}
+
+int dlb_dir_port_owned_by_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_dir_pq_pair *port;
+ struct dlb_domain *domain;
+
+ if (vf_request && vf_id >= DLB_MAX_NUM_VFS)
+ return -1;
+
+ domain = dlb_get_domain_from_id(hw, domain_id, vf_request, vf_id);
+
+ if (!domain || !domain->configured)
+ return -EINVAL;
+
+ port = dlb_get_domain_dir_pq(port_id, vf_request, domain);
+
+ if (!port)
+ return -EINVAL;
+
+ return port->domain_id.phys_id == domain->id.phys_id;
+}
+
+int dlb_hw_get_num_resources(struct dlb_hw *hw,
+ struct dlb_get_num_resources_args *arg,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_function_resources *rsrcs;
+ struct dlb_bitmap *map;
+
+ if (vf_request && vf_id >= DLB_MAX_NUM_VFS)
+ return -1;
+
+ if (vf_request)
+ rsrcs = &hw->vf[vf_id];
+ else
+ rsrcs = &hw->pf;
+
+ arg->num_sched_domains = rsrcs->num_avail_domains;
+
+ arg->num_ldb_queues = rsrcs->num_avail_ldb_queues;
+
+ arg->num_ldb_ports = rsrcs->num_avail_ldb_ports;
+
+ arg->num_dir_ports = rsrcs->num_avail_dir_pq_pairs;
+
+ map = rsrcs->avail_aqed_freelist_entries;
+
+ arg->num_atomic_inflights = dlb_bitmap_count(map);
+
+ arg->max_contiguous_atomic_inflights =
+ dlb_bitmap_longest_set_range(map);
+
+ map = rsrcs->avail_hist_list_entries;
+
+ arg->num_hist_list_entries = dlb_bitmap_count(map);
+
+ arg->max_contiguous_hist_list_entries =
+ dlb_bitmap_longest_set_range(map);
+
+ map = rsrcs->avail_qed_freelist_entries;
+
+ arg->num_ldb_credits = dlb_bitmap_count(map);
+
+ arg->max_contiguous_ldb_credits = dlb_bitmap_longest_set_range(map);
+
+ map = rsrcs->avail_dqed_freelist_entries;
+
+ arg->num_dir_credits = dlb_bitmap_count(map);
+
+ arg->max_contiguous_dir_credits = dlb_bitmap_longest_set_range(map);
+
+ arg->num_ldb_credit_pools = rsrcs->num_avail_ldb_credit_pools;
+
+ arg->num_dir_credit_pools = rsrcs->num_avail_dir_credit_pools;
+
+ return 0;
+}
+
+int dlb_hw_get_num_used_resources(struct dlb_hw *hw,
+ struct dlb_get_num_resources_args *arg,
+ bool vf_request,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter1 __attribute__((unused));
+ struct dlb_list_entry *iter2 __attribute__((unused));
+ struct dlb_function_resources *rsrcs;
+ struct dlb_domain *domain;
+
+ if (vf_request && vf_id >= DLB_MAX_NUM_VFS)
+ return -1;
+
+ rsrcs = (vf_request) ? &hw->vf[vf_id] : &hw->pf;
+
+ memset(arg, 0, sizeof(*arg));
+
+ DLB_FUNC_LIST_FOR(rsrcs->used_domains, domain, iter1) {
+ struct dlb_dir_pq_pair *dir_port;
+ struct dlb_ldb_port *ldb_port;
+ struct dlb_credit_pool *pool;
+ struct dlb_ldb_queue *queue;
+
+ arg->num_sched_domains++;
+
+ arg->num_atomic_inflights +=
+ domain->aqed_freelist.bound -
+ domain->aqed_freelist.base;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_queues, queue, iter2)
+ arg->num_ldb_queues++;
+ DLB_DOM_LIST_FOR(domain->avail_ldb_queues, queue, iter2)
+ arg->num_ldb_queues++;
+
+ DLB_DOM_LIST_FOR(domain->used_ldb_ports, ldb_port, iter2)
+ arg->num_ldb_ports++;
+ DLB_DOM_LIST_FOR(domain->avail_ldb_ports, ldb_port, iter2)
+ arg->num_ldb_ports++;
+
+ DLB_DOM_LIST_FOR(domain->used_dir_pq_pairs, dir_port, iter2)
+ arg->num_dir_ports++;
+ DLB_DOM_LIST_FOR(domain->avail_dir_pq_pairs, dir_port, iter2)
+ arg->num_dir_ports++;
+
+ arg->num_ldb_credits +=
+ domain->qed_freelist.bound -
+ domain->qed_freelist.base;
+
+ DLB_DOM_LIST_FOR(domain->avail_ldb_credit_pools, pool, iter2)
+ arg->num_ldb_credit_pools++;
+ DLB_DOM_LIST_FOR(domain->used_ldb_credit_pools, pool, iter2) {
+ arg->num_ldb_credit_pools++;
+ arg->num_ldb_credits += pool->total_credits;
+ }
+
+ arg->num_dir_credits +=
+ domain->dqed_freelist.bound -
+ domain->dqed_freelist.base;
+
+ DLB_DOM_LIST_FOR(domain->avail_dir_credit_pools, pool, iter2)
+ arg->num_dir_credit_pools++;
+ DLB_DOM_LIST_FOR(domain->used_dir_credit_pools, pool, iter2) {
+ arg->num_dir_credit_pools++;
+ arg->num_dir_credits += pool->total_credits;
+ }
+
+ arg->num_hist_list_entries += domain->total_hist_list_entries;
+ }
+
+ return 0;
+}
+
+static inline bool dlb_ldb_port_owned_by_vf(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 port_id)
+{
+ return (hw->rsrcs.ldb_ports[port_id].id.vf_owned &&
+ hw->rsrcs.ldb_ports[port_id].id.vf_id == vf_id);
+}
+
+static inline bool dlb_dir_port_owned_by_vf(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 port_id)
+{
+ return (hw->rsrcs.dir_pq_pairs[port_id].id.vf_owned &&
+ hw->rsrcs.dir_pq_pairs[port_id].id.vf_id == vf_id);
+}
+
+void dlb_send_async_pf_to_vf_msg(struct dlb_hw *hw, unsigned int vf_id)
+{
+ union dlb_func_pf_pf2vf_mailbox_isr r0 = { {0} };
+
+ r0.field.isr = 1 << vf_id;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_PF2VF_MAILBOX_ISR(0), r0.val);
+}
+
+bool dlb_pf_to_vf_complete(struct dlb_hw *hw, unsigned int vf_id)
+{
+ union dlb_func_pf_pf2vf_mailbox_isr r0;
+
+ r0.val = DLB_FUNC_RD(hw, DLB_FUNC_PF_PF2VF_MAILBOX_ISR(vf_id));
+
+ return (r0.val & (1 << vf_id)) == 0;
+}
+
+void dlb_send_async_vf_to_pf_msg(struct dlb_hw *hw)
+{
+ union dlb_func_vf_vf2pf_mailbox_isr r0 = { {0} };
+
+ r0.field.isr = 1;
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_VF2PF_MAILBOX_ISR, r0.val);
+}
+
+bool dlb_vf_to_pf_complete(struct dlb_hw *hw)
+{
+ union dlb_func_vf_vf2pf_mailbox_isr r0;
+
+ r0.val = DLB_FUNC_RD(hw, DLB_FUNC_VF_VF2PF_MAILBOX_ISR);
+
+ return (r0.field.isr == 0);
+}
+
+bool dlb_vf_flr_complete(struct dlb_hw *hw)
+{
+ union dlb_func_vf_vf_reset_in_progress r0;
+
+ r0.val = DLB_FUNC_RD(hw, DLB_FUNC_VF_VF_RESET_IN_PROGRESS);
+
+ return (r0.field.reset_in_progress == 0);
+}
+
+int dlb_pf_read_vf_mbox_req(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len)
+{
+ u32 buf[DLB_VF2PF_REQ_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_VF2PF_REQ_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > VF->PF mailbox req size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ if (len == 0) {
+ DLB_HW_ERR(hw, "[%s()] invalid len (0)\n", __func__);
+ return -EINVAL;
+ }
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_VF2PF_REQ_BASE_WORD;
+
+ buf[i] = DLB_FUNC_RD(hw, DLB_FUNC_PF_VF2PF_MAILBOX(vf_id, idx));
+ }
+
+ memcpy(data, buf, len);
+
+ return 0;
+}
+
+int dlb_pf_read_vf_mbox_resp(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len)
+{
+ u32 buf[DLB_VF2PF_RESP_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_VF2PF_RESP_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > VF->PF mailbox resp size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_VF2PF_RESP_BASE_WORD;
+
+ buf[i] = DLB_FUNC_RD(hw, DLB_FUNC_PF_VF2PF_MAILBOX(vf_id, idx));
+ }
+
+ memcpy(data, buf, len);
+
+ return 0;
+}
+
+int dlb_pf_write_vf_mbox_resp(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len)
+{
+ u32 buf[DLB_PF2VF_RESP_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_PF2VF_RESP_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > PF->VF mailbox resp size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ memcpy(buf, data, len);
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_PF2VF_RESP_BASE_WORD;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_PF2VF_MAILBOX(vf_id, idx), buf[i]);
+ }
+
+ return 0;
+}
+
+int dlb_pf_write_vf_mbox_req(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len)
+{
+ u32 buf[DLB_PF2VF_REQ_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_PF2VF_REQ_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > PF->VF mailbox req size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ memcpy(buf, data, len);
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_PF2VF_REQ_BASE_WORD;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_PF_PF2VF_MAILBOX(vf_id, idx), buf[i]);
+ }
+
+ return 0;
+}
+
+int dlb_vf_read_pf_mbox_resp(struct dlb_hw *hw, void *data, int len)
+{
+ u32 buf[DLB_PF2VF_RESP_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_PF2VF_RESP_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > PF->VF mailbox resp size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ if (len == 0) {
+ DLB_HW_ERR(hw, "[%s()] invalid len (0)\n", __func__);
+ return -EINVAL;
+ }
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_PF2VF_RESP_BASE_WORD;
+
+ buf[i] = DLB_FUNC_RD(hw, DLB_FUNC_VF_PF2VF_MAILBOX(idx));
+ }
+
+ memcpy(data, buf, len);
+
+ return 0;
+}
+
+int dlb_vf_read_pf_mbox_req(struct dlb_hw *hw, void *data, int len)
+{
+ u32 buf[DLB_PF2VF_REQ_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_PF2VF_REQ_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > PF->VF mailbox req size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_PF2VF_REQ_BASE_WORD;
+
+ buf[i] = DLB_FUNC_RD(hw, DLB_FUNC_VF_PF2VF_MAILBOX(idx));
+ }
+
+ memcpy(data, buf, len);
+
+ return 0;
+}
+
+int dlb_vf_write_pf_mbox_req(struct dlb_hw *hw, void *data, int len)
+{
+ u32 buf[DLB_VF2PF_REQ_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_VF2PF_REQ_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > VF->PF mailbox req size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ memcpy(buf, data, len);
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_VF2PF_REQ_BASE_WORD;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_VF2PF_MAILBOX(idx), buf[i]);
+ }
+
+ return 0;
+}
+
+int dlb_vf_write_pf_mbox_resp(struct dlb_hw *hw, void *data, int len)
+{
+ u32 buf[DLB_VF2PF_RESP_BYTES / 4];
+ int num_words;
+ int i;
+
+ if (len > DLB_VF2PF_RESP_BYTES) {
+ DLB_HW_ERR(hw, "[%s()] len (%d) > VF->PF mailbox resp size\n",
+ __func__, len);
+ return -EINVAL;
+ }
+
+ memcpy(buf, data, len);
+
+ /* Round up len to the nearest 4B boundary, since the mailbox registers
+ * are 32b wide.
+ */
+ num_words = len / 4;
+ if (len % 4 != 0)
+ num_words++;
+
+ for (i = 0; i < num_words; i++) {
+ u32 idx = i + DLB_VF2PF_RESP_BASE_WORD;
+
+ DLB_FUNC_WR(hw, DLB_FUNC_VF_VF2PF_MAILBOX(idx), buf[i]);
+ }
+
+ return 0;
+}
+
+bool dlb_vf_is_locked(struct dlb_hw *hw, unsigned int vf_id)
+{
+ return hw->vf[vf_id].locked;
+}
+
+static void dlb_vf_set_rsrc_virt_ids(struct dlb_function_resources *rsrcs,
+ unsigned int vf_id)
+{
+ struct dlb_list_entry *iter __attribute__((unused));
+ struct dlb_dir_pq_pair *dir_port;
+ struct dlb_ldb_queue *ldb_queue;
+ struct dlb_ldb_port *ldb_port;
+ struct dlb_credit_pool *pool;
+ struct dlb_domain *domain;
+ int i;
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_domains, domain, iter) {
+ domain->id.virt_id = i;
+ domain->id.vf_owned = true;
+ domain->id.vf_id = vf_id;
+ i++;
+ }
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_queues, ldb_queue, iter) {
+ ldb_queue->id.virt_id = i;
+ ldb_queue->id.vf_owned = true;
+ ldb_queue->id.vf_id = vf_id;
+ i++;
+ }
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_ports, ldb_port, iter) {
+ ldb_port->id.virt_id = i;
+ ldb_port->id.vf_owned = true;
+ ldb_port->id.vf_id = vf_id;
+ i++;
+ }
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_dir_pq_pairs, dir_port, iter) {
+ dir_port->id.virt_id = i;
+ dir_port->id.vf_owned = true;
+ dir_port->id.vf_id = vf_id;
+ i++;
+ }
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_ldb_credit_pools, pool, iter) {
+ pool->id.virt_id = i;
+ pool->id.vf_owned = true;
+ pool->id.vf_id = vf_id;
+ i++;
+ }
+
+ i = 0;
+ DLB_FUNC_LIST_FOR(rsrcs->avail_dir_credit_pools, pool, iter) {
+ pool->id.virt_id = i;
+ pool->id.vf_owned = true;
+ pool->id.vf_id = vf_id;
+ i++;
+ }
+}
+
+void dlb_lock_vf(struct dlb_hw *hw, unsigned int vf_id)
+{
+ struct dlb_function_resources *rsrcs = &hw->vf[vf_id];
+
+ rsrcs->locked = true;
+
+ dlb_vf_set_rsrc_virt_ids(rsrcs, vf_id);
+}
+
+void dlb_unlock_vf(struct dlb_hw *hw, unsigned int vf_id)
+{
+ hw->vf[vf_id].locked = false;
+}
+
+int dlb_reset_vf_resources(struct dlb_hw *hw, unsigned int vf_id)
+{
+ if (vf_id >= DLB_MAX_NUM_VFS)
+ return -EINVAL;
+
+ /* If the VF is locked, its resource assignment can't be changed */
+ if (dlb_vf_is_locked(hw, vf_id))
+ return -EPERM;
+
+ dlb_update_vf_sched_domains(hw, vf_id, 0);
+ dlb_update_vf_ldb_queues(hw, vf_id, 0);
+ dlb_update_vf_ldb_ports(hw, vf_id, 0);
+ dlb_update_vf_dir_ports(hw, vf_id, 0);
+ dlb_update_vf_ldb_credit_pools(hw, vf_id, 0);
+ dlb_update_vf_dir_credit_pools(hw, vf_id, 0);
+ dlb_update_vf_ldb_credits(hw, vf_id, 0);
+ dlb_update_vf_dir_credits(hw, vf_id, 0);
+ dlb_update_vf_hist_list_entries(hw, vf_id, 0);
+ dlb_update_vf_atomic_inflights(hw, vf_id, 0);
+
+ return 0;
+}
+
+void dlb_hw_enable_sparse_ldb_cq_mode(struct dlb_hw *hw)
+{
+ union dlb_sys_cq_mode r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_CQ_MODE);
+
+ r0.field.ldb_cq64 = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_CQ_MODE, r0.val);
+}
+
+void dlb_hw_enable_sparse_dir_cq_mode(struct dlb_hw *hw)
+{
+ union dlb_sys_cq_mode r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_CQ_MODE);
+
+ r0.field.dir_cq64 = 1;
+
+ DLB_CSR_WR(hw, DLB_SYS_CQ_MODE, r0.val);
+}
+
+void dlb_hw_set_qe_arbiter_weights(struct dlb_hw *hw, u8 weight[8])
+{
+ union dlb_atm_pipe_ctrl_arb_weights_rdy_bin r0 = { {0} };
+ union dlb_nalb_pipe_ctrl_arb_weights_tqpri_nalb_0 r1 = { {0} };
+ union dlb_nalb_pipe_ctrl_arb_weights_tqpri_nalb_1 r2 = { {0} };
+ union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_replay_0 r3 = { {0} };
+ union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_replay_1 r4 = { {0} };
+ union dlb_dp_cfg_ctrl_arb_weights_tqpri_replay_0 r5 = { {0} };
+ union dlb_dp_cfg_ctrl_arb_weights_tqpri_replay_1 r6 = { {0} };
+ union dlb_dp_cfg_ctrl_arb_weights_tqpri_dir_0 r7 = { {0} };
+ union dlb_dp_cfg_ctrl_arb_weights_tqpri_dir_1 r8 = { {0} };
+ union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_atq_0 r9 = { {0} };
+ union dlb_nalb_pipe_cfg_ctrl_arb_weights_tqpri_atq_1 r10 = { {0} };
+ union dlb_atm_pipe_cfg_ctrl_arb_weights_sched_bin r11 = { {0} };
+ union dlb_aqed_pipe_cfg_ctrl_arb_weights_tqpri_atm_0 r12 = { {0} };
+
+ r0.field.bin0 = weight[1];
+ r0.field.bin1 = weight[3];
+ r0.field.bin2 = weight[5];
+ r0.field.bin3 = weight[7];
+
+ r1.field.pri0 = weight[0];
+ r1.field.pri1 = weight[1];
+ r1.field.pri2 = weight[2];
+ r1.field.pri3 = weight[3];
+ r2.field.pri4 = weight[4];
+ r2.field.pri5 = weight[5];
+ r2.field.pri6 = weight[6];
+ r2.field.pri7 = weight[7];
+
+ r3.field.pri0 = weight[0];
+ r3.field.pri1 = weight[1];
+ r3.field.pri2 = weight[2];
+ r3.field.pri3 = weight[3];
+ r4.field.pri4 = weight[4];
+ r4.field.pri5 = weight[5];
+ r4.field.pri6 = weight[6];
+ r4.field.pri7 = weight[7];
+
+ r5.field.pri0 = weight[0];
+ r5.field.pri1 = weight[1];
+ r5.field.pri2 = weight[2];
+ r5.field.pri3 = weight[3];
+ r6.field.pri4 = weight[4];
+ r6.field.pri5 = weight[5];
+ r6.field.pri6 = weight[6];
+ r6.field.pri7 = weight[7];
+
+ r7.field.pri0 = weight[0];
+ r7.field.pri1 = weight[1];
+ r7.field.pri2 = weight[2];
+ r7.field.pri3 = weight[3];
+ r8.field.pri4 = weight[4];
+ r8.field.pri5 = weight[5];
+ r8.field.pri6 = weight[6];
+ r8.field.pri7 = weight[7];
+
+ r9.field.pri0 = weight[0];
+ r9.field.pri1 = weight[1];
+ r9.field.pri2 = weight[2];
+ r9.field.pri3 = weight[3];
+ r10.field.pri4 = weight[4];
+ r10.field.pri5 = weight[5];
+ r10.field.pri6 = weight[6];
+ r10.field.pri7 = weight[7];
+
+ r11.field.bin0 = weight[1];
+ r11.field.bin1 = weight[3];
+ r11.field.bin2 = weight[5];
+ r11.field.bin3 = weight[7];
+
+ r12.field.pri0 = weight[1];
+ r12.field.pri1 = weight[3];
+ r12.field.pri2 = weight[5];
+ r12.field.pri3 = weight[7];
+
+ DLB_CSR_WR(hw, DLB_ATM_PIPE_CTRL_ARB_WEIGHTS_RDY_BIN, r0.val);
+ DLB_CSR_WR(hw, DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_0, r1.val);
+ DLB_CSR_WR(hw, DLB_NALB_PIPE_CTRL_ARB_WEIGHTS_TQPRI_NALB_1, r2.val);
+ DLB_CSR_WR(hw,
+ DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0,
+ r3.val);
+ DLB_CSR_WR(hw,
+ DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1,
+ r4.val);
+ DLB_CSR_WR(hw, DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_0, r5.val);
+ DLB_CSR_WR(hw, DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_REPLAY_1, r6.val);
+ DLB_CSR_WR(hw, DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_0, r7.val);
+ DLB_CSR_WR(hw, DLB_DP_CFG_CTRL_ARB_WEIGHTS_TQPRI_DIR_1, r8.val);
+ DLB_CSR_WR(hw, DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_0, r9.val);
+ DLB_CSR_WR(hw, DLB_NALB_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATQ_1, r10.val);
+ DLB_CSR_WR(hw, DLB_ATM_PIPE_CFG_CTRL_ARB_WEIGHTS_SCHED_BIN, r11.val);
+ DLB_CSR_WR(hw, DLB_AQED_PIPE_CFG_CTRL_ARB_WEIGHTS_TQPRI_ATM_0, r12.val);
+}
+
+void dlb_hw_set_qid_arbiter_weights(struct dlb_hw *hw, u8 weight[8])
+{
+ union dlb_lsp_cfg_arb_weight_ldb_qid_0 r0 = { {0} };
+ union dlb_lsp_cfg_arb_weight_ldb_qid_1 r1 = { {0} };
+ union dlb_lsp_cfg_arb_weight_atm_nalb_qid_0 r2 = { {0} };
+ union dlb_lsp_cfg_arb_weight_atm_nalb_qid_1 r3 = { {0} };
+
+ r0.field.slot0_weight = weight[0];
+ r0.field.slot1_weight = weight[1];
+ r0.field.slot2_weight = weight[2];
+ r0.field.slot3_weight = weight[3];
+ r1.field.slot4_weight = weight[4];
+ r1.field.slot5_weight = weight[5];
+ r1.field.slot6_weight = weight[6];
+ r1.field.slot7_weight = weight[7];
+
+ r2.field.slot0_weight = weight[0];
+ r2.field.slot1_weight = weight[1];
+ r2.field.slot2_weight = weight[2];
+ r2.field.slot3_weight = weight[3];
+ r3.field.slot4_weight = weight[4];
+ r3.field.slot5_weight = weight[5];
+ r3.field.slot6_weight = weight[6];
+ r3.field.slot7_weight = weight[7];
+
+ DLB_CSR_WR(hw, DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_0, r0.val);
+ DLB_CSR_WR(hw, DLB_LSP_CFG_ARB_WEIGHT_LDB_QID_1, r1.val);
+ DLB_CSR_WR(hw, DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_0, r2.val);
+ DLB_CSR_WR(hw, DLB_LSP_CFG_ARB_WEIGHT_ATM_NALB_QID_1, r3.val);
+}
+
+void dlb_hw_enable_pp_sw_alarms(struct dlb_hw *hw)
+{
+ union dlb_chp_cfg_ldb_pp_sw_alarm_en r0 = { {0} };
+ union dlb_chp_cfg_dir_pp_sw_alarm_en r1 = { {0} };
+ int i;
+
+ r0.field.alarm_enable = 1;
+ r1.field.alarm_enable = 1;
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_PORTS; i++)
+ DLB_CSR_WR(hw, DLB_CHP_CFG_LDB_PP_SW_ALARM_EN(i), r0.val);
+
+ for (i = 0; i < DLB_MAX_NUM_DIR_PORTS; i++)
+ DLB_CSR_WR(hw, DLB_CHP_CFG_DIR_PP_SW_ALARM_EN(i), r1.val);
+}
+
+void dlb_hw_disable_pp_sw_alarms(struct dlb_hw *hw)
+{
+ union dlb_chp_cfg_ldb_pp_sw_alarm_en r0 = { {0} };
+ union dlb_chp_cfg_dir_pp_sw_alarm_en r1 = { {0} };
+ int i;
+
+ r0.field.alarm_enable = 0;
+ r1.field.alarm_enable = 0;
+
+ for (i = 0; i < DLB_MAX_NUM_LDB_PORTS; i++)
+ DLB_CSR_WR(hw, DLB_CHP_CFG_LDB_PP_SW_ALARM_EN(i), r0.val);
+
+ for (i = 0; i < DLB_MAX_NUM_DIR_PORTS; i++)
+ DLB_CSR_WR(hw, DLB_CHP_CFG_DIR_PP_SW_ALARM_EN(i), r1.val);
+}
+
+void dlb_hw_disable_pf_to_vf_isr_pend_err(struct dlb_hw *hw)
+{
+ union dlb_sys_sys_alarm_int_enable r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_SYS_ALARM_INT_ENABLE);
+
+ r0.field.pf_to_vf_isr_pend_error = 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_SYS_ALARM_INT_ENABLE, r0.val);
+}
+
+void dlb_hw_disable_vf_to_pf_isr_pend_err(struct dlb_hw *hw)
+{
+ union dlb_sys_sys_alarm_int_enable r0;
+
+ r0.val = DLB_CSR_RD(hw, DLB_SYS_SYS_ALARM_INT_ENABLE);
+
+ r0.field.vf_to_pf_isr_pend_error = 0;
+
+ DLB_CSR_WR(hw, DLB_SYS_SYS_ALARM_INT_ENABLE, r0.val);
+}
diff --git a/drivers/event/dlb/pf/base/dlb_resource.h b/drivers/event/dlb/pf/base/dlb_resource.h
new file mode 100644
index 0000000..b67424a
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_resource.h
@@ -0,0 +1,1639 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_RESOURCE_H
+#define __DLB_RESOURCE_H
+
+#include "dlb_hw_types.h"
+#include "dlb_osdep_types.h"
+#include "dlb_user.h"
+
+/**
+ * dlb_resource_init() - initialize the device
+ * @hw: pointer to struct dlb_hw.
+ *
+ * This function initializes the device's software state (pointed to by the hw
+ * argument) and programs global scheduling QoS registers. This function should
+ * be called during driver initialization.
+ *
+ * The dlb_hw struct must be unique per DLB device and persist until the device
+ * is reset.
+ *
+ * Return:
+ * Returns 0 upon success, -1 otherwise.
+ */
+int dlb_resource_init(struct dlb_hw *hw);
+
+/**
+ * dlb_resource_free() - free device state memory
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function frees software state pointed to by dlb_hw. This function
+ * should be called when resetting the device or unloading the driver.
+ */
+void dlb_resource_free(struct dlb_hw *hw);
+
+/**
+ * dlb_resource_reset() - reset in-use resources to their initial state
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function resets all in-use resources and makes them available for
+ * reuse. All resources return to their owning function, whether a PF or a VF.
+ */
+void dlb_resource_reset(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_create_sched_domain() - create a scheduling domain
+ * @hw: dlb_hw handle for a particular device.
+ * @args: scheduling domain creation arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a scheduling domain containing the resources specified
+ * in args. The individual resources (queues, ports, credit pools) can be
+ * configured after creating a scheduling domain.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the domain ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, or the requested domain name
+ * is already in use.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_sched_domain(struct dlb_hw *hw,
+ struct dlb_create_sched_domain_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_ldb_pool() - create a load-balanced credit pool
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: credit pool creation arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a load-balanced credit pool containing the number of
+ * requested credits.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the pool ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ * or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_ldb_pool(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_dir_pool() - create a directed credit pool
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: credit pool creation arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a directed credit pool containing the number of
+ * requested credits.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the pool ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ * or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_dir_pool(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_pool_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_ldb_queue() - create a load-balanced queue
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue creation arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a load-balanced queue.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ * the domain has already been started, or the requested queue name is
+ * already in use.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_ldb_queue(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_dir_queue() - create a directed queue
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue creation arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a directed queue.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the queue ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, the domain is not configured,
+ * or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_dir_queue(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_queue_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_dir_port() - create a directed port
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port creation arguments.
+ * @pop_count_dma_base: base address of the pop count memory. This can be
+ * a PA or an IOVA.
+ * @cq_dma_base: base address of the CQ memory. This can be a PA or an IOVA.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a directed port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the port ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, a credit setting is invalid, a
+ * pool ID is invalid, a pointer address is not properly aligned, the
+ * domain is not configured, or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_dir_port_args *args,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_create_ldb_port() - create a load-balanced port
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port creation arguments.
+ * @pop_count_dma_base: base address of the pop count memory. This can be
+ * a PA or an IOVA.
+ * @cq_dma_base: base address of the CQ memory. This can be a PA or an IOVA.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function creates a load-balanced port.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the port ID.
+ *
+ * Note: resp->id contains a virtual ID if vf_request is true.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, a credit setting is invalid, a
+ * pool ID is invalid, a pointer address is not properly aligned, the
+ * domain is not configured, or the domain has already been started.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_create_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_create_ldb_port_args *args,
+ u64 pop_count_dma_base,
+ u64 cq_dma_base,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_start_domain() - start a scheduling domain
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: start domain arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function starts a scheduling domain, which allows applications to send
+ * traffic through it. Once a domain is started, its resources can no longer be
+ * configured (besides QID remapping and port enable/disable).
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - The domain is not configured, or the domain is already started.
+ */
+int dlb_hw_start_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_start_domain_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_map_qid() - map a load-balanced queue to a load-balanced port
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: map QID arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to schedule QEs from the specified queue to
+ * the specified port. Each load-balanced port can be mapped to up to 8 queues;
+ * each load-balanced queue can potentially map to all the load-balanced ports.
+ *
+ * A successful return does not necessarily mean the mapping was configured. If
+ * this function is unable to immediately map the queue to the port, it will
+ * add the requested operation to a per-port list of pending map/unmap
+ * operations, and (if it's not already running) launch a kernel thread that
+ * periodically attempts to process all pending operations. In a sense, this is
+ * an asynchronous function.
+ *
+ * This asynchronicity creates two views of the state of hardware: the actual
+ * hardware state and the requested state (as if every request completed
+ * immediately). If there are any pending map/unmap operations, the requested
+ * state will differ from the actual state. All validation is performed with
+ * respect to the pending state; for instance, if there are 8 pending map
+ * operations for port X, a request for a 9th will fail because a load-balanced
+ * port can only map up to 8 queues.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, invalid port or queue ID, or
+ * the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_map_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_map_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
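The requested-vs-actual state distinction described above can be sketched as a small validation check: a map request is validated against hardware state *plus* pending operations. This is an illustrative sketch only; `example_ldb_port` and its fields are hypothetical names, not the driver's real structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the per-port state; field names are assumptions. */
#define EX_MAX_QID_MAPPINGS_PER_LDB_PORT 8

struct example_ldb_port {
	int mapped_qids;   /* mappings already applied in hardware */
	int pending_maps;  /* map requests queued but not yet applied */
};

/* Validate against the requested state (hardware + pending), not the
 * actual hardware state alone, as the doc comment above describes. */
static bool example_map_request_allowed(const struct example_ldb_port *port)
{
	return port->mapped_qids + port->pending_maps
		< EX_MAX_QID_MAPPINGS_PER_LDB_PORT;
}
```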
+/**
+ * dlb_hw_unmap_qid() - Unmap a load-balanced queue from a load-balanced port
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: unmap QID arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs from the specified
+ * queue to the specified port.
+ *
+ * A successful return does not necessarily mean the mapping was removed. If
+ * this function is unable to immediately unmap the queue from the port, it
+ * will add the requested operation to a per-port list of pending map/unmap
+ * operations, and (if it's not already running) launch a kernel thread that
+ * periodically attempts to process all pending operations. See
+ * dlb_hw_map_qid() for more details.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - A requested resource is unavailable, invalid port or queue ID, or
+ * the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_unmap_qid(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_unmap_qid_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_finish_unmap_qid_procedures() - finish any pending unmap procedures
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function attempts to finish any outstanding unmap procedures.
+ * This function should be called by the kernel thread responsible for
+ * finishing map/unmap procedures.
+ *
+ * Return:
+ * Returns the number of procedures that weren't completed.
+ */
+unsigned int dlb_finish_unmap_qid_procedures(struct dlb_hw *hw);
+
+/**
+ * dlb_finish_map_qid_procedures() - finish any pending map procedures
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function attempts to finish any outstanding map procedures.
+ * This function should be called by the kernel thread responsible for
+ * finishing map/unmap procedures.
+ *
+ * Return:
+ * Returns the number of procedures that weren't completed.
+ */
+unsigned int dlb_finish_map_qid_procedures(struct dlb_hw *hw);
+
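The kernel thread both comments refer to can follow a simple drain loop: keep calling the finish routine until it reports zero outstanding procedures. `example_finish_procedures()` below is a hypothetical stand-in for `dlb_finish_map_qid_procedures()`; a real thread would sleep between attempts rather than spin.

```c
#include <assert.h>

/* Stand-in for dlb_finish_map_qid_procedures(): returns the number of
 * procedures still outstanding after one attempt. Here each call is
 * pretended to complete exactly one procedure. */
static unsigned int example_finish_procedures(unsigned int *pending)
{
	if (*pending > 0)
		(*pending)--;
	return *pending;
}

/* Drain loop pattern: retry until nothing is outstanding; returns the
 * number of extra attempts that were needed. */
static unsigned int example_drain(unsigned int pending)
{
	unsigned int retries = 0;

	while (example_finish_procedures(&pending) > 0)
		retries++;
	return retries;
}
```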
+/**
+ * dlb_hw_enable_ldb_port() - enable a load-balanced port for scheduling
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port enable arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to schedule QEs to a load-balanced port.
+ * Ports are enabled by default.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_enable_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_disable_ldb_port() - disable a load-balanced port for scheduling
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port disable arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs to a load-balanced
+ * port. Ports are enabled by default.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_disable_ldb_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_ldb_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_enable_dir_port() - enable a directed port for scheduling
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port enable arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to schedule QEs to a directed port.
+ * Ports are enabled by default.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_enable_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_enable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_disable_dir_port() - disable a directed port for scheduling
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: port disable arguments.
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function configures the DLB to stop scheduling QEs to a directed port.
+ * Ports are enabled by default.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid or the domain is not configured.
+ * EFAULT - Internal error (resp->status not set).
+ */
+int dlb_hw_disable_dir_port(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_disable_dir_port_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_configure_ldb_cq_interrupt() - configure load-balanced CQ for interrupts
+ * @hw: dlb_hw handle for a particular device.
+ * @port_id: load-balanced port ID.
+ * @vector: interrupt vector ID. Should be 0 for MSI or compressed MSI-X mode,
+ * else a value up to 64.
+ * @mode: interrupt type (DLB_CQ_ISR_MODE_MSI or DLB_CQ_ISR_MODE_MSIX)
+ * @vf: If the port is VF-owned, the VF's ID. This is used for translating the
+ * virtual port ID to a physical port ID. Ignored if mode is not MSI.
+ * @owner_vf: the VF to route the interrupt to. Ignored if mode is not MSI.
+ * @threshold: the minimum CQ depth at which the interrupt can fire. Must be
+ * greater than 0.
+ *
+ * This function configures the DLB registers for load-balanced CQ's interrupts.
+ * This doesn't enable the CQ's interrupt; that can be done with
+ * dlb_arm_cq_interrupt() or through an interrupt arm QE.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid.
+ */
+int dlb_configure_ldb_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ int vector,
+ int mode,
+ unsigned int vf,
+ unsigned int owner_vf,
+ u16 threshold);
+
+/**
+ * dlb_configure_dir_cq_interrupt() - configure directed CQ for interrupts
+ * @hw: dlb_hw handle for a particular device.
+ * @port_id: directed port ID.
+ * @vector: interrupt vector ID. Should be 0 for MSI or compressed MSI-X mode,
+ * else a value up to 64.
+ * @mode: interrupt type (DLB_CQ_ISR_MODE_MSI or DLB_CQ_ISR_MODE_MSIX)
+ * @vf: If the port is VF-owned, the VF's ID. This is used for translating the
+ * virtual port ID to a physical port ID. Ignored if mode is not MSI.
+ * @owner_vf: the VF to route the interrupt to. Ignored if mode is not MSI.
+ * @threshold: the minimum CQ depth at which the interrupt can fire. Must be
+ * greater than 0.
+ *
+ * This function configures the DLB registers for directed CQ's interrupts.
+ * This doesn't enable the CQ's interrupt; that can be done with
+ * dlb_arm_cq_interrupt() or through an interrupt arm QE.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise.
+ *
+ * Errors:
+ * EINVAL - The port ID is invalid.
+ */
+int dlb_configure_dir_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ int vector,
+ int mode,
+ unsigned int vf,
+ unsigned int owner_vf,
+ u16 threshold);
+
+/**
+ * dlb_enable_alarm_interrupts() - enable certain hardware alarm interrupts
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function configures the ingress error alarm. (Other alarms are enabled
+ * by default.)
+ */
+void dlb_enable_alarm_interrupts(struct dlb_hw *hw);
+
+/**
+ * dlb_disable_alarm_interrupts() - disable certain hardware alarm interrupts
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function configures the ingress error alarm. (Other alarms are disabled
+ * by default.)
+ */
+void dlb_disable_alarm_interrupts(struct dlb_hw *hw);
+
+/**
+ * dlb_set_msix_mode() - configure the device's MSI-X mode
+ * @hw: dlb_hw handle for a particular device.
+ * @mode: MSI-X mode (DLB_MSIX_MODE_PACKED or DLB_MSIX_MODE_COMPRESSED)
+ *
+ * This function configures the hardware to use either packed or compressed
+ * mode. This function should not be called if using MSI interrupts.
+ */
+void dlb_set_msix_mode(struct dlb_hw *hw, int mode);
+
+/**
+ * dlb_arm_cq_interrupt() - arm a CQ's interrupt
+ * @hw: dlb_hw handle for a particular device.
+ * @port_id: port ID
+ * @is_ldb: true for load-balanced port, false for a directed port
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function arms the CQ's interrupt. The CQ must be configured prior to
+ * calling this function.
+ *
+ * The function does no parameter validation; that is the caller's
+ * responsibility.
+ *
+ * Return: returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - Invalid port ID.
+ */
+int dlb_arm_cq_interrupt(struct dlb_hw *hw,
+ int port_id,
+ bool is_ldb,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_read_compressed_cq_intr_status() - read compressed CQ interrupt status
+ * @hw: dlb_hw handle for a particular device.
+ * @ldb_interrupts: 2-entry array of u32 bitmaps
+ * @dir_interrupts: 4-entry array of u32 bitmaps
+ *
+ * This function can be called from a compressed CQ interrupt handler to
+ * determine which CQ interrupts have fired. The caller should take
+ * appropriate action (such as waking threads blocked on a CQ's interrupt),
+ * then ack the interrupts with dlb_ack_compressed_cq_intr().
+ */
+void dlb_read_compressed_cq_intr_status(struct dlb_hw *hw,
+ u32 *ldb_interrupts,
+ u32 *dir_interrupts);
+
+/**
+ * dlb_ack_compressed_cq_intr() - ack compressed CQ interrupts
+ * @hw: dlb_hw handle for a particular device.
+ * @ldb_interrupts: 2-entry array of u32 bitmaps
+ * @dir_interrupts: 4-entry array of u32 bitmaps
+ *
+ * This function ACKs compressed CQ interrupts. Its arguments should be the
+ * same ones passed to dlb_read_compressed_cq_intr_status().
+ */
+void dlb_ack_compressed_cq_intr(struct dlb_hw *hw,
+ u32 *ldb_interrupts,
+ u32 *dir_interrupts);
+
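A compressed-CQ interrupt handler built on the two routines above typically scans the bitmaps for set bits, wakes the corresponding ports, then acks with the same bitmaps. The sketch below demonstrates only the bit-scanning pattern; `wake_port()` is a hypothetical stand-in, and the array shapes (2 LDB words, 4 DIR words) follow the prototypes above.

```c
#include <assert.h>
#include <stdint.h>

/* Record of which ports were "woken" (64 LDB + 128 DIR in this sketch). */
static int woken[192];

static void wake_port(int port)
{
	woken[port] = 1;
}

/* Scan the 2-entry LDB and 4-entry DIR bitmaps, waking one port per set
 * bit. Returns the number of CQ interrupts serviced. A real handler would
 * then pass the same bitmaps to dlb_ack_compressed_cq_intr(). */
static int example_handle_compressed_intr(const uint32_t ldb[2],
					  const uint32_t dir[4])
{
	int i, bit, count = 0;

	for (i = 0; i < 2; i++)
		for (bit = 0; bit < 32; bit++)
			if (ldb[i] & ((uint32_t)1 << bit)) {
				wake_port(i * 32 + bit);
				count++;
			}
	for (i = 0; i < 4; i++)
		for (bit = 0; bit < 32; bit++)
			if (dir[i] & ((uint32_t)1 << bit)) {
				wake_port(64 + i * 32 + bit);
				count++;
			}
	return count;
}
```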
+/**
+ * dlb_read_vf_intr_status() - read the VF interrupt status register
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function can be called from a VF's interrupt handler to determine
+ * which interrupts have fired. The first 31 bits correspond to CQ interrupt
+ * vectors, and the final bit is for the PF->VF mailbox interrupt vector.
+ *
+ * Return:
+ * Returns a bit vector indicating which interrupt vectors are active.
+ */
+u32 dlb_read_vf_intr_status(struct dlb_hw *hw);
+
+/**
+ * dlb_ack_vf_intr_status() - ack VF interrupts
+ * @hw: dlb_hw handle for a particular device.
+ * @interrupts: 32-bit bitmap
+ *
+ * This function ACKs a VF's interrupts. Its interrupts argument should be the
+ * value returned by dlb_read_vf_intr_status().
+ */
+void dlb_ack_vf_intr_status(struct dlb_hw *hw, u32 interrupts);
+
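A VF interrupt handler can split the status word per the layout described above (bits 0-30 for CQ interrupt vectors, bit 31 for the PF->VF mailbox). This is an illustrative sketch, not driver code; the same status value would then be passed to dlb_ack_vf_intr_status().

```c
#include <assert.h>
#include <stdint.h>

#define EX_VF_MBOX_BIT 31 /* final bit: PF->VF mailbox vector */

/* Count active CQ vectors and report whether the mailbox vector fired. */
static int example_count_cq_vectors(uint32_t status, int *mbox_pending)
{
	int bit, cqs = 0;

	*mbox_pending = (int)((status >> EX_VF_MBOX_BIT) & 1);
	for (bit = 0; bit < EX_VF_MBOX_BIT; bit++)
		if (status & ((uint32_t)1 << bit))
			cqs++;
	return cqs;
}
```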
+/**
+ * dlb_ack_vf_msi_intr() - ack VF MSI interrupt
+ * @hw: dlb_hw handle for a particular device.
+ * @interrupts: 32-bit bitmap
+ *
+ * This function clears the VF's MSI interrupt pending register. Its interrupts
+ * argument should contain the MSI vectors to ACK. For example, if MSI MME
+ * is in mode 0, then only bit 0 should ever be set.
+ */
+void dlb_ack_vf_msi_intr(struct dlb_hw *hw, u32 interrupts);
+
+/**
+ * dlb_ack_pf_mbox_int() - ack PF->VF mailbox interrupt
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * When done processing the PF mailbox request, this function unsets
+ * the PF's mailbox ISR register.
+ */
+void dlb_ack_pf_mbox_int(struct dlb_hw *hw);
+
+/**
+ * dlb_read_vf_to_pf_int_bitvec() - return a bit vector of all requesting VFs
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * When the VF->PF ISR fires, this function can be called to determine which
+ * VF(s) are requesting service. This bitvector must be passed to
+ * dlb_ack_vf_to_pf_int() when processing is complete for all requesting VFs.
+ *
+ * Return:
+ * Returns a bit vector indicating which VFs (0-15) have requested service.
+ */
+u32 dlb_read_vf_to_pf_int_bitvec(struct dlb_hw *hw);
+
+/**
+ * dlb_ack_vf_mbox_int() - ack processed VF->PF mailbox interrupt
+ * @hw: dlb_hw handle for a particular device.
+ * @bitvec: bit vector returned by dlb_read_vf_to_pf_int_bitvec()
+ *
+ * When done processing all VF mailbox requests, this function unsets the VF's
+ * mailbox ISR register.
+ */
+void dlb_ack_vf_mbox_int(struct dlb_hw *hw, u32 bitvec);
+
+/**
+ * dlb_read_vf_flr_int_bitvec() - return a bit vector of all VFs requesting FLR
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * When the VF FLR ISR fires, this function can be called to determine which
+ * VF(s) are requesting FLRs. This bitvector must be passed to
+ * dlb_ack_vf_flr_int() when processing is complete for all requesting VFs.
+ *
+ * Return:
+ * Returns a bit vector indicating which VFs (0-15) have requested FLRs.
+ */
+u32 dlb_read_vf_flr_int_bitvec(struct dlb_hw *hw);
+
+/**
+ * dlb_ack_vf_flr_int() - ack processed VF<->PF interrupt(s)
+ * @hw: dlb_hw handle for a particular device.
+ * @bitvec: bit vector returned by dlb_read_vf_flr_int_bitvec()
+ * @a_stepping: device is A-stepping
+ *
+ * When done processing all VF FLR requests, this function unsets the VF's FLR
+ * ISR register.
+ *
+ * Note: The caller must ensure dlb_set_vf_reset_in_progress(),
+ * dlb_clr_vf_reset_in_progress(), and dlb_ack_vf_flr_int() are not executed in
+ * parallel, because the reset-in-progress register does not support atomic
+ * updates on A-stepping devices.
+ */
+void dlb_ack_vf_flr_int(struct dlb_hw *hw, u32 bitvec, bool a_stepping);
+
+/**
+ * dlb_ack_vf_to_pf_int() - ack processed VF mbox and FLR interrupt(s)
+ * @hw: dlb_hw handle for a particular device.
+ * @mbox_bitvec: bit vector returned by dlb_read_vf_to_pf_int_bitvec()
+ * @flr_bitvec: bit vector returned by dlb_read_vf_flr_int_bitvec()
+ *
+ * When done processing all VF requests, this function communicates to the
+ * hardware that processing is complete. When this function completes, hardware
+ * can immediately generate another VF mbox or FLR interrupt.
+ */
+void dlb_ack_vf_to_pf_int(struct dlb_hw *hw,
+ u32 mbox_bitvec,
+ u32 flr_bitvec);
+
+/**
+ * dlb_process_alarm_interrupt() - process an alarm interrupt
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function reads the alarm syndrome, logs it, and acks the interrupt.
+ * This function should be called from the alarm interrupt handler when
+ * interrupt vector DLB_INT_ALARM fires.
+ */
+void dlb_process_alarm_interrupt(struct dlb_hw *hw);
+
+/**
+ * dlb_process_ingress_error_interrupt() - process ingress error interrupts
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function reads the alarm syndrome, logs it, notifies user-space, and
+ * acks the interrupt. This function should be called from the alarm interrupt
+ * handler when interrupt vector DLB_INT_INGRESS_ERROR fires.
+ */
+void dlb_process_ingress_error_interrupt(struct dlb_hw *hw);
+
+/**
+ * dlb_get_group_sequence_numbers() - return a group's number of SNs per queue
+ * @hw: dlb_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ *
+ * This function returns the configured number of sequence numbers per queue
+ * for the specified group.
+ *
+ * Return:
+ * Returns -EINVAL if group_id is invalid, else the group's SNs per queue.
+ */
+int dlb_get_group_sequence_numbers(struct dlb_hw *hw, unsigned int group_id);
+
+/**
+ * dlb_get_group_sequence_number_occupancy() - return a group's in-use slots
+ * @hw: dlb_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ *
+ * This function returns the group's number of in-use slots (i.e. load-balanced
+ * queues using the specified group).
+ *
+ * Return:
+ * Returns -EINVAL if group_id is invalid, else the group's occupancy.
+ */
+int dlb_get_group_sequence_number_occupancy(struct dlb_hw *hw,
+ unsigned int group_id);
+
+/**
+ * dlb_set_group_sequence_numbers() - assign a group's number of SNs per queue
+ * @hw: dlb_hw handle for a particular device.
+ * @group_id: sequence number group ID.
+ * @val: requested amount of sequence numbers per queue.
+ *
+ * This function configures the group's number of sequence numbers per queue.
+ * val can be a power-of-two between 32 and 1024, inclusive. This setting can
+ * be configured until the first ordered load-balanced queue is configured, at
+ * which point the configuration is locked.
+ *
+ * Return:
+ * Returns 0 upon success; -EINVAL if group_id or val is invalid, -EPERM if an
+ * ordered queue is configured.
+ */
+int dlb_set_group_sequence_numbers(struct dlb_hw *hw,
+ unsigned int group_id,
+ unsigned long val);
+
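The constraint on val stated above (a power of two between 32 and 1024, inclusive) can be checked with the usual bit trick; `example_sn_val_ok()` is an illustrative helper, not part of the driver API.

```c
#include <assert.h>
#include <stdbool.h>

/* True if val is a power of two in [32, 1024], matching the documented
 * range for sequence numbers per queue. (val & (val - 1)) == 0 is the
 * standard power-of-two test for nonzero values. */
static bool example_sn_val_ok(unsigned long val)
{
	return val >= 32 && val <= 1024 && (val & (val - 1)) == 0;
}
```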
+/**
+ * dlb_reset_domain() - reset a scheduling domain
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function resets and frees a DLB scheduling domain and its associated
+ * resources.
+ *
+ * Pre-condition: the driver must ensure software has stopped sending QEs
+ * through this domain's producer ports before invoking this function, or
+ * undefined behavior will result.
+ *
+ * Return:
+ * Returns 0 upon success, -1 otherwise.
+ *
+ * EINVAL - Invalid domain ID, or the domain is not configured.
+ * EFAULT - Internal error. (Possibly caused if the pre-condition is not met.)
+ * ETIMEDOUT - Hardware component didn't reset in the expected time.
+ */
+int dlb_reset_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_ldb_port_owned_by_domain() - query whether a port is owned by a domain
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @port_id: port ID.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns whether a load-balanced port is owned by a specified
+ * domain.
+ *
+ * Return:
+ * Returns 0 if false, 1 if true, <0 otherwise.
+ *
+ * EINVAL - Invalid domain or port ID, or the domain is not configured.
+ */
+int dlb_ldb_port_owned_by_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_dir_port_owned_by_domain() - query whether a port is owned by a domain
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @port_id: port ID.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns whether a directed port is owned by a specified
+ * domain.
+ *
+ * Return:
+ * Returns 0 if false, 1 if true, <0 otherwise.
+ *
+ * EINVAL - Invalid domain or port ID, or the domain is not configured.
+ */
+int dlb_dir_port_owned_by_domain(struct dlb_hw *hw,
+ u32 domain_id,
+ u32 port_id,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_get_num_resources() - query the PCI function's available resources
+ * @hw: dlb_hw handle for a particular device.
+ * @arg: pointer to resource counts.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns the number of available resources for the PF or for a
+ * VF.
+ *
+ * Return:
+ * Returns 0 upon success, -1 if vf_request is true and vf_id is invalid.
+ */
+int dlb_hw_get_num_resources(struct dlb_hw *hw,
+ struct dlb_get_num_resources_args *arg,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_get_num_used_resources() - query the PCI function's used resources
+ * @hw: dlb_hw handle for a particular device.
+ * @arg: pointer to resource counts.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns the number of resources in use by the PF or a VF. It
+ * fills in the fields that arg points to, except the following:
+ * - max_contiguous_atomic_inflights
+ * - max_contiguous_hist_list_entries
+ * - max_contiguous_ldb_credits
+ * - max_contiguous_dir_credits
+ *
+ * Return:
+ * Returns 0 upon success, -1 if vf_request is true and vf_id is invalid.
+ */
+int dlb_hw_get_num_used_resources(struct dlb_hw *hw,
+ struct dlb_get_num_resources_args *arg,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_send_async_vf_to_pf_msg() - (VF only) send a mailbox message to the PF
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function sends a VF->PF mailbox message. It is asynchronous, so it
+ * returns once the message is sent but potentially before the PF has processed
+ * the message. The caller must call dlb_vf_to_pf_complete() to determine when
+ * the PF has finished processing the request.
+ */
+void dlb_send_async_vf_to_pf_msg(struct dlb_hw *hw);
+
+/**
+ * dlb_vf_to_pf_complete() - check the status of an asynchronous mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function returns a boolean indicating whether the PF has finished
+ * processing a VF->PF mailbox request. It should only be called after sending
+ * an asynchronous request with dlb_send_async_vf_to_pf_msg().
+ */
+bool dlb_vf_to_pf_complete(struct dlb_hw *hw);
+
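A caller typically pairs dlb_send_async_vf_to_pf_msg() with a bounded polling loop on dlb_vf_to_pf_complete(). The sketch below shows the pattern only; `fake_complete()` is a hypothetical stand-in for the completion check, and a real caller would sleep or back off between polls.

```c
#include <assert.h>
#include <stdbool.h>

/* Pretend the PF responds after this many polls. */
static int fake_polls_remaining;

static bool fake_complete(void)
{
	return --fake_polls_remaining <= 0;
}

/* Send (elided here), then poll up to max_polls times for completion.
 * Returns true if the PF responded, false on timeout. */
static bool example_send_and_wait(int max_polls)
{
	int i;

	/* dlb_send_async_vf_to_pf_msg(hw) would go here. */
	for (i = 0; i < max_polls; i++)
		if (fake_complete())
			return true;
	return false;
}
```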
+/**
+ * dlb_vf_flr_complete() - check the status of a VF FLR
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function returns a boolean indicating whether the PF has finished
+ * executing the VF FLR. It should only be called after setting the VF's FLR
+ * bit.
+ */
+bool dlb_vf_flr_complete(struct dlb_hw *hw);
+
+/**
+ * dlb_set_vf_reset_in_progress() - set a VF's reset in progress bit
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ *
+ * Note: This function is only supported on A-stepping devices.
+ *
+ * Note: The caller must ensure dlb_set_vf_reset_in_progress(),
+ * dlb_clr_vf_reset_in_progress(), and dlb_ack_vf_flr_int() are not executed in
+ * parallel, because the reset-in-progress register does not support atomic
+ * updates on A-stepping devices.
+ */
+void dlb_set_vf_reset_in_progress(struct dlb_hw *hw, int vf_id);
+
+/**
+ * dlb_clr_vf_reset_in_progress() - clear a VF's reset in progress bit
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ *
+ * Note: This function is only supported on A-stepping devices.
+ *
+ * Note: The caller must ensure dlb_set_vf_reset_in_progress(),
+ * dlb_clr_vf_reset_in_progress(), and dlb_ack_vf_flr_int() are not executed in
+ * parallel, because the reset-in-progress register does not support atomic
+ * updates on A-stepping devices.
+ */
+void dlb_clr_vf_reset_in_progress(struct dlb_hw *hw, int vf_id);
+
+/**
+ * dlb_send_async_pf_to_vf_msg() - (PF only) send a mailbox message to the VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ *
+ * This function sends a PF->VF mailbox message. It is asynchronous, so it
+ * returns once the message is sent but potentially before the VF has processed
+ * the message. The caller must call dlb_pf_to_vf_complete() to determine when
+ * the VF has finished processing the request.
+ */
+void dlb_send_async_pf_to_vf_msg(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_pf_to_vf_complete() - check the status of an asynchronous mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ *
+ * This function returns a boolean indicating whether the VF has finished
+ * processing a PF->VF mailbox request. It should only be called after sending
+ * an asynchronous request with dlb_send_async_pf_to_vf_msg().
+ */
+bool dlb_pf_to_vf_complete(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_pf_read_vf_mbox_req() - (PF only) read a VF->PF mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies one of the PF's VF->PF mailboxes into the array pointed
+ * to by data.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_VF2PF_REQ_BYTES.
+ */
+int dlb_pf_read_vf_mbox_req(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len);
+
+/**
+ * dlb_pf_read_vf_mbox_resp() - (PF only) read a VF->PF mailbox response
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies one of the PF's VF->PF mailboxes into the array pointed
+ * to by data.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_VF2PF_RESP_BYTES.
+ */
+int dlb_pf_read_vf_mbox_resp(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len);
+
+/**
+ * dlb_pf_write_vf_mbox_resp() - (PF only) write a PF->VF mailbox response
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the user-provided message data into one of the PF's
+ * PF->VF mailboxes.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_PF2VF_RESP_BYTES.
+ */
+int dlb_pf_write_vf_mbox_resp(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len);
+
+/**
+ * dlb_pf_write_vf_mbox_req() - (PF only) write a PF->VF mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the user-provided message data into one of the PF's
+ * PF->VF mailboxes.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_PF2VF_REQ_BYTES.
+ */
+int dlb_pf_write_vf_mbox_req(struct dlb_hw *hw,
+ unsigned int vf_id,
+ void *data,
+ int len);
+
+/**
+ * dlb_vf_read_pf_mbox_resp() - (VF only) read a PF->VF mailbox response
+ * @hw: dlb_hw handle for a particular device.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the VF's PF->VF mailbox into the array pointed to by
+ * data.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_PF2VF_RESP_BYTES.
+ */
+int dlb_vf_read_pf_mbox_resp(struct dlb_hw *hw, void *data, int len);
+
+/**
+ * dlb_vf_read_pf_mbox_req() - (VF only) read a PF->VF mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the VF's PF->VF mailbox into the array pointed to by
+ * data.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_PF2VF_REQ_BYTES.
+ */
+int dlb_vf_read_pf_mbox_req(struct dlb_hw *hw, void *data, int len);
+
+/**
+ * dlb_vf_write_pf_mbox_req() - (VF only) write a VF->PF mailbox request
+ * @hw: dlb_hw handle for a particular device.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the user-provided message data into one of the VF's
+ * VF->PF mailboxes.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_VF2PF_REQ_BYTES.
+ */
+int dlb_vf_write_pf_mbox_req(struct dlb_hw *hw, void *data, int len);
+
+/**
+ * dlb_vf_write_pf_mbox_resp() - (VF only) write a VF->PF mailbox response
+ * @hw: dlb_hw handle for a particular device.
+ * @data: pointer to message data.
+ * @len: size, in bytes, of the data array.
+ *
+ * This function copies the user-provided message data into one of the VF's
+ * VF->PF mailboxes.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * EINVAL - len >= DLB_VF2PF_RESP_BYTES.
+ */
+int dlb_vf_write_pf_mbox_resp(struct dlb_hw *hw, void *data, int len);
+
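All eight mailbox read/write helpers above share the same length check: a transfer of `len` bytes fails with EINVAL when `len` reaches the mailbox size. The sketch below mirrors that check with a placeholder size; `EX_MBOX_BYTES` is an assumption, not the driver's actual DLB_VF2PF_REQ_BYTES value.

```c
#include <assert.h>
#include <string.h>

#define EX_MBOX_BYTES 256 /* placeholder mailbox size */
#define EX_EINVAL 22

static unsigned char ex_mbox[EX_MBOX_BYTES];

/* Copy len bytes out of the mailbox, rejecting out-of-range lengths with
 * -EINVAL as the doc comments above describe. */
static int example_read_mbox(void *data, int len)
{
	if (len < 0 || len >= EX_MBOX_BYTES)
		return -EX_EINVAL;
	memcpy(data, ex_mbox, (size_t)len);
	return 0;
}
```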
+/**
+ * dlb_reset_vf() - reset the hardware owned by a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function resets the hardware owned by a VF (if any), by resetting the
+ * VF's domains one by one.
+ */
+int dlb_reset_vf(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_vf_is_locked() - check whether the VF's resources are locked
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function returns whether or not the VF's resource assignments are
+ * locked. If locked, no resources can be added to or subtracted from the
+ * group.
+ */
+bool dlb_vf_is_locked(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_lock_vf() - lock the VF's resources
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function sets a flag indicating that the VF is using its resources.
+ * When the VF is locked, its resource assignment cannot be changed.
+ */
+void dlb_lock_vf(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_unlock_vf() - unlock the VF's resources
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function unlocks the VF's resource assignment, allowing it to be
+ * modified.
+ */
+void dlb_unlock_vf(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_update_vf_sched_domains() - update the domains assigned to a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of scheduling domains to assign to this VF
+ *
+ * This function assigns num scheduling domains to the specified VF. If the VF
+ * already has domains assigned, this existing assignment is adjusted
+ * accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_sched_domains(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num);
+
+/**
+ * dlb_update_vf_ldb_queues() - update the LDB queues assigned to a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of LDB queues to assign to this VF
+ *
+ * This function assigns num LDB queues to the specified VF. If the VF already
+ * has LDB queues assigned, this existing assignment is adjusted
+ * accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_ldb_queues(struct dlb_hw *hw, u32 vf_id, u32 num);
+
+/**
+ * dlb_update_vf_ldb_ports() - update the LDB ports assigned to a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of LDB ports to assign to this VF
+ *
+ * This function assigns num LDB ports to the specified VF. If the VF already
+ * has LDB ports assigned, this existing assignment is adjusted accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_ldb_ports(struct dlb_hw *hw, u32 vf_id, u32 num);
+
+/**
+ * dlb_update_vf_dir_ports() - update the DIR ports assigned to a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of DIR ports to assign to this VF
+ *
+ * This function assigns num DIR ports to the specified VF. If the VF already
+ * has DIR ports assigned, this existing assignment is adjusted accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_dir_ports(struct dlb_hw *hw, u32 vf_id, u32 num);
+
+/**
+ * dlb_update_vf_ldb_credit_pools() - update the VF's assigned LDB pools
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of LDB credit pools to assign to this VF
+ *
+ * This function assigns num LDB credit pools to the specified VF. If the VF
+ * already has LDB credit pools assigned, this existing assignment is adjusted
+ * accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_ldb_credit_pools(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num);
+
+/**
+ * dlb_update_vf_dir_credit_pools() - update the VF's assigned DIR pools
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of DIR credit pools to assign to this VF
+ *
+ * This function assigns num DIR credit pools to the specified VF. If the VF
+ * already has DIR credit pools assigned, this existing assignment is adjusted
+ * accordingly.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_dir_credit_pools(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num);
+
+/**
+ * dlb_update_vf_ldb_credits() - update the VF's assigned LDB credits
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of LDB credits to assign to this VF
+ *
+ * This function assigns num LDB credits to the specified VF. If the VF already
+ * has LDB credits assigned, this existing assignment is adjusted accordingly.
+ * VFs are assigned a contiguous chunk of credits, so this function may fail
+ * if a sufficiently large contiguous chunk is not available.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_ldb_credits(struct dlb_hw *hw, u32 vf_id, u32 num);
+
+/**
+ * dlb_update_vf_dir_credits() - update the VF's assigned DIR credits
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of DIR credits to assign to this VF
+ *
+ * This function assigns num DIR credits to the specified VF. If the VF already
+ * has DIR credits assigned, this existing assignment is adjusted accordingly.
+ * VFs are assigned a contiguous chunk of credits, so this function may fail
+ * if a sufficiently large contiguous chunk is not available.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_dir_credits(struct dlb_hw *hw, u32 vf_id, u32 num);
+
+/**
+ * dlb_update_vf_hist_list_entries() - update the VF's assigned HL entries
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of history list entries to assign to this VF
+ *
+ * This function assigns num history list entries to the specified VF. If the
+ * VF already has history list entries assigned, this existing assignment is
+ * adjusted accordingly. VFs are assigned a contiguous chunk of entries, so
+ * this function may fail if a sufficiently large contiguous chunk is not
+ * available.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_hist_list_entries(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num);
+
+/**
+ * dlb_update_vf_atomic_inflights() - update the VF's atomic inflights
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @num: number of atomic inflights to assign to this VF
+ *
+ * This function assigns num atomic inflights to the specified VF. If the VF
+ * already has atomic inflights assigned, this existing assignment is adjusted
+ * accordingly. VFs are assigned a contiguous chunk of entries, so this
+ * function may fail if a sufficiently large contiguous chunk is not available.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid, or the requested number of resources are
+ * unavailable.
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_update_vf_atomic_inflights(struct dlb_hw *hw,
+ u32 vf_id,
+ u32 num);
+
+/**
+ * dlb_reset_vf_resources() - reassign the VF's resources to the PF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function takes any resources currently assigned to the VF and reassigns
+ * them to the PF.
+ *
+ * Return:
+ * Returns 0 upon success, <0 otherwise.
+ *
+ * Errors:
+ * EINVAL - vf_id is invalid
+ * EPERM - The VF's resource assignment is locked and cannot be changed.
+ */
+int dlb_reset_vf_resources(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_notify_vf() - send a notification to a VF
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ * @notification: notification code (as defined in dlb_mbox.h)
+ *
+ * This function sends a notification (as defined in dlb_mbox.h) to a VF.
+ *
+ * Return:
+ * Returns 0 upon success, -1 if the VF doesn't ACK the PF->VF interrupt.
+ */
+int dlb_notify_vf(struct dlb_hw *hw,
+ unsigned int vf_id,
+ u32 notification);
+
+/**
+ * dlb_vf_in_use() - query whether a VF is in use
+ * @hw: dlb_hw handle for a particular device.
+ * @vf_id: VF ID
+ *
+ * This function sends a mailbox request to the VF to query whether the VF is in
+ * use.
+ *
+ * Return:
+ * Returns 0 for false, 1 for true, and -1 if the mailbox request times out or
+ * an internal error occurs.
+ */
+int dlb_vf_in_use(struct dlb_hw *hw, unsigned int vf_id);
+
+/**
+ * dlb_disable_dp_vasr_feature() - disable directed pipe VAS reset hardware
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function disables certain hardware in the directed pipe, which is
+ * necessary to work around a DLB VAS reset issue.
+ */
+void dlb_disable_dp_vasr_feature(struct dlb_hw *hw);
+
+/**
+ * dlb_enable_excess_tokens_alarm() - enable interrupts for the excess token
+ * pop alarm
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function enables the PF ingress error alarm interrupt to fire when an
+ * excess token pop occurs.
+ */
+void dlb_enable_excess_tokens_alarm(struct dlb_hw *hw);
+
+/**
+ * dlb_disable_excess_tokens_alarm() - disable interrupts for the excess token
+ * pop alarm
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function disables the PF ingress error alarm interrupt from firing
+ * when an excess token pop occurs.
+ */
+void dlb_disable_excess_tokens_alarm(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_get_ldb_queue_depth() - returns the depth of a load-balanced queue
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue depth args
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns the depth of a load-balanced queue.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the depth.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID or queue ID.
+ */
+int dlb_hw_get_ldb_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_get_ldb_queue_depth_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_get_dir_queue_depth() - returns the depth of a directed queue
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: queue depth args
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * This function returns the depth of a directed queue.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the depth.
+ *
+ * Errors:
+ * EINVAL - Invalid domain ID or queue ID.
+ */
+int dlb_hw_get_dir_queue_depth(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_get_dir_queue_depth_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_pending_port_unmaps() - returns the number of unmap operations in
+ * progress for a load-balanced port.
+ * @hw: dlb_hw handle for a particular device.
+ * @domain_id: domain ID.
+ * @args: number of unmaps in progress args
+ * @resp: response structure.
+ * @vf_request: indicates whether this request came from a VF.
+ * @vf_id: If vf_request is true, this contains the VF's ID.
+ *
+ * Return:
+ * Returns 0 upon success, < 0 otherwise. If an error occurs, resp->status is
+ * assigned a detailed error code from enum dlb_error. If successful, resp->id
+ * contains the number of unmaps in progress.
+ *
+ * Errors:
+ * EINVAL - Invalid port ID.
+ */
+int dlb_hw_pending_port_unmaps(struct dlb_hw *hw,
+ u32 domain_id,
+ struct dlb_pending_port_unmaps_args *args,
+ struct dlb_cmd_response *resp,
+ bool vf_request,
+ unsigned int vf_id);
+
+/**
+ * dlb_hw_enable_sparse_ldb_cq_mode() - enable sparse mode for load-balanced
+ * ports.
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function must be called prior to configuring scheduling domains.
+ */
+void dlb_hw_enable_sparse_ldb_cq_mode(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_enable_sparse_dir_cq_mode() - enable sparse mode for directed ports
+ * @hw: dlb_hw handle for a particular device.
+ *
+ * This function must be called prior to configuring scheduling domains.
+ */
+void dlb_hw_enable_sparse_dir_cq_mode(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_set_qe_arbiter_weights() - program QE arbiter weights
+ * @hw: dlb_hw handle for a particular device.
+ * @weight: 8-entry array of arbiter weights.
+ *
+ * weight[N] programs priority N's weight. In cases where the 8 priorities are
+ * reduced to 4 bins, the mapping is:
+ * - weight[1] programs bin 0
+ * - weight[3] programs bin 1
+ * - weight[5] programs bin 2
+ * - weight[7] programs bin 3
+ */
+void dlb_hw_set_qe_arbiter_weights(struct dlb_hw *hw, u8 weight[8]);
+
+/**
+ * dlb_hw_set_qid_arbiter_weights() - program QID arbiter weights
+ * @hw: dlb_hw handle for a particular device.
+ * @weight: 8-entry array of arbiter weights.
+ *
+ * weight[N] programs priority N's weight. In cases where the 8 priorities are
+ * reduced to 4 bins, the mapping is:
+ * - weight[1] programs bin 0
+ * - weight[3] programs bin 1
+ * - weight[5] programs bin 2
+ * - weight[7] programs bin 3
+ */
+void dlb_hw_set_qid_arbiter_weights(struct dlb_hw *hw, u8 weight[8]);
+
+/**
+ * dlb_hw_enable_pp_sw_alarms() - enable out-of-credit alarm for all producer
+ * ports
+ * @hw: dlb_hw handle for a particular device.
+ */
+void dlb_hw_enable_pp_sw_alarms(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_disable_pp_sw_alarms() - disable out-of-credit alarm for all producer
+ * ports
+ * @hw: dlb_hw handle for a particular device.
+ */
+void dlb_hw_disable_pp_sw_alarms(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_disable_pf_to_vf_isr_pend_err() - disable alarm triggered by PF
+ * access to VF's ISR pending register
+ * @hw: dlb_hw handle for a particular device.
+ */
+void dlb_hw_disable_pf_to_vf_isr_pend_err(struct dlb_hw *hw);
+
+/**
+ * dlb_hw_disable_vf_to_pf_isr_pend_err() - disable alarm triggered by VF
+ * access to PF's ISR pending register
+ * @hw: dlb_hw handle for a particular device.
+ */
+void dlb_hw_disable_vf_to_pf_isr_pend_err(struct dlb_hw *hw);
+
+#endif /* __DLB_RESOURCE_H */
diff --git a/drivers/event/dlb/pf/base/dlb_user.h b/drivers/event/dlb/pf/base/dlb_user.h
new file mode 100644
index 0000000..6e7ee2e
--- /dev/null
+++ b/drivers/event/dlb/pf/base/dlb_user.h
@@ -0,0 +1,1084 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)
+ * Copyright(c) 2016-2020 Intel Corporation
+ */
+
+#ifndef __DLB_USER_H
+#define __DLB_USER_H
+
+#define DLB_MAX_NAME_LEN 64
+
+#include "dlb_osdep_types.h"
+
+enum dlb_error {
+ DLB_ST_SUCCESS = 0,
+ DLB_ST_NAME_EXISTS,
+ DLB_ST_DOMAIN_UNAVAILABLE,
+ DLB_ST_LDB_PORTS_UNAVAILABLE,
+ DLB_ST_DIR_PORTS_UNAVAILABLE,
+ DLB_ST_LDB_QUEUES_UNAVAILABLE,
+ DLB_ST_LDB_CREDITS_UNAVAILABLE,
+ DLB_ST_DIR_CREDITS_UNAVAILABLE,
+ DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE,
+ DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE,
+ DLB_ST_SEQUENCE_NUMBERS_UNAVAILABLE,
+ DLB_ST_INVALID_DOMAIN_ID,
+ DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION,
+ DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE,
+ DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_INVALID_LDB_CREDIT_POOL_ID,
+ DLB_ST_INVALID_DIR_CREDIT_POOL_ID,
+ DLB_ST_INVALID_POP_COUNT_VIRT_ADDR,
+ DLB_ST_INVALID_LDB_QUEUE_ID,
+ DLB_ST_INVALID_CQ_DEPTH,
+ DLB_ST_INVALID_CQ_VIRT_ADDR,
+ DLB_ST_INVALID_PORT_ID,
+ DLB_ST_INVALID_QID,
+ DLB_ST_INVALID_PRIORITY,
+ DLB_ST_NO_QID_SLOTS_AVAILABLE,
+ DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE,
+ DLB_ST_INVALID_DIR_QUEUE_ID,
+ DLB_ST_DIR_QUEUES_UNAVAILABLE,
+ DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK,
+ DLB_ST_INVALID_LDB_CREDIT_QUANTUM,
+ DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK,
+ DLB_ST_INVALID_DIR_CREDIT_QUANTUM,
+ DLB_ST_DOMAIN_NOT_CONFIGURED,
+ DLB_ST_PID_ALREADY_ATTACHED,
+ DLB_ST_PID_NOT_ATTACHED,
+ DLB_ST_INTERNAL_ERROR,
+ DLB_ST_DOMAIN_IN_USE,
+ DLB_ST_IOMMU_MAPPING_ERROR,
+ DLB_ST_FAIL_TO_PIN_MEMORY_PAGE,
+ DLB_ST_UNABLE_TO_PIN_POPCOUNT_PAGES,
+ DLB_ST_UNABLE_TO_PIN_CQ_PAGES,
+ DLB_ST_DISCONTIGUOUS_CQ_MEMORY,
+ DLB_ST_DISCONTIGUOUS_POP_COUNT_MEMORY,
+ DLB_ST_DOMAIN_STARTED,
+ DLB_ST_LARGE_POOL_NOT_SPECIFIED,
+ DLB_ST_SMALL_POOL_NOT_SPECIFIED,
+ DLB_ST_NEITHER_POOL_SPECIFIED,
+ DLB_ST_DOMAIN_NOT_STARTED,
+ DLB_ST_INVALID_MEASUREMENT_DURATION,
+ DLB_ST_INVALID_PERF_METRIC_GROUP_ID,
+ DLB_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES,
+ DLB_ST_DOMAIN_RESET_FAILED,
+ DLB_ST_MBOX_ERROR,
+ DLB_ST_INVALID_HIST_LIST_DEPTH,
+ DLB_ST_NO_MEMORY,
+};
+
+static const char dlb_error_strings[][128] = {
+ "DLB_ST_SUCCESS",
+ "DLB_ST_NAME_EXISTS",
+ "DLB_ST_DOMAIN_UNAVAILABLE",
+ "DLB_ST_LDB_PORTS_UNAVAILABLE",
+ "DLB_ST_DIR_PORTS_UNAVAILABLE",
+ "DLB_ST_LDB_QUEUES_UNAVAILABLE",
+ "DLB_ST_LDB_CREDITS_UNAVAILABLE",
+ "DLB_ST_DIR_CREDITS_UNAVAILABLE",
+ "DLB_ST_LDB_CREDIT_POOLS_UNAVAILABLE",
+ "DLB_ST_DIR_CREDIT_POOLS_UNAVAILABLE",
+ "DLB_ST_SEQUENCE_NUMBERS_UNAVAILABLE",
+ "DLB_ST_INVALID_DOMAIN_ID",
+ "DLB_ST_INVALID_QID_INFLIGHT_ALLOCATION",
+ "DLB_ST_ATOMIC_INFLIGHTS_UNAVAILABLE",
+ "DLB_ST_HIST_LIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_INVALID_LDB_CREDIT_POOL_ID",
+ "DLB_ST_INVALID_DIR_CREDIT_POOL_ID",
+ "DLB_ST_INVALID_POP_COUNT_VIRT_ADDR",
+ "DLB_ST_INVALID_LDB_QUEUE_ID",
+ "DLB_ST_INVALID_CQ_DEPTH",
+ "DLB_ST_INVALID_CQ_VIRT_ADDR",
+ "DLB_ST_INVALID_PORT_ID",
+ "DLB_ST_INVALID_QID",
+ "DLB_ST_INVALID_PRIORITY",
+ "DLB_ST_NO_QID_SLOTS_AVAILABLE",
+ "DLB_ST_QED_FREELIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_DQED_FREELIST_ENTRIES_UNAVAILABLE",
+ "DLB_ST_INVALID_DIR_QUEUE_ID",
+ "DLB_ST_DIR_QUEUES_UNAVAILABLE",
+ "DLB_ST_INVALID_LDB_CREDIT_LOW_WATERMARK",
+ "DLB_ST_INVALID_LDB_CREDIT_QUANTUM",
+ "DLB_ST_INVALID_DIR_CREDIT_LOW_WATERMARK",
+ "DLB_ST_INVALID_DIR_CREDIT_QUANTUM",
+ "DLB_ST_DOMAIN_NOT_CONFIGURED",
+ "DLB_ST_PID_ALREADY_ATTACHED",
+ "DLB_ST_PID_NOT_ATTACHED",
+ "DLB_ST_INTERNAL_ERROR",
+ "DLB_ST_DOMAIN_IN_USE",
+ "DLB_ST_IOMMU_MAPPING_ERROR",
+ "DLB_ST_FAIL_TO_PIN_MEMORY_PAGE",
+ "DLB_ST_UNABLE_TO_PIN_POPCOUNT_PAGES",
+ "DLB_ST_UNABLE_TO_PIN_CQ_PAGES",
+ "DLB_ST_DISCONTIGUOUS_CQ_MEMORY",
+ "DLB_ST_DISCONTIGUOUS_POP_COUNT_MEMORY",
+ "DLB_ST_DOMAIN_STARTED",
+ "DLB_ST_LARGE_POOL_NOT_SPECIFIED",
+ "DLB_ST_SMALL_POOL_NOT_SPECIFIED",
+ "DLB_ST_NEITHER_POOL_SPECIFIED",
+ "DLB_ST_DOMAIN_NOT_STARTED",
+ "DLB_ST_INVALID_MEASUREMENT_DURATION",
+ "DLB_ST_INVALID_PERF_METRIC_GROUP_ID",
+ "DLB_ST_LDB_PORT_REQUIRED_FOR_LDB_QUEUES",
+ "DLB_ST_DOMAIN_RESET_FAILED",
+ "DLB_ST_MBOX_ERROR",
+ "DLB_ST_INVALID_HIST_LIST_DEPTH",
+ "DLB_ST_NO_MEMORY",
+};
+
+struct dlb_cmd_response {
+ __u32 status; /* Interpret using enum dlb_error */
+ __u32 id;
+};
+
+/******************************/
+/* 'dlb' device file commands */
+/******************************/
+
+#define DLB_DEVICE_VERSION(x) (((x) >> 8) & 0xFF)
+#define DLB_DEVICE_REVISION(x) ((x) & 0xFF)
+
+enum dlb_revisions {
+ DLB_REV_A0 = 0,
+ DLB_REV_A1 = 1,
+ DLB_REV_A2 = 2,
+ DLB_REV_A3 = 3,
+ DLB_REV_B0 = 4,
+};
+
+/*
+ * DLB_CMD_GET_DEVICE_VERSION: Query the DLB device version.
+ *
+ * This ioctl interface is the same in all driver versions and is always
+ * the first ioctl.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id[7:0]: Device revision.
+ * response.id[15:8]: Device version.
+ */
+
+struct dlb_get_device_version_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+#define DLB_VERSION_MAJOR_NUMBER 10
+#define DLB_VERSION_MINOR_NUMBER 7
+#define DLB_VERSION_REVISION_NUMBER 9
+#define DLB_VERSION (DLB_VERSION_MAJOR_NUMBER << 24 | \
+ DLB_VERSION_MINOR_NUMBER << 16 | \
+ DLB_VERSION_REVISION_NUMBER)
+
+#define DLB_VERSION_GET_MAJOR_NUMBER(x) (((x) >> 24) & 0xFF)
+#define DLB_VERSION_GET_MINOR_NUMBER(x) (((x) >> 16) & 0xFF)
+#define DLB_VERSION_GET_REVISION_NUMBER(x) ((x) & 0xFFFF)
+
+static inline __u8 dlb_version_incompatible(__u32 version)
+{
+ __u8 inc;
+
+ inc = DLB_VERSION_GET_MAJOR_NUMBER(version) != DLB_VERSION_MAJOR_NUMBER;
+ inc |= (int)DLB_VERSION_GET_MINOR_NUMBER(version) <
+ DLB_VERSION_MINOR_NUMBER;
+
+ return inc;
+}
+
+/*
+ * DLB_CMD_GET_DRIVER_VERSION: Query the DLB driver version. The major number
+ * is changed when there is an ABI-breaking change, the minor number is
+ * changed if the API is changed in a backwards-compatible way, and the
+ * revision number is changed for fixes that don't affect the API.
+ *
+ * If the kernel driver's API major version number differs from the header's
+ * DLB_VERSION_MAJOR_NUMBER, the two are incompatible. The same is true if
+ * the major numbers match but the kernel driver's minor number is less than
+ * the header file's. The dlb_version_incompatible() helper should be used to
+ * check for compatibility.
+ *
+ * This ioctl interface is the same in all driver versions. Applications
+ * should check the driver version before performing any other ioctl
+ * operations.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Driver API version. Use the DLB_VERSION_GET_MAJOR_NUMBER,
+ * DLB_VERSION_GET_MINOR_NUMBER, and
+ * DLB_VERSION_GET_REVISION_NUMBER macros to interpret the field.
+ */
+
+struct dlb_get_driver_version_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+/*
+ * DLB_CMD_CREATE_SCHED_DOMAIN: Create a DLB scheduling domain and reserve the
+ * resources (queues, ports, etc.) that it contains.
+ *
+ * Input parameters:
+ * - num_ldb_queues: Number of load-balanced queues.
+ * - num_ldb_ports: Number of load-balanced ports.
+ * - num_dir_ports: Number of directed ports. A directed port has one directed
+ * queue, so no num_dir_queues argument is necessary.
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ * storage for the domain. This storage is divided among the domain's
+ * load-balanced queues that are configured for atomic scheduling.
+ * - num_hist_list_entries: Amount of history list storage. This is divided
+ * among the domain's CQs.
+ * - num_ldb_credits: Amount of load-balanced QE storage (QED). QEs occupy this
+ * space until they are scheduled to a load-balanced CQ. One credit
+ * represents the storage for one QE.
+ * - num_dir_credits: Amount of directed QE storage (DQED). QEs occupy this
+ * space until they are scheduled to a directed CQ. One credit represents
+ * the storage for one QE.
+ * - num_ldb_credit_pools: Number of pools into which the load-balanced credits
+ * are placed.
+ * - num_dir_credit_pools: Number of pools into which the directed credits are
+ * placed.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: domain ID.
+ */
+struct dlb_create_sched_domain_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_ldb_queues;
+ __u32 num_ldb_ports;
+ __u32 num_dir_ports;
+ __u32 num_atomic_inflights;
+ __u32 num_hist_list_entries;
+ __u32 num_ldb_credits;
+ __u32 num_dir_credits;
+ __u32 num_ldb_credit_pools;
+ __u32 num_dir_credit_pools;
+};
+
+/*
+ * DLB_CMD_GET_NUM_RESOURCES: Return the number of available resources
+ * (queues, ports, etc.) that this device owns.
+ *
+ * Output parameters:
+ * - num_sched_domains: Number of available scheduling domains.
+ * - num_ldb_queues: Number of available load-balanced queues.
+ * - num_ldb_ports: Number of available load-balanced ports.
+ * - num_dir_ports: Number of available directed ports. There is one directed
+ * queue for every directed port.
+ * - num_atomic_inflights: Amount of available temporary atomic QE storage.
+ * - max_contiguous_atomic_inflights: When a domain is created, the temporary
+ * atomic QE storage is allocated in a contiguous chunk. This return value
+ * is the longest available contiguous range of atomic QE storage.
+ * - num_hist_list_entries: Amount of history list storage.
+ * - max_contiguous_hist_list_entries: History list storage is allocated in
+ * a contiguous chunk, and this return value is the longest available
+ * contiguous range of history list entries.
+ * - num_ldb_credits: Amount of available load-balanced QE storage.
+ * - max_contiguous_ldb_credits: QED storage is allocated in a contiguous
+ * chunk, and this return value is the longest available contiguous range
+ * of load-balanced credit storage.
+ * - num_dir_credits: Amount of available directed QE storage.
+ * - max_contiguous_dir_credits: DQED storage is allocated in a contiguous
+ * chunk, and this return value is the longest available contiguous range
+ * of directed credit storage.
+ * - num_ldb_credit_pools: Number of available load-balanced credit pools.
+ * - num_dir_credit_pools: Number of available directed credit pools.
+ * - padding0: Reserved for future use.
+ */
+struct dlb_get_num_resources_args {
+ /* Output parameters */
+ __u32 num_sched_domains;
+ __u32 num_ldb_queues;
+ __u32 num_ldb_ports;
+ __u32 num_dir_ports;
+ __u32 num_atomic_inflights;
+ __u32 max_contiguous_atomic_inflights;
+ __u32 num_hist_list_entries;
+ __u32 max_contiguous_hist_list_entries;
+ __u32 num_ldb_credits;
+ __u32 max_contiguous_ldb_credits;
+ __u32 num_dir_credits;
+ __u32 max_contiguous_dir_credits;
+ __u32 num_ldb_credit_pools;
+ __u32 num_dir_credit_pools;
+ __u32 padding0;
+};
+
+/*
+ * DLB_CMD_SET_SN_ALLOCATION: Configure a sequence number group
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - num: Number of sequence numbers per queue.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_set_sn_allocation_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 num;
+};
+
+/*
+ * DLB_CMD_GET_SN_ALLOCATION: Get a sequence number group's configuration
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Specified group's number of sequence numbers per queue.
+ */
+struct dlb_get_sn_allocation_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 padding0;
+};
+
+/*
+ * DLB_CMD_QUERY_CQ_POLL_MODE: Query the CQ poll mode the kernel driver is using
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: CQ poll mode (see enum dlb_cq_poll_modes).
+ */
+struct dlb_query_cq_poll_mode_args {
+ /* Output parameters */
+ __u64 response;
+};
+
+enum dlb_cq_poll_modes {
+ DLB_CQ_POLL_MODE_STD,
+ DLB_CQ_POLL_MODE_SPARSE,
+
+ /* NUM_DLB_CQ_POLL_MODE must be last */
+ NUM_DLB_CQ_POLL_MODE,
+};
+
+/*
+ * DLB_CMD_GET_SN_OCCUPANCY: Get a sequence number group's occupancy
+ *
+ * Each sequence number group has one or more slots, depending on its
+ * configuration. I.e.:
+ * - If configured for 1024 sequence numbers per queue, the group has 1 slot
+ * - If configured for 512 sequence numbers per queue, the group has 2 slots
+ * ...
+ * - If configured for 32 sequence numbers per queue, the group has 32 slots
+ *
+ * This ioctl returns the group's number of in-use slots. If its occupancy is
+ * 0, the group's sequence number allocation can be reconfigured.
+ *
+ * Input parameters:
+ * - group: Sequence number group ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Specified group's number of used slots.
+ */
+struct dlb_get_sn_occupancy_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 group;
+ __u32 padding0;
+};
+
+enum dlb_user_interface_commands {
+ DLB_CMD_GET_DEVICE_VERSION,
+ DLB_CMD_CREATE_SCHED_DOMAIN,
+ DLB_CMD_GET_NUM_RESOURCES,
+ DLB_CMD_GET_DRIVER_VERSION,
+ DLB_CMD_SAMPLE_PERF_COUNTERS,
+ DLB_CMD_SET_SN_ALLOCATION,
+ DLB_CMD_GET_SN_ALLOCATION,
+ DLB_CMD_MEASURE_SCHED_COUNTS,
+ DLB_CMD_QUERY_CQ_POLL_MODE,
+ DLB_CMD_GET_SN_OCCUPANCY,
+
+ /* NUM_DLB_CMD must be last */
+ NUM_DLB_CMD,
+};
+
+/*******************************/
+/* 'domain' device file alerts */
+/*******************************/
+
+/* Scheduling domain device files can be read to receive domain-specific
+ * notifications, for alerts such as hardware errors.
+ *
+ * Each alert is encoded in a 16B message. The first 8B contains the alert ID,
+ * and the second 8B is optional and contains additional information.
+ * Applications should cast read data to a struct dlb_domain_alert, and
+ * interpret the struct's alert_id according to dlb_domain_alert_id. The read
+ * length must be 16B, or the function will return -EINVAL.
+ *
+ * Reads are destructive, and in the case of multiple file descriptors for the
+ * same domain device file, an alert will be read by only one of the file
+ * descriptors.
+ *
+ * The driver stores alerts in a fixed-size alert ring until they are read. If
+ * the alert ring fills completely, subsequent alerts will be dropped. It is
+ * recommended that DLB applications dedicate a thread to perform blocking
+ * reads on the device file.
+ */
+enum dlb_domain_alert_id {
+ /* A destination domain queue that this domain connected to has
+ * unregistered, and can no longer be sent to. The aux alert data
+ * contains the queue ID.
+ */
+ DLB_DOMAIN_ALERT_REMOTE_QUEUE_UNREGISTER,
+ /* A producer port in this domain attempted to send a QE without a
+ * credit. aux_alert_data[7:0] contains the port ID, and
+ * aux_alert_data[15:8] contains a flag indicating whether the port is
+ * load-balanced (1) or directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS,
+ /* Software issued an illegal enqueue for a port in this domain. An
+ * illegal enqueue could be:
+ * - Illegal (excess) completion
+ * - Illegal fragment
+ * - Illegal enqueue command
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ,
+ /* Software issued excess CQ token pops for a port in this domain.
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS,
+	/* An enqueue contained either an invalid command encoding or a REL,
+ * REL_T, RLS, FWD, FWD_T, FRAG, or FRAG_T from a directed port.
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_ILLEGAL_HCW,
+ * An enqueue specified an illegal QID (the QID must be valid and less than 128).
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_ILLEGAL_QID,
+ /* An enqueue went to a disabled QID.
+ *
+ * aux_alert_data[7:0] contains the port ID, and aux_alert_data[15:8]
+ * contains a flag indicating whether the port is load-balanced (1) or
+ * directed (0).
+ */
+ DLB_DOMAIN_ALERT_DISABLED_QID,
+ /* The device containing this domain was reset. All applications using
+ * the device need to exit for the driver to complete the reset
+ * procedure.
+ *
+ * aux_alert_data doesn't contain any information for this alert.
+ */
+ DLB_DOMAIN_ALERT_DEVICE_RESET,
+ /* User-space has enqueued an alert.
+ *
+ * aux_alert_data contains user-provided data.
+ */
+ DLB_DOMAIN_ALERT_USER,
+
+ /* Number of DLB domain alerts */
+ NUM_DLB_DOMAIN_ALERTS
+};
+
+static const char dlb_domain_alert_strings[][128] = {
+ "DLB_DOMAIN_ALERT_REMOTE_QUEUE_UNREGISTER",
+ "DLB_DOMAIN_ALERT_PP_OUT_OF_CREDITS",
+ "DLB_DOMAIN_ALERT_PP_ILLEGAL_ENQ",
+ "DLB_DOMAIN_ALERT_PP_EXCESS_TOKEN_POPS",
+ "DLB_DOMAIN_ALERT_ILLEGAL_HCW",
+ "DLB_DOMAIN_ALERT_ILLEGAL_QID",
+ "DLB_DOMAIN_ALERT_DISABLED_QID",
+ "DLB_DOMAIN_ALERT_DEVICE_RESET",
+ "DLB_DOMAIN_ALERT_USER",
+};
+
+struct dlb_domain_alert {
+ __u64 alert_id;
+ __u64 aux_alert_data;
+};
+
+/*********************************/
+/* 'domain' device file commands */
+/*********************************/
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_POOL: Configure a load-balanced credit pool.
+ * Input parameters:
+ * - num_ldb_credits: Number of load-balanced credits (QED space) for this
+ * pool.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: pool ID.
+ */
+struct dlb_create_ldb_pool_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_ldb_credits;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_POOL: Configure a directed credit pool.
+ * Input parameters:
+ * - num_dir_credits: Number of directed credits (DQED space) for this pool.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Pool ID.
+ */
+struct dlb_create_dir_pool_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_dir_credits;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_QUEUE: Configure a load-balanced queue.
+ * Input parameters:
+ * - num_atomic_inflights: This specifies the amount of temporary atomic QE
+ * storage for this queue. If zero, the queue will not support atomic
+ * scheduling.
+ * - num_sequence_numbers: This specifies the number of sequence numbers used
+ * by this queue. If zero, the queue will not support ordered scheduling.
+ * If non-zero, the queue will not support unordered scheduling.
+ * - num_qid_inflights: The maximum number of QEs that can be inflight
+ * (scheduled to a CQ but not completed) at any time. If
+ * num_sequence_numbers is non-zero, num_qid_inflights must be set equal
+ * to num_sequence_numbers.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Queue ID.
+ */
+struct dlb_create_ldb_queue_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 num_sequence_numbers;
+ __u32 num_qid_inflights;
+ __u32 num_atomic_inflights;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_QUEUE: Configure a directed queue.
+ * Input parameters:
+ * - port_id: Port ID. If the corresponding directed port is already created,
+ * specify its ID here. Else this argument must be 0xFFFFFFFF to indicate
+ * that the queue is being created before the port.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Queue ID.
+ */
+struct dlb_create_dir_queue_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __s32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_LDB_PORT: Configure a load-balanced port.
+ * Input parameters:
+ * - ldb_credit_pool_id: Load-balanced credit pool this port will belong to.
+ * - dir_credit_pool_id: Directed credit pool this port will belong to.
+ * - ldb_credit_high_watermark: Number of load-balanced credits from the pool
+ * that this port will own.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_high_watermark: Number of directed credits from the pool that
+ * this port will own.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - ldb_credit_low_watermark: Load-balanced credit low watermark. When the
+ * port's credits reach this watermark, they become eligible to be
+ * refilled by the DLB as credits until the high watermark
+ * (num_ldb_credits) is reached.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_low_watermark: Directed credit low watermark. When the port's
+ * credits reach this watermark, they become eligible to be refilled by
+ * the DLB as credits until the high watermark (num_dir_credits) is
+ * reached.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - ldb_credit_quantum: Number of load-balanced credits for the DLB to refill
+ * per refill operation.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_quantum: Number of directed credits for the DLB to refill per
+ * refill operation.
+ *
+ * If this port's scheduling domain doesn't have any directed queues,
+ * this argument is ignored and the port is given no directed credits.
+ * - padding0: Reserved for future use.
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 8 and
+ * 1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ * the CQ interrupt won't fire until there are N or more outstanding CQ
+ * tokens.
+ * - cq_history_list_size: Number of history list entries. This must be greater
+ * than or equal to cq_depth.
+ * - padding1: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: port ID.
+ */
+struct dlb_create_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 ldb_credit_pool_id;
+ __u32 dir_credit_pool_id;
+ __u16 ldb_credit_high_watermark;
+ __u16 ldb_credit_low_watermark;
+ __u16 ldb_credit_quantum;
+ __u16 dir_credit_high_watermark;
+ __u16 dir_credit_low_watermark;
+ __u16 dir_credit_quantum;
+ __u16 padding0;
+ __u16 cq_depth;
+ __u16 cq_depth_threshold;
+ __u16 cq_history_list_size;
+ __u32 padding1;
+};
+
+/*
+ * DLB_DOMAIN_CMD_CREATE_DIR_PORT: Configure a directed port.
+ * Input parameters:
+ * - ldb_credit_pool_id: Load-balanced credit pool this port will belong to.
+ * - dir_credit_pool_id: Directed credit pool this port will belong to.
+ * - ldb_credit_high_watermark: Number of load-balanced credits from the pool
+ * that this port will own.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_high_watermark: Number of directed credits from the pool that
+ * this port will own.
+ * - ldb_credit_low_watermark: Load-balanced credit low watermark. When the
+ * port's credits reach this watermark, they become eligible to be
+ * refilled by the DLB as credits until the high watermark
+ * (num_ldb_credits) is reached.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_low_watermark: Directed credit low watermark. When the port's
+ * credits reach this watermark, they become eligible to be refilled by
+ * the DLB as credits until the high watermark (num_dir_credits) is
+ * reached.
+ * - ldb_credit_quantum: Number of load-balanced credits for the DLB to refill
+ * per refill operation.
+ *
+ * If this port's scheduling domain doesn't have any load-balanced queues,
+ * this argument is ignored and the port is given no load-balanced
+ * credits.
+ * - dir_credit_quantum: Number of directed credits for the DLB to refill per
+ * refill operation.
+ * - cq_depth: Depth of the port's CQ. Must be a power-of-two between 8 and
+ * 1024, inclusive.
+ * - cq_depth_threshold: CQ depth interrupt threshold. A value of N means that
+ * the CQ interrupt won't fire until there are N or more outstanding CQ
+ * tokens.
+ * - queue_id: Queue ID. If the corresponding directed queue is already
+ * created, specify its ID here. Else this argument must be 0xFFFFFFFF to
+ * indicate that the port is being created before the queue.
+ * - padding1: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: Port ID.
+ */
+struct dlb_create_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 ldb_credit_pool_id;
+ __u32 dir_credit_pool_id;
+ __u16 ldb_credit_high_watermark;
+ __u16 ldb_credit_low_watermark;
+ __u16 ldb_credit_quantum;
+ __u16 dir_credit_high_watermark;
+ __u16 dir_credit_low_watermark;
+ __u16 dir_credit_quantum;
+ __u16 cq_depth;
+ __u16 cq_depth_threshold;
+ __s32 queue_id;
+ __u32 padding1;
+};
+
+/*
+ * DLB_DOMAIN_CMD_START_DOMAIN: Mark the end of the domain configuration. This
+ * must be called before passing QEs into the device, and no configuration
+ * ioctls can be issued once the domain has started. Sending QEs into the
+ * device before calling this ioctl will result in undefined behavior.
+ * Input parameters:
+ * - (None)
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_start_domain_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+};
+
+/*
+ * DLB_DOMAIN_CMD_MAP_QID: Map a load-balanced queue to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ * - priority: Queue->port service priority.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_map_qid_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 qid;
+ __u32 priority;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_UNMAP_QID: Unmap a load-balanced queue from a load-balanced
+ * port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - qid: Load-balanced queue ID.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_unmap_qid_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 qid;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENABLE_LDB_PORT: Enable scheduling to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enable_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENABLE_DIR_PORT: Enable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enable_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+};
+
+/*
+ * DLB_DOMAIN_CMD_DISABLE_LDB_PORT: Disable scheduling to a load-balanced port.
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_disable_ldb_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_DISABLE_DIR_PORT: Disable scheduling to a directed port.
+ * Input parameters:
+ * - port_id: Directed port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_disable_dir_port_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT: Block on a CQ interrupt until a QE
+ * arrives for the specified port. If a QE is already present, the ioctl
+ * will immediately return.
+ *
+ * Note: Only one thread can block on a CQ's interrupt at a time. Doing
+ * otherwise can result in hung threads.
+ *
+ * Input parameters:
+ * - port_id: Port ID.
+ * - is_ldb: True if the port is load-balanced, false otherwise.
+ * - arm: Tell the driver to arm the interrupt.
+ * - cq_gen: Current CQ generation bit.
+ * - padding0: Reserved for future use.
+ * - cq_va: VA of the CQ entry where the next QE will be placed.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_block_on_cq_interrupt_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u8 is_ldb;
+ __u8 arm;
+ __u8 cq_gen;
+ __u8 padding0;
+ __u64 cq_va;
+};
+
+/*
+ * DLB_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT: Enqueue a domain alert that will be
+ * read by one reader thread.
+ *
+ * Input parameters:
+ * - aux_alert_data: user-defined auxiliary data.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ */
+struct dlb_enqueue_domain_alert_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u64 aux_alert_data;
+};
+
+/*
+ * DLB_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH: Get a load-balanced queue's depth.
+ * Input parameters:
+ * - queue_id: The load-balanced queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: queue depth.
+ */
+struct dlb_get_ldb_queue_depth_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 queue_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH: Get a directed queue's depth.
+ * Input parameters:
+ * - queue_id: The directed queue ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: queue depth.
+ */
+struct dlb_get_dir_queue_depth_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 queue_id;
+ __u32 padding0;
+};
+
+/*
+ * DLB_DOMAIN_CMD_PENDING_PORT_UNMAPS: Get number of queue unmap operations in
+ * progress for a load-balanced port.
+ *
+ * Note: This is a snapshot; the number of unmap operations in progress
+ * is subject to change at any time.
+ *
+ * Input parameters:
+ * - port_id: Load-balanced port ID.
+ * - padding0: Reserved for future use.
+ *
+ * Output parameters:
+ * - response: pointer to a struct dlb_cmd_response.
+ * response.status: Detailed error code. In certain cases, such as if the
+ * response pointer is invalid, the driver won't set status.
+ * response.id: number of unmaps in progress.
+ */
+struct dlb_pending_port_unmaps_args {
+ /* Output parameters */
+ __u64 response;
+ /* Input parameters */
+ __u32 port_id;
+ __u32 padding0;
+};
+
+enum dlb_domain_user_interface_commands {
+ DLB_DOMAIN_CMD_CREATE_LDB_POOL,
+ DLB_DOMAIN_CMD_CREATE_DIR_POOL,
+ DLB_DOMAIN_CMD_CREATE_LDB_QUEUE,
+ DLB_DOMAIN_CMD_CREATE_DIR_QUEUE,
+ DLB_DOMAIN_CMD_CREATE_LDB_PORT,
+ DLB_DOMAIN_CMD_CREATE_DIR_PORT,
+ DLB_DOMAIN_CMD_START_DOMAIN,
+ DLB_DOMAIN_CMD_MAP_QID,
+ DLB_DOMAIN_CMD_UNMAP_QID,
+ DLB_DOMAIN_CMD_ENABLE_LDB_PORT,
+ DLB_DOMAIN_CMD_ENABLE_DIR_PORT,
+ DLB_DOMAIN_CMD_DISABLE_LDB_PORT,
+ DLB_DOMAIN_CMD_DISABLE_DIR_PORT,
+ DLB_DOMAIN_CMD_BLOCK_ON_CQ_INTERRUPT,
+ DLB_DOMAIN_CMD_ENQUEUE_DOMAIN_ALERT,
+ DLB_DOMAIN_CMD_GET_LDB_QUEUE_DEPTH,
+ DLB_DOMAIN_CMD_GET_DIR_QUEUE_DEPTH,
+ DLB_DOMAIN_CMD_PENDING_PORT_UNMAPS,
+
+ /* NUM_DLB_DOMAIN_CMD must be last */
+ NUM_DLB_DOMAIN_CMD,
+};
+
+/*
+ * Base addresses for memory mapping the consumer queue (CQ) and popcount (PC)
+ * memory space, and producer port (PP) MMIO space. The CQ, PC, and PP
+ * addresses are per-port. Every address is page-separated (e.g. LDB PP 0 is at
+ * 0x2100000 and LDB PP 1 is at 0x2101000).
+ */
+#define DLB_LDB_CQ_BASE 0x3000000
+#define DLB_LDB_CQ_MAX_SIZE 65536
+#define DLB_LDB_CQ_OFFS(id) (DLB_LDB_CQ_BASE + (id) * DLB_LDB_CQ_MAX_SIZE)
+
+#define DLB_DIR_CQ_BASE 0x3800000
+#define DLB_DIR_CQ_MAX_SIZE 65536
+#define DLB_DIR_CQ_OFFS(id) (DLB_DIR_CQ_BASE + (id) * DLB_DIR_CQ_MAX_SIZE)
+
+#define DLB_LDB_PC_BASE 0x2300000
+#define DLB_LDB_PC_MAX_SIZE 4096
+#define DLB_LDB_PC_OFFS(id) (DLB_LDB_PC_BASE + (id) * DLB_LDB_PC_MAX_SIZE)
+
+#define DLB_DIR_PC_BASE 0x2200000
+#define DLB_DIR_PC_MAX_SIZE 4096
+#define DLB_DIR_PC_OFFS(id) (DLB_DIR_PC_BASE + (id) * DLB_DIR_PC_MAX_SIZE)
+
+#define DLB_LDB_PP_BASE 0x2100000
+#define DLB_LDB_PP_MAX_SIZE 4096
+#define DLB_LDB_PP_OFFS(id) (DLB_LDB_PP_BASE + (id) * DLB_LDB_PP_MAX_SIZE)
+
+#define DLB_DIR_PP_BASE 0x2000000
+#define DLB_DIR_PP_MAX_SIZE 4096
+#define DLB_DIR_PP_OFFS(id) (DLB_DIR_PP_BASE + (id) * DLB_DIR_PP_MAX_SIZE)
+
+#endif /* __DLB_USER_H */
--
1.7.10
* Re: [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures
@ 2020-07-30 18:48 3% ` Jerin Jacob
0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2020-07-30 18:48 UTC (permalink / raw)
To: McDaniel, Timothy
Cc: dev, jerinj, Eads, Gage, Van Haaren, Harry, mdr, nhorman, Rao,
Nikhil, Carrillo, Erik G, Gujjar, Abhinandan S, pbhagavatula,
hemant.agrawal, mattias.ronnblom, Mccarthy, Peter
On Thu, Jul 30, 2020 at 10:03 PM McDaniel, Timothy
<timothy.mcdaniel@intel.com> wrote:
>
>
> >-----Original Message-----
> >From: McDaniel, Timothy <timothy.mcdaniel@intel.com>
> >Sent: Thursday, July 2, 2020 5:14 PM
> >To: dev@dpdk.org
> >Cc: jerinj@marvell.com; Eads, Gage <gage.eads@intel.com>; Van Haaren, Harry
> ><harry.van.haaren@intel.com>; mdr@ashroe.eu; nhorman@tuxdriver.com; Rao,
> >Nikhil <nikhil.rao@intel.com>; Carrillo, Erik G <Erik.G.Carrillo@intel.com>; Gujjar,
> >Abhinandan S <abhinandan.gujjar@intel.com>; pbhagavatula@marvell.com;
> >hemant.agrawal@nxp.com; mattias.ronnblom@ericsson.com; Mccarthy, Peter
> ><Peter.Mccarthy@intel.com>; McDaniel, Timothy
> ><timothy.mcdaniel@intel.com>
> >Subject: [PATCH] doc: announce changes to eventdev public data structures
> >
> >From: "McDaniel, Timothy" <timothy.mcdaniel@intel.com>
> >
> >Signed-off-by: McDaniel, Timothy <timothy.mcdaniel@intel.com>
> >---
> > doc/guides/rel_notes/deprecation.rst | 28 ++++++++++++++++++++++++++++
> > 1 file changed, 28 insertions(+)
> >
> >diff --git a/doc/guides/rel_notes/deprecation.rst
> >b/doc/guides/rel_notes/deprecation.rst
> >index d1034f6..6af9b40 100644
> >--- a/doc/guides/rel_notes/deprecation.rst
> >+++ b/doc/guides/rel_notes/deprecation.rst
> >@@ -130,3 +130,31 @@ Deprecation Notices
> > Python 2 support will be completely removed in 20.11.
> > In 20.08, explicit deprecation warnings will be displayed when running
> > scripts with Python 2.
> >+
> >+* eventdev: Three public data structures will be updated in 20.11;
> >+ ``rte_event_dev_info``, ``rte_event_dev_config``, and
> >+ ``rte_event_port_conf``.
> >+ Two new members will be added to the ``rte_event_dev_info`` struct.
> >+ The first, max_event_port_links, will be a uint8_t, and represents the
> >+ maximum number of queues that can be linked to a single event port by
> >+ this device. The second, max_single_link_event_port_queue_pairs, will be a
> >+ uint8_t, and represents the maximum number of event ports and queues that
> >+ are optimized for (and only capable of) single-link configurations
> >+ supported by this device. These ports and queues are not accounted for in
> >+ max_event_ports or max_event_queues.
> >+ One new member will be added to the ``rte_event_dev_config`` struct. The
> >+ nb_single_link_event_port_queues member will be a uint8_t, and will
> >+ represent the number of event ports and queues that will be singly-linked
> >+ to each other. These are a subset of the overall event ports and queues.
> >+ This value cannot exceed nb_event_ports or nb_event_queues. If the
> >+ device has ports and queues that are optimized for single-link usage, this
> >+ field is a hint for how many to allocate; otherwise, regular event ports and
> >+ queues can be used.
> >+ Finally, the ``rte_event_port_conf`` struct will be
> >+ modified as follows. The uint8_t implicit_release_disabled field
> >+ will be replaced by a uint32_t event_port_cfg field. The new field will
> >+ initially have two bits assigned. RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL
> >+ will have the same meaning as implicit_release_disabled. The second bit,
> >+ RTE_EVENT_PORT_CFG_SINGLE_LINK will be set if the event port links only
> >+ to a single event queue.
> >+
> >--
> >1.7.10
>
> Hello Jerin and DPDK community. Please review and approve the eventdev interface changes announced in this notice.
Changes look good to me in general. Could you reword the description
as below, or similar, to accommodate
1) DLB PMD requirements
2) Future extensions[1]
I think we don't need to mention the exact data structure member additions
(this gives flexibility for additions/deletions of member fields
after the patch rework).
something like:
eventdev: ABI change to support DLB PMD and future extensions
The following structures will be modified to support the DLB PMD and future
extensions in the eventdev library.
And then please enumerate the structures[2] of your patch which need changes,
[2]``rte_event_dev_info``, ``rte_event_dev_config``, and
``rte_event_port_conf``
and structures in [1]. Please mention the patches [1] and your spec change patch
as a reference in the description.
[1]
http://patches.dpdk.org/patch/72786/
http://patches.dpdk.org/patch/72787/
>
> Thanks,
> Tim
* Re: [dpdk-dev] [PATCH v2 20.08 4/6] doc: announce deprecation blacklist/whitelist
2020-07-30 8:45 3% ` Bruce Richardson
@ 2020-07-30 15:10 0% ` Stephen Hemminger
0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-30 15:10 UTC (permalink / raw)
To: Bruce Richardson; +Cc: dev
On Thu, 30 Jul 2020 09:45:19 +0100
Bruce Richardson <bruce.richardson@intel.com> wrote:
> On Wed, Jul 29, 2020 at 05:58:02PM -0700, Stephen Hemminger wrote:
> > Announce upcoming changes for 20.11.
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > doc/guides/rel_notes/deprecation.rst | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index 7c60779f3e68..abfec0aeaa4b 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -165,3 +165,24 @@ Deprecation Notices
> >
> > The ``master-lcore`` argument to testpmd will be replaced
> > with ``initial-lcore``.
> > +
> > +* eal: The terms blacklist and whitelist to describe devices used
> > + by DPDK will be replaced in the 20.11 release.
> > + This will apply to command line arguments as well as macros.
> > +
> > + The macro ``RTE_DEV_BLACKLISTED`` will be replaced with ``RTE_DEV_EXCLUDED``
> > + and ``RTE_DEV_WHITELISTED`` will be replaced with ``RTE_DEV_INCLUDED``
> > + ``RTE_BUS_SCAN_BLACKLIST`` and ``RTE_BUS_SCAN_WHITELIST`` will be
> > + replaced with ``RTE_BUS_SCAN_EXCLUDED`` and ``RTE_BUS_SCAN_INCLUDED``
> > + respectively. Likewise ``RTE_DEVTYPE_BLACKLISTED_PCI`` and
> > + ``RTE_DEVTYPE_WHITELISTED_PCI`` will be replaced with
> > + ``RTE_DEVTYPE_EXCLUDED`` and ``RTE_DEVTYPE_INCLUDED``.
> > +
> > + The old macros will be marked as deprecated in 20.11 and removed
> > + in the 21.11 release.
> > +
>
> Since these are macros and therefore not part of the ABI I think we can
> remove them sooner than 21.11. Therefore similar to the previous patch can
> we just use "future" release rather than 21.11
If these are internal, we don't need to wrap them in 21.11.
> > + The command line arguments to ``rte_eal_init`` will change from
> > + ``-b, --pci-blacklist`` to ``-x, --exclude`` and
> > + ``-w, --pci-whitelist`` to ``-i, --include``.
> > + The old command line arguments will continue to be accepted in 20.11
> > + but will cause a runtime error message.
> > --
>
> Error message, or warning message?
Some message to standard error and keep going.
>
> Overall, though
>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
* [dpdk-dev] DPDK Release Status Meeting 30/07/2020
@ 2020-07-30 10:37 3% Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-30 10:37 UTC (permalink / raw)
To: dev; +Cc: Thomas Monjalon
Minutes 30 July 2020
--------------------
Agenda:
* Release Dates
* -rc2 status
* Subtrees
* LTS
* Opens
Participants:
* Arm
* Broadcom
* Debian/Microsoft
* Intel
* Marvell
* Nvidia
* NXP
* Red Hat
Release Dates
-------------
* v20.08 dates:
* -rc3 pushed to *Thursday, 30 July 2020*
* -rc4: Wednesday, 5 August 2020
* Release Friday, 7 August 2020
* v20.11 proposal dates, please comment:
* Proposal/V1: Wednesday, 2 September 2020
* -rc1: Wednesday, 30 September 2020
* -rc2: Friday, 16 October 2020
* Release: Friday, 6 November 2020
* Please remember to send roadmap for 20.11.
-rc2 status
-----------
* Intel finished testing, mostly good except two critical defects
* A driver TSO issue, fix merged
* vhost performance drop
* Maxime's patch verified, will be merged for -rc3
* Interrupt issue will be worked in next release
Subtrees
--------
* main
* Worrying that there is no more fixes for libraries
* Will close the -rc3 today
* More review required for examples, please help
* https://patches.dpdk.org/project/dpdk/list/?q=example
* -rc4 will be for documentation patches
* Need to finalize the deprecation notices
* Please check the ABI improvements for 20.11, and send
required deprecation notices.
* There are already lots of deprecation notice for 20.11
It is worrying that 20.11 will be too big
* Data hiding and struct splitting already planned for ethdev
Same can be applied for other device abstraction layers,
Please check and send deprecation notices.
* John will be back next week, may support for final doc reviews
* There is a public holiday on Monday in Ireland
* Please all maintainers cleanup the patchwork before release
* next-net
* Pulled from vendor sub-trees and some fixes merged
* There is a last minute bnxt patch, will check for -rc3
* No more patches expected for release
* next-crypto
* Pull request merged
* There are deprecation notices in backlog
* next-eventdev
* Pull request merged
* There can be a few critical patches for -rc3
* May send another pull request, will close today
* next-virtio
* Performance fix and some other fixes merged
* Pulled by next-net, no more patch expected for release
* next-net-mlx, next-net-intel
* Pulled for -rc3
* next-net-mrvl
* Qede patches postponed to next release
* No more patch expected for release
* next-net-brcm
* There is a patchset merged in sub-tree
* Ajit will check if they are fixes or can be postponed
LTS
---
* 19.11.4 work is going on
* Some patches are backported, email sent, please review
* 18.11.10 work is going on
* Some patches are backported
Opens
-----
* Bugzilla
* Nothing critical for the release
* Three new defects to Intel, some are to old releases
* There are open defects that need attention
* Intel pulled some defects to internal to address them
* No visible change in the Bugzilla
DPDK Release Status Meetings
============================
The DPDK Release Status Meeting is intended for DPDK Committers to discuss the
status of the master tree and sub-trees, and for project managers to track
progress or milestone dates.
The meeting occurs every Thursday at 8:30 UTC on https://meet.jit.si/DPDK
If you wish to attend just send an email to
"John McNamara <john.mcnamara@intel.com>" for the invite.
* Re: [dpdk-dev] [PATCH v2 20.08 4/6] doc: announce deprecation blacklist/whitelist
@ 2020-07-30 8:45 3% ` Bruce Richardson
2020-07-30 15:10 0% ` Stephen Hemminger
0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-30 8:45 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
On Wed, Jul 29, 2020 at 05:58:02PM -0700, Stephen Hemminger wrote:
> Announce upcoming changes for 20.11.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> doc/guides/rel_notes/deprecation.rst | 21 +++++++++++++++++++++
> 1 file changed, 21 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 7c60779f3e68..abfec0aeaa4b 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -165,3 +165,24 @@ Deprecation Notices
>
> The ``master-lcore`` argument to testpmd will be replaced
> with ``initial-lcore``.
> +
> +* eal: The terms blacklist and whitelist to describe devices used
> + by DPDK will be replaced in the 20.11 relase.
> + This will apply to command line arguments as well as macros.
> +
> + The macro ``RTE_DEV_BLACKLISTED`` will be replaced with ``RTE_DEV_EXCLUDED``
> + and ``RTE_DEV_WHITELISTED`` will be replaced with ``RTE_DEV_INCLUDED``
> + ``RTE_BUS_SCAN_BLACKLIST`` and ``RTE_BUS_SCAN_WHITELIST`` will be
> + replaced with ``RTE_BUS_SCAN_EXCLUDED`` and ``RTE_BUS_SCAN_INCLUDED``
> + respectively. Likewise ``RTE_DEVTYPE_BLACKLISTED_PCI`` and
> + ``RTE_DEVTYPE_WHITELISTED_PCI`` will be replaced with
> + ``RTE_DEVTYPE_EXCLUDED`` and ``RTE_DEVTYPE_INCLUDED``.
> +
> + The old macros will be marked as deprecated in 20.11 and removed
> + in the 21.11 release.
> +
Since these are macros and therefore not part of the ABI, I think we can
remove them sooner than 21.11. Therefore, similar to the previous patch, can
we just use "future" release rather than 21.11?
> + The command line arguments to ``rte_eal_init`` will change from
> + ``-b, --pci-blacklist`` to ``-x, --exclude`` and
> + ``-w, --pci-whitelist`` to ``-i, --include``.
> + The old command line arguments will continue to be accepted in 20.11
> + but will cause a runtime error message.
> --
Error message, or warning message?
Overall, though
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
* Re: [dpdk-dev] [PATCH v2 20.08 1/6] doc: announce deprecation of master lcore
@ 2020-07-30 8:42 3% ` Bruce Richardson
0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2020-07-30 8:42 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
On Wed, Jul 29, 2020 at 05:57:59PM -0700, Stephen Hemminger wrote:
> Announce upcoming changes related to master/slave in reference
> to lcore.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> doc/guides/rel_notes/deprecation.rst | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index 99c98062ffc2..7c60779f3e68 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -148,3 +148,20 @@ Deprecation Notices
> Python 2 support will be completely removed in 20.11.
> In 20.08, explicit deprecation warnings will be displayed when running
> scripts with Python 2.
> +
> +* eal: To be more inclusive in choice of naming, the DPDK project
> + will replace uses of master/slave in the API's and command line arguments.
> +
> + References to master/slave in relation to lcore will be renamed
> + to initial/worker. The function ``rte_get_master_lcore()``
> + will be renamed to ``rte_get_initial_lcore()``.
> + For the 20.11, release both names will be present and the
> + old function will be marked with the deprecated tag.
> + The old function will be removed in 21.11 version.
Since 20.11 is going to be ABI incompatible with previous versions anyway,
can we not have the function as a macro alias for the new, allowing us to
remove the old sooner without breaking any apps? Even if this is not the
case, I think it might be better to change "21.11 version" to "a future
release" to allow us some flexibility on when to remove them in case we can
remove them sooner.
> +
> + The iterator for worker lcores will also change:
> + ``RTE_LCORE_FOREACH_SLAVE`` will be replaced with
> + ``RTE_LCORE_FOREACH_WORKER``.
> +
> + The ``master-lcore`` argument to testpmd will be replaced
> + with ``initial-lcore``.
> --
With the above comment.
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
* Re: [dpdk-dev] [PATCH] [RFC] doc: announce removal of crypto list end enumerators
@ 2020-07-29 15:18 3% ` Bruce Richardson
0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2020-07-29 15:18 UTC (permalink / raw)
To: Arek Kusztal
Cc: dev, akhil.goyal, fiona.trahe, anoobj, shallyv, declan.doherty,
roy.fan.zhang, konstantin.ananyev
On Wed, Jul 29, 2020 at 04:46:51PM +0200, Arek Kusztal wrote:
> Enumerators RTE_CRYPTO_CIPHER_LIST_END, RTE_CRYPTO_AUTH_LIST_END,
> RTE_CRYPTO_AEAD_LIST_END will be removed to prevent some problems
> that may arise when adding new algorithms.
>
> Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Since these seem to cause us ABI problems:
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
* Re: [dpdk-dev] [PATCH v3] lib/librte_timer:fix corruption with reset
2020-07-10 15:19 3% ` Stephen Hemminger
@ 2020-07-28 19:04 3% ` Carrillo, Erik G
1 sibling, 0 replies; 200+ results
From: Carrillo, Erik G @ 2020-07-28 19:04 UTC (permalink / raw)
To: Sarosh Arif, dev, rsanford; +Cc: h.mikita89, Stephen Hemminger
Hi Sarosh,
Some comments in-line:
> -----Original Message-----
> From: Sarosh Arif <sarosh.arif@emumba.com>
> Sent: Friday, July 10, 2020 2:00 AM
> To: rsanford@akamai.com; Carrillo, Erik G <erik.g.carrillo@intel.com>;
> dev@dpdk.org
> Cc: stable@dpdk.org; Sarosh Arif <sarosh.arif@emumba.com>;
> h.mikita89@gmail.com
> Subject: [PATCH v3] lib/librte_timer:fix corruption with reset
The subject is misleading - perhaps wording closer to the title of the Bugzilla bug would be more helpful.
>
> If the user tries to reset/stop some other timer in it's callback function, which
> is also about to expire, using rte_timer_reset_sync/rte_timer_stop_sync the
> application goes into an infinite loop. This happens because
> rte_timer_reset_sync/rte_timer_stop_sync loop until the timer resets/stops
> and there is check inside timer_set_config_state which prevents a running
> timer from being reset/stopped by not it's own timer_cb. Therefore
> timer_set_config_state returns -1 due to which rte_timer_reset returns -1
> and rte_timer_reset_sync goes into an infinite loop.
>
> The soloution to this problem is to return -1 from
> rte_timer_reset_sync/rte_timer_stop_sync in case the user tries to
> reset/stop some other timer in it's callback function.
>
> Bugzilla ID: 491
> Fixes: 20d159f20543 ("timer: fix corruption with reset")
> Cc: h.mikita89@gmail.com
> Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com>
> ---
> v2: remove line continuations
> v3: separate code and declarations
> ---
> lib/librte_timer/rte_timer.c | 26 ++++++++++++++++++++++++--
> lib/librte_timer/rte_timer.h | 4 ++--
> 2 files changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/lib/librte_timer/rte_timer.c b/lib/librte_timer/rte_timer.c index
> 6d19ce469..0cd3e2c86 100644
> --- a/lib/librte_timer/rte_timer.c
> +++ b/lib/librte_timer/rte_timer.c
> @@ -576,14 +576,24 @@ rte_timer_alt_reset(uint32_t timer_data_id, struct
> rte_timer *tim, }
>
> /* loop until rte_timer_reset() succeed */ -void
> +int
> rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
> enum rte_timer_type type, unsigned tim_lcore,
> rte_timer_cb_t fct, void *arg)
> {
> + struct rte_timer_data *timer_data;
> + TIMER_DATA_VALID_GET_OR_ERR_RET(default_data_id,
> timer_data, -EINVAL);
> +
> + if (tim->status.state == RTE_TIMER_RUNNING &&
> + (tim->status.owner != (uint16_t)tim_lcore ||
> + tim != timer_data->priv_timer[tim_lcore].running_tim))
> + return -1;
> +
As I understand it, Bugzilla 491 describes two scenarios where a hang can occur:
1. A timer's callback tries to synchronously reset/stop another timer in the same run list
2. A timer's callback tries to synchronously reset/stop another timer in a different run list whose lcore happens to be running a timer callback that is synchronously resetting/stopping a timer in the first run list
The if condition from the patch above can be broken up as:
(tim->status.state == RTE_TIMER_RUNNING && tim->status.owner == (uint16_t)lcore_id && tim != timer_data->priv_timer[lcore_id].running_tim)
And
(tim->status.state == RTE_TIMER_RUNNING && tim->status.owner != (uint16_t)lcore_id)
This second condition could be transient and doesn't necessarily identify scenario (2) above. In this case, the *_sync() calls could fail unnecessarily.
Offhand, I'm not seeing a way to more precisely detect scenario 2 above. I'm wondering if some kind of a timeout parameter could be added to avoid hanging instead. Thoughts?
As Stephen mentioned in another response, it looks like this will require an API change. I believe this can be announced in the next release via doc/guides/rel_notes/deprecation.rst. Then, the new API can be added in the next ABI-breaking release, possibly with versioned symbols (http://doc.dpdk.org/guides/contributing/abi_versioning.html#versioning-macros).
Thanks,
Erik
> while (rte_timer_reset(tim, ticks, type, tim_lcore,
> fct, arg) != 0)
> rte_pause();
> +
> + return 0;
> }
>
> static int
> @@ -642,11 +652,23 @@ rte_timer_alt_stop(uint32_t timer_data_id, struct
> rte_timer *tim) }
>
> /* loop until rte_timer_stop() succeed */ -void
> +int
> rte_timer_stop_sync(struct rte_timer *tim) {
> + struct rte_timer_data *timer_data;
> + unsigned int lcore_id = rte_lcore_id();
> +
> + TIMER_DATA_VALID_GET_OR_ERR_RET(default_data_id,
> timer_data, -EINVAL);
> +
> + if (tim->status.state == RTE_TIMER_RUNNING &&
> + (tim->status.owner != (uint16_t)lcore_id ||
> + tim != timer_data->priv_timer[lcore_id].running_tim))
> + return -1;
> +
> while (rte_timer_stop(tim) != 0)
> rte_pause();
> +
> + return 0;
> }
>
> /* Test the PENDING status of the timer handle tim */ diff --git
> a/lib/librte_timer/rte_timer.h b/lib/librte_timer/rte_timer.h index
> c6b3d450d..392ca423d 100644
> --- a/lib/librte_timer/rte_timer.h
> +++ b/lib/librte_timer/rte_timer.h
> @@ -275,7 +275,7 @@ int rte_timer_reset(struct rte_timer *tim, uint64_t
> ticks,
> * @param arg
> * The user argument of the callback function.
> */
> -void
> +int
> rte_timer_reset_sync(struct rte_timer *tim, uint64_t ticks,
> enum rte_timer_type type, unsigned tim_lcore,
> rte_timer_cb_t fct, void *arg);
> @@ -314,7 +314,7 @@ int rte_timer_stop(struct rte_timer *tim);
> * @param tim
> * The timer handle.
> */
> -void rte_timer_stop_sync(struct rte_timer *tim);
> +int rte_timer_stop_sync(struct rte_timer *tim);
>
> /**
> * Test if a timer is pending.
> --
> 2.17.1
* Re: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for bit definition
2020-07-28 9:29 0% ` Gaëtan Rivet
2020-07-28 11:11 0% ` Morten Brørup
@ 2020-07-28 15:39 0% ` Honnappa Nagarahalli
1 sibling, 0 replies; 200+ results
From: Honnappa Nagarahalli @ 2020-07-28 15:39 UTC (permalink / raw)
To: Gaëtan Rivet, Morten Brørup
Cc: Parav Pandit, dev, ferruh.yigit, thomas, Ray Kinsella,
Neil Horman, rasland, orika, matan, Joyce Kong, nd,
Honnappa Nagarahalli, nd
<snip>
>
> On 28/07/20 10:24 +0200, Morten Brørup wrote:
> > + Ray and Neil as ABI Policy maintainers.
> >
> > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > Sent: Tuesday, July 28, 2020 4:19 AM
> > >
> > > <snip>
> > > > >
> > > > > > Subject: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for
> > > bit
> > > > > definition
> > > > > >
> > > > > > There are several drivers which duplicate bit generation macro.
> > > > > > Introduce a generic bit macros so that such drivers avoid
> > > redefining
> > > > > same in
> > > > > > multiple drivers.
> > > > > >
> > > > > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > > > > Acked-by: Matan Azrad <matan@mellanox.com>
> > > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > > ---
> > > > > > Changelog:
> > > > > > v4->v5:
> > > > > > - Addressed comments from Morten Brørup
> > > > > > - Renamed newly added macro to RTE_BIT64
> > > > > > - Added doxygen comment section for the macro
> > > > > > v1->v2:
> > > > > > - Addressed comments from Thomas and Gaten.
> > > > > > - Avoided new file, added macro to rte_bitops.h
> > > > > > ---
> > > > > > lib/librte_eal/include/rte_bitops.h | 8 ++++++++
> > > > > > 1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_eal/include/rte_bitops.h
> > > > > > b/lib/librte_eal/include/rte_bitops.h
> > > > > > index 740927f3b..ca46a110f 100644
> > > > > > --- a/lib/librte_eal/include/rte_bitops.h
> > > > > > +++ b/lib/librte_eal/include/rte_bitops.h
> > > > > > @@ -17,6 +17,14 @@
> > > > > > #include <rte_debug.h>
> > > > > > #include <rte_compat.h>
> > > > > >
> > > > > > +/**
> > > > > > + * Get the uint64_t value for a specified bit set.
> > > > > > + *
> > > > > > + * @param nr
> > > > > > + * The bit number in range of 0 to 63.
> > > > > > + */
> > > > > > +#define RTE_BIT64(nr) (UINT64_C(1) << (nr))
> > > > > In general, the macros have been avoided in this file. Suggest
> > > > > changing this to an inline function.
> > > >
> > > > That has been discussed already, and rejected for good reasons:
> > > >
> http://inbox.dpdk.org/dev/AM0PR05MB4866823B0170B90F679A2765D1640@
> > > > AM0PR05MB4866.eurprd05.prod.outlook.com/
> > > Thank you for the link.
> > > In this patch series, I see the macro being used in enum
> > > initialization
> > > (7/10 in v11) as well as in functions (8/10 in v11). Does it make
> > > sense to introduce use inline functions and use the inline functions
> > > for 8/10?
> > > If we do this, we should document in rte_bitops.h that inline
> > > functions should be used wherever possible.
> >
> > I would agree, but only in theory. I disagree in reality, and argue that there
> should only be macros for this. Here is why:
> >
> > rte_byteorder.h has both RTE_BEnn() macros and rte_cpu_to_be_nn()
> functions, for doing the same thing at compile time or at run time. There are
> no compile time warnings if the wrong one is being used, so I am certain that
> we can find code that uses the macro where the function should be used, or
> vice versa.
Agree, there is not a suitable way to enforce the use of one over the other (other than code review).
When the APIs in rte_bitops.h were introduced, there was a discussion around using macros. I was for using macros, as that would have kept the code, as well as the number of APIs, smaller. However, the decision was made not to use macros and instead provide inline functions. It had nothing to do with performance. So, I am just saying that we need to follow the same principles, at least for this file.
> >
>
> Hi,
>
> It is not clear to me, reading this thread, what is the motivation to enforce
> use of inline functions? Is it perf, compiler type checking, or usage checks?
>
> Macros are checked at compile time when possible, though it can be
> improved upon. But I agree with Morten, proposing two forms ensures devs
> will sometimes use the wrong one, and we would need a practical way to
> check usages.
>
> > Which opens another, higher level, question: Would it be possible to add a
> compile time check macro in rte_common.h for these and similar?
> >
>
> Can you clarify your idea? Is it something similar to:
>
> #define _BIT64(n) (UINT64_C(1) << (n))
> static inline uint64_t
> bit64(uint64_t n)
> {
> assert(n < 64);
> return (UINT64_C(1) << n);
> }
> /* Integer Constant Expression? */
> #define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))
> #define BIT64(n) (ICE_P(n) ? _BIT64(n) : bit64(n))
>
> I don't think so, but this is as close as automatic compile-time check and
> automatic use of proper macro vs. function I know of, did you have something
> else in mind?
>
> In this kind of code:
>
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
> #include <assert.h>
>
> enum vals {
> ZERO = 0,
> ONE = BIT64(1),
> TWO = BIT64(2),
> THREE = BIT64(3),
> };
>
> int main(void)
> {
> uint64_t x = ONE;
>
> x = BIT64(0);
> x = BIT64(1);
> x = BIT64(60);
> x = BIT64(64);
> x = BIT64(x);
>
> printf("x: 0x%" PRIx64 "\n", x);
>
> return 0;
> }
>
> The enum is defined using the macro, x = BIT64(64); triggers the following
> warning with GCC:
>
> constant_bitop.c:6:32: warning: left shift count >= width of type [-Wshift-
> count-overflow]
> 6 | #define _BIT64(n) (UINT64_C(1) << (n))
>
> and x = BIT64(x); triggers the assert() at runtime.
>
> > Furthermore: For the RTE_BITnn() operations in this patch set, I expect the
> compiler to generate perfectly efficient code using the macro for run time use.
> I.e. there would be no performance advantage by also implementing the
> macros as functions for run time use.
> >
>
> Regards,
> --
> Gaëtan
* Re: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for bit definition
2020-07-28 9:29 0% ` Gaëtan Rivet
@ 2020-07-28 11:11 0% ` Morten Brørup
2020-07-28 15:39 0% ` Honnappa Nagarahalli
1 sibling, 0 replies; 200+ results
From: Morten Brørup @ 2020-07-28 11:11 UTC (permalink / raw)
To: Gaëtan Rivet
Cc: Honnappa Nagarahalli, Parav Pandit, dev, ferruh.yigit, thomas,
Ray Kinsella, Neil Horman, rasland, orika, matan, Joyce Kong, nd
> From: Gaëtan Rivet [mailto:grive@u256.net]
> Sent: Tuesday, July 28, 2020 11:29 AM
>
> On 28/07/20 10:24 +0200, Morten Brørup wrote:
> > + Ray and Neil as ABI Policy maintainers.
> >
> > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > Sent: Tuesday, July 28, 2020 4:19 AM
> > >
> > > <snip>
> > > > >
> > > > > > Subject: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro
> for
> > > bit
> > > > > definition
> > > > > >
> > > > > > There are several drivers which duplicate bit generation
> macro.
> > > > > > Introduce a generic bit macros so that such drivers avoid
> > > redefining
> > > > > same in
> > > > > > multiple drivers.
> > > > > >
> > > > > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > > > > Acked-by: Matan Azrad <matan@mellanox.com>
> > > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > > ---
> > > > > > Changelog:
> > > > > > v4->v5:
> > > > > > - Addressed comments from Morten Brørup
> > > > > > - Renamed newly added macro to RTE_BIT64
> > > > > > - Added doxygen comment section for the macro
> > > > > > v1->v2:
> > > > > > - Addressed comments from Thomas and Gaten.
> > > > > > - Avoided new file, added macro to rte_bitops.h
> > > > > > ---
> > > > > > lib/librte_eal/include/rte_bitops.h | 8 ++++++++
> > > > > > 1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_eal/include/rte_bitops.h
> > > > > > b/lib/librte_eal/include/rte_bitops.h
> > > > > > index 740927f3b..ca46a110f 100644
> > > > > > --- a/lib/librte_eal/include/rte_bitops.h
> > > > > > +++ b/lib/librte_eal/include/rte_bitops.h
> > > > > > @@ -17,6 +17,14 @@
> > > > > > #include <rte_debug.h>
> > > > > > #include <rte_compat.h>
> > > > > >
> > > > > > +/**
> > > > > > + * Get the uint64_t value for a specified bit set.
> > > > > > + *
> > > > > > + * @param nr
> > > > > > + * The bit number in range of 0 to 63.
> > > > > > + */
> > > > > > +#define RTE_BIT64(nr) (UINT64_C(1) << (nr))
> > > > > In general, the macros have been avoided in this file. Suggest
> > > > > changing this to an inline function.
> > > >
> > > > That has been discussed already, and rejected for good reasons:
> > > > http://inbox.dpdk.org/dev/AM0PR05MB4866823B0170B90F679A2765D1640@
> > > > AM0PR05MB4866.eurprd05.prod.outlook.com/
> > > Thank you for the link.
> > > In this patch series, I see the macro being used in enum
> initialization
> > > (7/10 in v11) as well as in functions (8/10 in v11). Does it make
> sense
> > > to introduce inline functions and use them for
> > > 8/10?
> > > If we do this, we should document in rte_bitops.h that inline
> functions
> > > should be used wherever possible.
> >
> > I would agree, but only in theory. I disagree in reality, and argue
> that there should only be macros for this. Here is why:
> >
> > rte_byteorder.h has both RTE_BEnn() macros and rte_cpu_to_be_nn()
> functions, for doing the same thing at compile time or at run time.
> There are no compile time warnings if the wrong one is being used, so I
> am certain that we can find code that uses the macro where the function
> should be used, or vice versa.
> >
>
> Hi,
>
> It is not clear to me, reading this thread, what is the motivation to
> enforce use of inline functions? Is it perf, compiler type checking, or
> usage checks?
>
> Macros are checked at compile time when possible, though it can be
> improved upon. But I agree with Morten, proposing two forms ensures
> devs
> will sometimes use the wrong one, and we would need a practical way to
> check usages.
>
> > Which opens another, higher level, question: Would it be possible to
> add a compile time check macro in rte_common.h for these and similar?
> >
>
> Can you clarify your idea? Is it something similar to:
>
> #define _BIT64(n) (UINT64_C(1) << (n))
> static inline uint64_t
> bit64(uint64_t n)
> {
> assert(n < 64);
> return (UINT64_C(1) << n);
> }
> /* Integer Constant Expression? */
> #define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) :
> (int*)1)))
> #define BIT64(n) (ICE_P(n) ? _BIT64(n) : bit64(n))
>
> I don't think so, but this is as close as automatic compile-time check
> and automatic use of proper macro vs. function I know of, did you have
> something else in mind?
I was only thinking of adding a compile time warning if the function was being used where the macro should be used, and vice versa.
Your proposed solution for automatic use of the function or macro is even better. Thank you! And it could be used in rte_byteorder.h too.
But as I mentioned, it is a higher level discussion, so for this patch, let's settle with the macro as already provided by Parav. And the higher level discussion about how to do this generally in DPDK libraries, where both macros and functions for the same calculation are provided, can be resumed later.
>
> In this kind of code:
>
> #include <stdio.h>
> #include <stdint.h>
> #include <inttypes.h>
> #include <assert.h>
>
> enum vals {
> ZERO = 0,
> ONE = BIT64(1),
> TWO = BIT64(2),
> THREE = BIT64(3),
> };
>
> int main(void)
> {
> uint64_t x = ONE;
>
> x = BIT64(0);
> x = BIT64(1);
> x = BIT64(60);
> x = BIT64(64);
> x = BIT64(x);
>
> printf("x: 0x%" PRIx64 "\n", x);
>
> return 0;
> }
>
> The enum is defined using the macro, x = BIT64(64); triggers the
> following warning with GCC:
>
> constant_bitop.c:6:32: warning: left shift count >= width of type [-
> Wshift-count-overflow]
> 6 | #define _BIT64(n) (UINT64_C(1) << (n))
>
> and x = BIT64(x); triggers the assert() at runtime.
>
> > Furthermore: For the RTE_BITnn() operations in this patch set, I
> expect the compiler to generate perfectly efficient code using the
> macro for run time use. I.e. there would be no performance advantage by
> also implementing the macros as functions for run time use.
> >
>
> Regards,
> --
> Gaëtan
* Re: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for bit definition
2020-07-28 8:24 3% ` Morten Brørup
@ 2020-07-28 9:29 0% ` Gaëtan Rivet
2020-07-28 11:11 0% ` Morten Brørup
2020-07-28 15:39 0% ` Honnappa Nagarahalli
0 siblings, 2 replies; 200+ results
From: Gaëtan Rivet @ 2020-07-28 9:29 UTC (permalink / raw)
To: Morten Brørup
Cc: Honnappa Nagarahalli, Parav Pandit, dev, ferruh.yigit, thomas,
Ray Kinsella, Neil Horman, rasland, orika, matan, Joyce Kong, nd
On 28/07/20 10:24 +0200, Morten Brørup wrote:
> + Ray and Neil as ABI Policy maintainers.
>
> > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > Sent: Tuesday, July 28, 2020 4:19 AM
> >
> > <snip>
> > > >
> > > > > Subject: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for
> > bit
> > > > definition
> > > > >
> > > > > There are several drivers which duplicate bit generation macro.
> > > > > Introduce a generic bit macros so that such drivers avoid
> > redefining
> > > > same in
> > > > > multiple drivers.
> > > > >
> > > > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > > > Acked-by: Matan Azrad <matan@mellanox.com>
> > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > ---
> > > > > Changelog:
> > > > > v4->v5:
> > > > > - Addressed comments from Morten Brørup
> > > > > - Renamed newly added macro to RTE_BIT64
> > > > > - Added doxygen comment section for the macro
> > > > > v1->v2:
> > > > > - Addressed comments from Thomas and Gaten.
> > > > > - Avoided new file, added macro to rte_bitops.h
> > > > > ---
> > > > > lib/librte_eal/include/rte_bitops.h | 8 ++++++++
> > > > > 1 file changed, 8 insertions(+)
> > > > >
> > > > > diff --git a/lib/librte_eal/include/rte_bitops.h
> > > > > b/lib/librte_eal/include/rte_bitops.h
> > > > > index 740927f3b..ca46a110f 100644
> > > > > --- a/lib/librte_eal/include/rte_bitops.h
> > > > > +++ b/lib/librte_eal/include/rte_bitops.h
> > > > > @@ -17,6 +17,14 @@
> > > > > #include <rte_debug.h>
> > > > > #include <rte_compat.h>
> > > > >
> > > > > +/**
> > > > > + * Get the uint64_t value for a specified bit set.
> > > > > + *
> > > > > + * @param nr
> > > > > + * The bit number in range of 0 to 63.
> > > > > + */
> > > > > +#define RTE_BIT64(nr) (UINT64_C(1) << (nr))
> > > > In general, the macros have been avoided in this file. Suggest
> > > > changing this to an inline function.
> > >
> > > That has been discussed already, and rejected for good reasons:
> > > http://inbox.dpdk.org/dev/AM0PR05MB4866823B0170B90F679A2765D1640@
> > > AM0PR05MB4866.eurprd05.prod.outlook.com/
> > Thank you for the link.
> > In this patch series, I see the macro being used in enum initialization
> > (7/10 in v11) as well as in functions (8/10 in v11). Does it make sense
> > to introduce inline functions and use them for
> > 8/10?
> > If we do this, we should document in rte_bitops.h that inline functions
> > should be used wherever possible.
>
> I would agree, but only in theory. I disagree in reality, and argue that there should only be macros for this. Here is why:
>
> rte_byteorder.h has both RTE_BEnn() macros and rte_cpu_to_be_nn() functions, for doing the same thing at compile time or at run time. There are no compile time warnings if the wrong one is being used, so I am certain that we can find code that uses the macro where the function should be used, or vice versa.
>
Hi,
It is not clear to me, reading this thread, what is the motivation to
enforce use of inline functions? Is it perf, compiler type checking, or
usage checks?
Macros are checked at compile time when possible, though it can be
improved upon. But I agree with Morten, proposing two forms ensures devs
will sometimes use the wrong one, and we would need a practical way to
check usages.
> Which opens another, higher level, question: Would it be possible to add a compile time check macro in rte_common.h for these and similar?
>
Can you clarify your idea? Is it something similar to:
#define _BIT64(n) (UINT64_C(1) << (n))
static inline uint64_t
bit64(uint64_t n)
{
assert(n < 64);
return (UINT64_C(1) << n);
}
/* Integer Constant Expression? */
#define ICE_P(x) (sizeof(int) == sizeof(*(1 ? ((void*)((x) * 0l)) : (int*)1)))
#define BIT64(n) (ICE_P(n) ? _BIT64(n) : bit64(n))
I don't think so, but this is as close to an automatic compile-time check
and automatic selection of the proper macro vs. function as I know of; did
you have something else in mind?
In this kind of code:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <assert.h>
enum vals {
ZERO = 0,
ONE = BIT64(1),
TWO = BIT64(2),
THREE = BIT64(3),
};
int main(void)
{
uint64_t x = ONE;
x = BIT64(0);
x = BIT64(1);
x = BIT64(60);
x = BIT64(64);
x = BIT64(x);
printf("x: 0x%" PRIx64 "\n", x);
return 0;
}
The enum is defined using the macro, x = BIT64(64); triggers the
following warning with GCC:
constant_bitop.c:6:32: warning: left shift count >= width of type [-Wshift-count-overflow]
6 | #define _BIT64(n) (UINT64_C(1) << (n))
and x = BIT64(x); triggers the assert() at runtime.
> Furthermore: For the RTE_BITnn() operations in this patch set, I expect the compiler to generate perfectly efficient code using the macro for run time use. I.e. there would be no performance advantage by also implementing the macros as functions for run time use.
>
Regards,
--
Gaëtan
* Re: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for bit definition
@ 2020-07-28 8:24 3% ` Morten Brørup
2020-07-28 9:29 0% ` Gaëtan Rivet
0 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2020-07-28 8:24 UTC (permalink / raw)
To: Honnappa Nagarahalli, Parav Pandit, dev, grive, ferruh.yigit,
thomas, Ray Kinsella, Neil Horman
Cc: rasland, orika, matan, Joyce Kong, nd, nd
+ Ray and Neil as ABI Policy maintainers.
> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Tuesday, July 28, 2020 4:19 AM
>
> <snip>
> > >
> > > > Subject: [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for
> bit
> > > definition
> > > >
> > > > There are several drivers which duplicate bit generation macro.
> > > > Introduce a generic bit macros so that such drivers avoid
> redefining
> > > same in
> > > > multiple drivers.
> > > >
> > > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > > Acked-by: Matan Azrad <matan@mellanox.com>
> > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > ---
> > > > Changelog:
> > > > v4->v5:
> > > > - Addressed comments from Morten Brørup
> > > > - Renamed newly added macro to RTE_BIT64
> > > > - Added doxygen comment section for the macro
> > > > v1->v2:
> > > > - Addressed comments from Thomas and Gaten.
> > > > - Avoided new file, added macro to rte_bitops.h
> > > > ---
> > > > lib/librte_eal/include/rte_bitops.h | 8 ++++++++
> > > > 1 file changed, 8 insertions(+)
> > > >
> > > > diff --git a/lib/librte_eal/include/rte_bitops.h
> > > > b/lib/librte_eal/include/rte_bitops.h
> > > > index 740927f3b..ca46a110f 100644
> > > > --- a/lib/librte_eal/include/rte_bitops.h
> > > > +++ b/lib/librte_eal/include/rte_bitops.h
> > > > @@ -17,6 +17,14 @@
> > > > #include <rte_debug.h>
> > > > #include <rte_compat.h>
> > > >
> > > > +/**
> > > > + * Get the uint64_t value for a specified bit set.
> > > > + *
> > > > + * @param nr
> > > > + * The bit number in range of 0 to 63.
> > > > + */
> > > > +#define RTE_BIT64(nr) (UINT64_C(1) << (nr))
> > > In general, the macros have been avoided in this file. Suggest
> > > changing this to an inline function.
> >
> > That has been discussed already, and rejected for good reasons:
> > http://inbox.dpdk.org/dev/AM0PR05MB4866823B0170B90F679A2765D1640@
> > AM0PR05MB4866.eurprd05.prod.outlook.com/
> Thank you for the link.
> In this patch series, I see the macro being used in enum initialization
> (7/10 in v11) as well as in functions (8/10 in v11). Does it make sense
> to introduce use inline functions and use the inline functions for
> 8/10?
> If we do this, we should document in rte_bitops.h that inline functions
> should be used wherever possible.
I would agree, but only in theory. I disagree in reality, and argue that there should only be macros for this. Here is why:
rte_byteorder.h has both RTE_BEnn() macros and rte_cpu_to_be_nn() functions, for doing the same thing at compile time or at run time. There are no compile time warnings if the wrong one is being used, so I am certain that we can find code that uses the macro where the function should be used, or vice versa.
Which opens another, higher level, question: Would it be possible to add a compile time check macro in rte_common.h for these and similar?
Furthermore: For the RTE_BITnn() operations in this patch set, I expect the compiler to generate perfectly efficient code using the macro for run time use. I.e. there would be no performance advantage by also implementing the macros as functions for run time use.
> >
> > > Also, this file has uses of this macro, it would be good to replace
> > > them with the new inline function.
> >
> > Makes sense.
> > And for consistency, it would require adding an RTE_BIT32() macro
> too.
* Re: [dpdk-dev] [PATCH] net/dpaa: announce extended definition of port_id in API 'rte_pmd_dpaa_set_tx_loopback'
@ 2020-07-23 14:34 4% ` Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-23 14:34 UTC (permalink / raw)
To: Yang, Zhiyong, Sachin Saxena (OSS), dev
On 7/23/2020 10:23 AM, Yang, Zhiyong wrote:
>
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Sachin Saxena (OSS)
> Sent: Tuesday, July 14, 2020 7:33 PM
> To: dev@dpdk.org; Yigit, Ferruh <ferruh.yigit@intel.com>
> Subject: [dpdk-dev] [PATCH] net/dpaa: announce extended definition of port_id in API 'rte_pmd_dpaa_set_tx_loopback'
>
> From: Sachin Saxena <sachin.saxena@oss.nxp.com>
>
> Signed-off-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
>
> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
> Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
>
Applied to dpdk-next-net/master, thanks.
Updated commit log as below:
doc: announce dpaa specific API parameter change
'port_id' storage size should be 'uint16_t'; the API
'rte_pmd_dpaa_set_tx_loopback()' has it as 'uint8_t', but fixing it is an
ABI breakage, so the fix is planned for the v20.11 release where ABI
breakage is allowed.
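The effect of the narrow parameter can be illustrated with a minimal sketch (a hypothetical stub, not the real dpaa PMD implementation; only the widened 'port_id' type matters here). With the old 'uint8_t port_id' prototype, any port id >= 256 was silently truncated at the call boundary before the driver ever saw it:

```c
#include <stdint.h>

/* Hypothetical stub standing in for the dpaa PMD implementation.
 * The pre-change prototype was:
 *     int rte_pmd_dpaa_set_tx_loopback(uint8_t port_id, uint8_t on);
 * so a caller passing port_id = 300 would have the driver see 44.
 * With the 20.11 prototype below, the full 16-bit id arrives intact. */
int rte_pmd_dpaa_set_tx_loopback(uint16_t port_id, uint8_t on)
{
	(void)on;
	return (int)port_id; /* echo the id back to show no truncation */
}
```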
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
2020-07-23 1:55 5% ` Xu, Rosen
@ 2020-07-23 7:38 5% ` Hemant Agrawal
0 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2020-07-23 7:38 UTC (permalink / raw)
To: Xu, Rosen, Richardson, Bruce, dev; +Cc: Richardson, Bruce, Nipun Gupta
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
-----Original Message-----
From: Xu, Rosen <rosen.xu@intel.com>
Sent: Thursday, July 23, 2020 7:26 AM
To: Richardson, Bruce <bruce.richardson@intel.com>; dev@dpdk.org
Cc: Richardson, Bruce <bruce.richardson@intel.com>; Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal <hemant.agrawal@nxp.com>
Subject: RE: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
Importance: High
Hi,
-----Original Message-----
From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
Sent: Monday, July 13, 2020 8:31 PM
To: dev@dpdk.org
Cc: Richardson, Bruce <bruce.richardson@intel.com>; Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal <hemant.agrawal@nxp.com>
Subject: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
Add to the documentation for 20.08 a notice about the changes of rawdev APIs proposed by patchset [1].
[1] http://inbox.dpdk.org/dev/20200709152047.167730-1-bruce.richardson@intel.com/
Cc: Nipun Gupta <nipun.gupta@nxp.com>
Cc: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ead7cbe43..21b00103e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -117,6 +117,13 @@ Deprecation Notices
break the ABI checks, that is why change is planned for 20.11.
The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
+* rawdev: The rawdev APIs which take a device-specific structure as
+ parameter directly, or indirectly via a "private" pointer inside another
+ structure, will be modified to take an additional parameter of the
+ structure size. The affected APIs will include ``rte_rawdev_info_get``,
+ ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
+ ``rte_rawdev_queue_setup``.
+
* traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
ABI stable in the v19.11 release. The TM maintainer and other contributors have
agreed to keep the TM APIs as experimental in expectation of additional spec
--
2.25.1
Acked-by: Rosen Xu <rosen.xu@intel.com>
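The motivation for the extra size parameter can be sketched as follows (a hypothetical illustration, not the rawdev implementation; struct pmd_priv_info and rawdev_info_get_sketch() are invented names). When the caller passes the size of the private structure it was compiled against, the driver can copy only the fields both sides know about, so adding fields later does not break older callers:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device-private info a PMD might expose. */
struct pmd_priv_info {
	int num_queues;
	int has_feature_x;	/* imagine this field was added later */
};

/* Sketch of the announced API shape: the explicit size lets the
 * driver clamp the copy to what the caller's ABI version provides. */
static int
rawdev_info_get_sketch(void *dev_private, size_t priv_size)
{
	struct pmd_priv_info current = {
		.num_queues = 4,
		.has_feature_x = 1,
	};

	if (priv_size > sizeof(current))
		priv_size = sizeof(current);
	memcpy(dev_private, &current, priv_size);
	return 0;
}
```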
^ permalink raw reply [relevance 5%]
* Re: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
2020-07-13 12:31 5% ` [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs Bruce Richardson
2020-07-13 12:48 5% ` Hemant Agrawal
2020-07-20 11:35 0% ` Ananyev, Konstantin
@ 2020-07-23 1:55 5% ` Xu, Rosen
2020-07-23 7:38 5% ` Hemant Agrawal
2 siblings, 1 reply; 200+ results
From: Xu, Rosen @ 2020-07-23 1:55 UTC (permalink / raw)
To: Richardson, Bruce, dev; +Cc: Richardson, Bruce, Nipun Gupta, Hemant Agrawal
Hi,
-----Original Message-----
From: dev <dev-bounces@dpdk.org> On Behalf Of Bruce Richardson
Sent: Monday, July 13, 2020 8:31 PM
To: dev@dpdk.org
Cc: Richardson, Bruce <bruce.richardson@intel.com>; Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal <hemant.agrawal@nxp.com>
Subject: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
Add to the documentation for 20.08 a notice about the changes of rawdev APIs proposed by patchset [1].
[1] http://inbox.dpdk.org/dev/20200709152047.167730-1-bruce.richardson@intel.com/
Cc: Nipun Gupta <nipun.gupta@nxp.com>
Cc: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ead7cbe43..21b00103e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -117,6 +117,13 @@ Deprecation Notices
break the ABI checks, that is why change is planned for 20.11.
The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
+* rawdev: The rawdev APIs which take a device-specific structure as
+ parameter directly, or indirectly via a "private" pointer inside another
+ structure, will be modified to take an additional parameter of the
+ structure size. The affected APIs will include ``rte_rawdev_info_get``,
+ ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
+ ``rte_rawdev_queue_setup``.
+
* traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
ABI stable in the v19.11 release. The TM maintainer and other contributors have
agreed to keep the TM APIs as experimental in expectation of additional spec
--
2.25.1
Acked-by: Rosen Xu <rosen.xu@intel.com>
^ permalink raw reply [relevance 5%]
* [dpdk-dev] [dpdk-announce] release candidate 20.08-rc2
@ 2020-07-22 1:01 3% Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-22 1:01 UTC (permalink / raw)
To: announce
A new DPDK release candidate is ready for testing:
https://git.dpdk.org/dpdk/tag/?id=v20.08-rc2
There are 201 new patches in this snapshot.
Release notes:
http://doc.dpdk.org/guides/rel_notes/release_20_08.html
Highlights of 20.08-rc2:
- new mempool ring modes (RTS/HTS)
- new mlx5 regex driver
- warning when adding call to legacy atomic API
- warning when using Python 2
Some driver features have been dropped at the last minute,
but are candidates for 20.08-rc3.
Apart from those exceptions, -rc3 should include only bug fixes,
simple cleanups, doc and tooling.
We have one week to complete this milestone.
Then one more week (allowing -rc4) should be needed before the release.
Please test and report issues on bugs.dpdk.org.
As a community, we must close as many bugs as possible for -rc3.
Only two weeks are remaining to discuss API/ABI changes
allowed in the next major LTS branch (20.11).
Thank you everyone
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v2] lpm: fix unchecked return value
2020-07-21 17:10 3% ` Bruce Richardson
@ 2020-07-21 17:33 0% ` David Marchand
0 siblings, 0 replies; 200+ results
From: David Marchand @ 2020-07-21 17:33 UTC (permalink / raw)
To: Bruce Richardson, Medvedkin, Vladimir
Cc: Ruifeng Wang, dev, nd, Honnappa Nagarahalli, Phil Yang
On Tue, Jul 21, 2020 at 7:16 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Tue, Jul 21, 2020 at 05:23:02PM +0100, Medvedkin, Vladimir wrote:
> > Hi Ruifeng,
> >
> > On 18/07/2020 10:22, Ruifeng Wang wrote:
> > >
> > > > -----Original Message-----
> > > > From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> > > > Sent: Saturday, July 18, 2020 1:12 AM
> > > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Bruce Richardson
> > > > <bruce.richardson@intel.com>
> > > > Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
> > > > <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>
> > > > Subject: Re: [PATCH v2] lpm: fix unchecked return value
> > > >
> > > > Hi Ruifeng,
> > > >
> > > Hi Vladimir,
> > >
> > > > On 16/07/2020 16:49, Ruifeng Wang wrote:
> > > > > Coverity complains about unchecked return value of
> > > > rte_rcu_qsbr_dq_enqueue.
> > > > > By default, defer queue size is big enough to hold all tbl8 groups.
> > > > > When enqueue fails, return error to the user to indicate system issue.
> > > > >
> > > > > Coverity issue: 360832
> > > > > Fixes: 8a9f8564e9f9 ("lpm: implement RCU rule reclamation")
> > > > >
> > > > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > > ---
> > > > > v2:
> > > > > Converted return value to conform to LPM API convention. (Vladimir)
> > > > >
> > > > > lib/librte_lpm/rte_lpm.c | 19 +++++++++++++------
> > > > > 1 file changed, 13 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
> > > > > 2db9e16a2..757436f49 100644
> > > > > --- a/lib/librte_lpm/rte_lpm.c
> > > > > +++ b/lib/librte_lpm/rte_lpm.c
> > > > > @@ -532,11 +532,12 @@ tbl8_alloc(struct rte_lpm *lpm)
> > > > > return group_idx;
> > > > > }
> > > > >
> > > > > -static void
> > > > > +static int32_t
> > > > > tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
> > > > > {
> > > > > struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
> > > > > struct __rte_lpm *internal_lpm;
> > > > > + int status;
> > > > >
> > > > > internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
> > > > > if (internal_lpm->v == NULL) {
> > > > > @@ -552,9 +553,15 @@ tbl8_free(struct rte_lpm *lpm, uint32_t
> > > > tbl8_group_start)
> > > > > __ATOMIC_RELAXED);
> > > > > } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
> > > > > /* Push into QSBR defer queue. */
> > > > > - rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > > > > + status = rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > > > > (void *)&tbl8_group_start);
> > > > > + if (status == 1) {
> > > > > + RTE_LOG(ERR, LPM, "Failed to push QSBR FIFO\n");
> > > > > + return -rte_errno;
> > > > > + }
> > > > > }
> > > > > +
> > > > > + return 0;
> > > > > }
> > > > >
> > > > > static __rte_noinline int32_t
> > > > > @@ -1040,7 +1047,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t
> > > > ip_masked,
> > > > > #define group_idx next_hop
> > > > > uint32_t tbl24_index, tbl8_group_index, tbl8_group_start,
> > > > tbl8_index,
> > > > > tbl8_range, i;
> > > > > - int32_t tbl8_recycle_index;
> > > > > + int32_t tbl8_recycle_index, status = 0;
> > > > >
> > > > > /*
> > > > > * Calculate the index into tbl24 and range. Note: All depths
> > > > > larger @@ -1097,7 +1104,7 @@ delete_depth_big(struct rte_lpm *lpm,
> > > > uint32_t ip_masked,
> > > > > */
> > > > > lpm->tbl24[tbl24_index].valid = 0;
> > > > > __atomic_thread_fence(__ATOMIC_RELEASE);
> > > > > - tbl8_free(lpm, tbl8_group_start);
> > > > > + status = tbl8_free(lpm, tbl8_group_start);
> > > > > } else if (tbl8_recycle_index > -1) {
> > > > > /* Update tbl24 entry. */
> > > > > struct rte_lpm_tbl_entry new_tbl24_entry = { @@ -1113,10
> > > > +1120,10
> > > > > @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
> > > > > __atomic_store(&lpm->tbl24[tbl24_index],
> > > > &new_tbl24_entry,
> > > > > __ATOMIC_RELAXED);
> > > > > __atomic_thread_fence(__ATOMIC_RELEASE);
> > > > > - tbl8_free(lpm, tbl8_group_start);
> > > > > + status = tbl8_free(lpm, tbl8_group_start);
> > > > > }
> > > > > #undef group_idx
> > > > > - return 0;
> > > > > + return status;
> > > >
> > > > This will change rte_lpm_delete API. As a suggestion, you can leave it as it
> > > > was before ("return 0"), and send separate patch (with "return status)"
> > > > which will be targeted to 20.11.
> > > >
> > >
> > > Is the change of API because a variable is returned instead of constant?
> > > The patch passed ABI check on Travis: http://mails.dpdk.org/archives/test-report/2020-July/144864.html
> > > So I didn't know there is API/ABI issue.
> >
> >
> > Because new error status codes are returned. At the moment rte_lpm_delete()
> > returns only -EINVAL. After patches it will also return -ENOSPC. The user's
> > code may not handle this returned error status.
> >
> > On the other hand, from documentation about returned value:
> > "0 on success, negative value otherwise",
> > and given the fact that this behavior is only after calling
> > rte_lpm_rcu_qsbr_add(), I think we can accept this patch.
> > Bruce, please correct me.
> >
> That sounds reasonable to me. No change in the committed ABI, since the
> specific values are not called out.
>
I will take this as a second ack and merge this fix for rc2.
Thanks.
--
David Marchand
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] lpm: fix unchecked return value
2020-07-21 16:23 0% ` Medvedkin, Vladimir
@ 2020-07-21 17:10 3% ` Bruce Richardson
2020-07-21 17:33 0% ` David Marchand
0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-21 17:10 UTC (permalink / raw)
To: Medvedkin, Vladimir
Cc: Ruifeng Wang, dev, nd, Honnappa Nagarahalli, Phil Yang
On Tue, Jul 21, 2020 at 05:23:02PM +0100, Medvedkin, Vladimir wrote:
> Hi Ruifeng,
>
> On 18/07/2020 10:22, Ruifeng Wang wrote:
> >
> > > -----Original Message-----
> > > From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> > > Sent: Saturday, July 18, 2020 1:12 AM
> > > To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Bruce Richardson
> > > <bruce.richardson@intel.com>
> > > Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>
> > > Subject: Re: [PATCH v2] lpm: fix unchecked return value
> > >
> > > Hi Ruifeng,
> > >
> > Hi Vladimir,
> >
> > > On 16/07/2020 16:49, Ruifeng Wang wrote:
> > > > Coverity complains about unchecked return value of
> > > rte_rcu_qsbr_dq_enqueue.
> > > > By default, defer queue size is big enough to hold all tbl8 groups.
> > > > When enqueue fails, return error to the user to indicate system issue.
> > > >
> > > > Coverity issue: 360832
> > > > Fixes: 8a9f8564e9f9 ("lpm: implement RCU rule reclamation")
> > > >
> > > > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > ---
> > > > v2:
> > > > Converted return value to conform to LPM API convention. (Vladimir)
> > > >
> > > > lib/librte_lpm/rte_lpm.c | 19 +++++++++++++------
> > > > 1 file changed, 13 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
> > > > 2db9e16a2..757436f49 100644
> > > > --- a/lib/librte_lpm/rte_lpm.c
> > > > +++ b/lib/librte_lpm/rte_lpm.c
> > > > @@ -532,11 +532,12 @@ tbl8_alloc(struct rte_lpm *lpm)
> > > > return group_idx;
> > > > }
> > > >
> > > > -static void
> > > > +static int32_t
> > > > tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
> > > > {
> > > > struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
> > > > struct __rte_lpm *internal_lpm;
> > > > + int status;
> > > >
> > > > internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
> > > > if (internal_lpm->v == NULL) {
> > > > @@ -552,9 +553,15 @@ tbl8_free(struct rte_lpm *lpm, uint32_t
> > > tbl8_group_start)
> > > > __ATOMIC_RELAXED);
> > > > } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
> > > > /* Push into QSBR defer queue. */
> > > > - rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > > > + status = rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > > > (void *)&tbl8_group_start);
> > > > + if (status == 1) {
> > > > + RTE_LOG(ERR, LPM, "Failed to push QSBR FIFO\n");
> > > > + return -rte_errno;
> > > > + }
> > > > }
> > > > +
> > > > + return 0;
> > > > }
> > > >
> > > > static __rte_noinline int32_t
> > > > @@ -1040,7 +1047,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t
> > > ip_masked,
> > > > #define group_idx next_hop
> > > > uint32_t tbl24_index, tbl8_group_index, tbl8_group_start,
> > > tbl8_index,
> > > > tbl8_range, i;
> > > > - int32_t tbl8_recycle_index;
> > > > + int32_t tbl8_recycle_index, status = 0;
> > > >
> > > > /*
> > > > * Calculate the index into tbl24 and range. Note: All depths
> > > > larger @@ -1097,7 +1104,7 @@ delete_depth_big(struct rte_lpm *lpm,
> > > uint32_t ip_masked,
> > > > */
> > > > lpm->tbl24[tbl24_index].valid = 0;
> > > > __atomic_thread_fence(__ATOMIC_RELEASE);
> > > > - tbl8_free(lpm, tbl8_group_start);
> > > > + status = tbl8_free(lpm, tbl8_group_start);
> > > > } else if (tbl8_recycle_index > -1) {
> > > > /* Update tbl24 entry. */
> > > > struct rte_lpm_tbl_entry new_tbl24_entry = { @@ -1113,10
> > > +1120,10
> > > > @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
> > > > __atomic_store(&lpm->tbl24[tbl24_index],
> > > &new_tbl24_entry,
> > > > __ATOMIC_RELAXED);
> > > > __atomic_thread_fence(__ATOMIC_RELEASE);
> > > > - tbl8_free(lpm, tbl8_group_start);
> > > > + status = tbl8_free(lpm, tbl8_group_start);
> > > > }
> > > > #undef group_idx
> > > > - return 0;
> > > > + return status;
> > >
> > > This will change rte_lpm_delete API. As a suggestion, you can leave it as it
> > > was before ("return 0"), and send separate patch (with "return status)"
> > > which will be targeted to 20.11.
> > >
> >
> > Is the change of API because a variable is returned instead of constant?
> > The patch passed ABI check on Travis: http://mails.dpdk.org/archives/test-report/2020-July/144864.html
> > So I didn't know there is API/ABI issue.
>
>
> Because new error status codes are returned. At the moment rte_lpm_delete()
> returns only -EINVAL. After patches it will also return -ENOSPC. The user's
> code may not handle this returned error status.
>
> On the other hand, from documentation about returned value:
> "0 on success, negative value otherwise",
> and given the fact that this behavior is only after calling
> rte_lpm_rcu_qsbr_add(), I think we can accept this patch.
> Bruce, please correct me.
>
That sounds reasonable to me. No change in the committed ABI, since the
specific values are not called out.
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v2] lpm: fix unchecked return value
2020-07-18 9:22 4% ` Ruifeng Wang
@ 2020-07-21 16:23 0% ` Medvedkin, Vladimir
2020-07-21 17:10 3% ` Bruce Richardson
0 siblings, 1 reply; 200+ results
From: Medvedkin, Vladimir @ 2020-07-21 16:23 UTC (permalink / raw)
To: Ruifeng Wang, Bruce Richardson; +Cc: dev, nd, Honnappa Nagarahalli, Phil Yang
Hi Ruifeng,
On 18/07/2020 10:22, Ruifeng Wang wrote:
>
>> -----Original Message-----
>> From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
>> Sent: Saturday, July 18, 2020 1:12 AM
>> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Bruce Richardson
>> <bruce.richardson@intel.com>
>> Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
>> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>
>> Subject: Re: [PATCH v2] lpm: fix unchecked return value
>>
>> Hi Ruifeng,
>>
> Hi Vladimir,
>
>> On 16/07/2020 16:49, Ruifeng Wang wrote:
>>> Coverity complains about unchecked return value of
>> rte_rcu_qsbr_dq_enqueue.
>>> By default, defer queue size is big enough to hold all tbl8 groups.
>>> When enqueue fails, return error to the user to indicate system issue.
>>>
>>> Coverity issue: 360832
>>> Fixes: 8a9f8564e9f9 ("lpm: implement RCU rule reclamation")
>>>
>>> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>> ---
>>> v2:
>>> Converted return value to conform to LPM API convention. (Vladimir)
>>>
>>> lib/librte_lpm/rte_lpm.c | 19 +++++++++++++------
>>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
>>> 2db9e16a2..757436f49 100644
>>> --- a/lib/librte_lpm/rte_lpm.c
>>> +++ b/lib/librte_lpm/rte_lpm.c
>>> @@ -532,11 +532,12 @@ tbl8_alloc(struct rte_lpm *lpm)
>>> return group_idx;
>>> }
>>>
>>> -static void
>>> +static int32_t
>>> tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
>>> {
>>> struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
>>> struct __rte_lpm *internal_lpm;
>>> + int status;
>>>
>>> internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
>>> if (internal_lpm->v == NULL) {
>>> @@ -552,9 +553,15 @@ tbl8_free(struct rte_lpm *lpm, uint32_t
>> tbl8_group_start)
>>> __ATOMIC_RELAXED);
>>> } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
>>> /* Push into QSBR defer queue. */
>>> - rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
>>> + status = rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
>>> (void *)&tbl8_group_start);
>>> + if (status == 1) {
>>> + RTE_LOG(ERR, LPM, "Failed to push QSBR FIFO\n");
>>> + return -rte_errno;
>>> + }
>>> }
>>> +
>>> + return 0;
>>> }
>>>
>>> static __rte_noinline int32_t
>>> @@ -1040,7 +1047,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t
>> ip_masked,
>>> #define group_idx next_hop
>>> uint32_t tbl24_index, tbl8_group_index, tbl8_group_start,
>> tbl8_index,
>>> tbl8_range, i;
>>> - int32_t tbl8_recycle_index;
>>> + int32_t tbl8_recycle_index, status = 0;
>>>
>>> /*
>>> * Calculate the index into tbl24 and range. Note: All depths
>>> larger @@ -1097,7 +1104,7 @@ delete_depth_big(struct rte_lpm *lpm,
>> uint32_t ip_masked,
>>> */
>>> lpm->tbl24[tbl24_index].valid = 0;
>>> __atomic_thread_fence(__ATOMIC_RELEASE);
>>> - tbl8_free(lpm, tbl8_group_start);
>>> + status = tbl8_free(lpm, tbl8_group_start);
>>> } else if (tbl8_recycle_index > -1) {
>>> /* Update tbl24 entry. */
>>> struct rte_lpm_tbl_entry new_tbl24_entry = { @@ -1113,10
>> +1120,10
>>> @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
>>> __atomic_store(&lpm->tbl24[tbl24_index],
>> &new_tbl24_entry,
>>> __ATOMIC_RELAXED);
>>> __atomic_thread_fence(__ATOMIC_RELEASE);
>>> - tbl8_free(lpm, tbl8_group_start);
>>> + status = tbl8_free(lpm, tbl8_group_start);
>>> }
>>> #undef group_idx
>>> - return 0;
>>> + return status;
>>
>> This will change rte_lpm_delete API. As a suggestion, you can leave it as it
>> was before ("return 0"), and send separate patch (with "return status)"
>> which will be targeted to 20.11.
>>
>
> Is the change of API because a variable is returned instead of constant?
> The patch passed ABI check on Travis: http://mails.dpdk.org/archives/test-report/2020-July/144864.html
> So I didn't know there is API/ABI issue.
Because new error status codes are returned. At the moment
rte_lpm_delete() returns only -EINVAL. After patches it will also
return -ENOSPC. The user's code may not handle this returned error status.
On the other hand, from documentation about returned value:
"0 on success, negative value otherwise",
and given the fact that this behavior is only after calling
rte_lpm_rcu_qsbr_add(), I think we can accept this patch.
Bruce, please correct me.
>
> Thanks.
> /Ruifeng
>>> }
>>>
>>> /*
>>>
>>
>> --
>> Regards,
>> Vladimir
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
--
Regards,
Vladimir
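The error-propagation change under review can be sketched as follows (a minimal stand-in, not LPM code; dq_enqueue_stub() and tbl8_free_sketch() are hypothetical names, with ENOSPC standing in for rte_errno). The point is that tbl8_free() no longer swallows a full defer queue but returns a negative errno the caller of rte_lpm_delete() must handle:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for rte_rcu_qsbr_dq_enqueue(): returns 0 on
 * success, 1 when the defer queue is full (real code sets rte_errno). */
static int
dq_enqueue_stub(int queue_full)
{
	return queue_full ? 1 : 0;
}

/* Mirrors the patched tbl8_free(): propagate a negative errno instead
 * of silently dropping the enqueue failure. */
static int
tbl8_free_sketch(int queue_full)
{
	if (dq_enqueue_stub(queue_full) == 1) {
		fprintf(stderr, "Failed to push QSBR FIFO\n");
		return -ENOSPC;	/* -rte_errno in the real code */
	}
	return 0;
}
```

Callers should therefore check for any negative return, not compare against -EINVAL specifically.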
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
2020-07-13 12:31 5% ` [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs Bruce Richardson
2020-07-13 12:48 5% ` Hemant Agrawal
@ 2020-07-20 11:35 0% ` Ananyev, Konstantin
2020-07-23 1:55 5% ` Xu, Rosen
2 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2020-07-20 11:35 UTC (permalink / raw)
To: Richardson, Bruce, dev; +Cc: Richardson, Bruce, Nipun Gupta, Hemant Agrawal
> Add to the documentation for 20.08 a notice about the changes of rawdev
> APIs proposed by patchset [1].
>
> [1] http://inbox.dpdk.org/dev/20200709152047.167730-1-bruce.richardson@intel.com/
>
> Cc: Nipun Gupta <nipun.gupta@nxp.com>
> Cc: Hemant Agrawal <hemant.agrawal@nxp.com>
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> doc/guides/rel_notes/deprecation.rst | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index ead7cbe43..21b00103e 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -117,6 +117,13 @@ Deprecation Notices
> break the ABI checks, that is why change is planned for 20.11.
> The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
>
> +* rawdev: The rawdev APIs which take a device-specific structure as
> + parameter directly, or indirectly via a "private" pointer inside another
> + structure, will be modified to take an additional parameter of the
> + structure size. The affected APIs will include ``rte_rawdev_info_get``,
> + ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
> + ``rte_rawdev_queue_setup``.
> +
> * traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
> ABI stable in the v19.11 release. The TM maintainer and other contributors have
> agreed to keep the TM APIs as experimental in expectation of additional spec
> --
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> 2.25.1
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] lpm: fix unchecked return value
@ 2020-07-18 9:22 4% ` Ruifeng Wang
2020-07-21 16:23 0% ` Medvedkin, Vladimir
0 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-18 9:22 UTC (permalink / raw)
To: Medvedkin, Vladimir, Bruce Richardson
Cc: dev, nd, Honnappa Nagarahalli, Phil Yang, nd
> -----Original Message-----
> From: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> Sent: Saturday, July 18, 2020 1:12 AM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Bruce Richardson
> <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>
> Subject: Re: [PATCH v2] lpm: fix unchecked return value
>
> Hi Ruifeng,
>
Hi Vladimir,
> On 16/07/2020 16:49, Ruifeng Wang wrote:
> > Coverity complains about unchecked return value of
> rte_rcu_qsbr_dq_enqueue.
> > By default, defer queue size is big enough to hold all tbl8 groups.
> > When enqueue fails, return error to the user to indicate system issue.
> >
> > Coverity issue: 360832
> > Fixes: 8a9f8564e9f9 ("lpm: implement RCU rule reclamation")
> >
> > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > v2:
> > Converted return value to conform to LPM API convention. (Vladimir)
> >
> > lib/librte_lpm/rte_lpm.c | 19 +++++++++++++------
> > 1 file changed, 13 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
> > 2db9e16a2..757436f49 100644
> > --- a/lib/librte_lpm/rte_lpm.c
> > +++ b/lib/librte_lpm/rte_lpm.c
> > @@ -532,11 +532,12 @@ tbl8_alloc(struct rte_lpm *lpm)
> > return group_idx;
> > }
> >
> > -static void
> > +static int32_t
> > tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
> > {
> > struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
> > struct __rte_lpm *internal_lpm;
> > + int status;
> >
> > internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
> > if (internal_lpm->v == NULL) {
> > @@ -552,9 +553,15 @@ tbl8_free(struct rte_lpm *lpm, uint32_t
> tbl8_group_start)
> > __ATOMIC_RELAXED);
> > } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
> > /* Push into QSBR defer queue. */
> > - rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > + status = rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
> > (void *)&tbl8_group_start);
> > + if (status == 1) {
> > + RTE_LOG(ERR, LPM, "Failed to push QSBR FIFO\n");
> > + return -rte_errno;
> > + }
> > }
> > +
> > + return 0;
> > }
> >
> > static __rte_noinline int32_t
> > @@ -1040,7 +1047,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t
> ip_masked,
> > #define group_idx next_hop
> > uint32_t tbl24_index, tbl8_group_index, tbl8_group_start,
> tbl8_index,
> > tbl8_range, i;
> > - int32_t tbl8_recycle_index;
> > + int32_t tbl8_recycle_index, status = 0;
> >
> > /*
> > * Calculate the index into tbl24 and range. Note: All depths
> > larger @@ -1097,7 +1104,7 @@ delete_depth_big(struct rte_lpm *lpm,
> uint32_t ip_masked,
> > */
> > lpm->tbl24[tbl24_index].valid = 0;
> > __atomic_thread_fence(__ATOMIC_RELEASE);
> > - tbl8_free(lpm, tbl8_group_start);
> > + status = tbl8_free(lpm, tbl8_group_start);
> > } else if (tbl8_recycle_index > -1) {
> > /* Update tbl24 entry. */
> > struct rte_lpm_tbl_entry new_tbl24_entry = { @@ -1113,10
> +1120,10
> > @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
> > __atomic_store(&lpm->tbl24[tbl24_index],
> &new_tbl24_entry,
> > __ATOMIC_RELAXED);
> > __atomic_thread_fence(__ATOMIC_RELEASE);
> > - tbl8_free(lpm, tbl8_group_start);
> > + status = tbl8_free(lpm, tbl8_group_start);
> > }
> > #undef group_idx
> > - return 0;
> > + return status;
>
> This will change rte_lpm_delete API. As a suggestion, you can leave it as it
> was before ("return 0"), and send separate patch (with "return status)"
> which will be targeted to 20.11.
>
Is the change of API because a variable is returned instead of a constant?
The patch passed ABI check on Travis: http://mails.dpdk.org/archives/test-report/2020-July/144864.html
So I didn't know there is API/ABI issue.
Thanks.
/Ruifeng
> > }
> >
> > /*
> >
>
> --
> Regards,
> Vladimir
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH v5 1/2] mbuf: use C11 atomic builtins for refcnt operations
2020-07-09 15:58 4% ` [dpdk-dev] [PATCH v4 1/2] " Phil Yang
2020-07-15 12:29 0% ` David Marchand
@ 2020-07-17 4:36 4% ` Phil Yang
1 sibling, 0 replies; 200+ results
From: Phil Yang @ 2020-07-17 4:36 UTC (permalink / raw)
To: david.marchand, dev
Cc: olivier.matz, stephen, drc, Honnappa.Nagarahalli, Ruifeng.Wang,
nd, Ray Kinsella, Neil Horman
Use C11 atomic builtins with explicit ordering instead of rte_atomic
ops which enforce unnecessary barriers on aarch64.
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Suggested-by: Dodji Seketeli <dodji@redhat.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
v5:
1. Change built-ins to builtins.
2. Ignore updates of rte_mbuf_ext_shared_info refcnt_atomic in ABI
checker.
v4:
1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
to avoid ABI breakage. (Olivier)
2. Add notice of refcnt_atomic deprecation. (Honnappa)
v3:
1. Fix ABI breakage.
2. Simplify data type cast.
v2:
Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
to refcnt_atomic.
devtools/libabigail.abignore | 4 ++++
lib/librte_mbuf/rte_mbuf.c | 1 -
lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
lib/librte_mbuf/rte_mbuf_core.h | 6 +++++-
4 files changed, 19 insertions(+), 11 deletions(-)
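The refcnt pattern the diff below converts can be sketched in isolation (a minimal illustration using GCC/Clang `__atomic` builtins; struct mbuf_sketch and the helper names are invented, the orderings match the patch). Plain reads and writes of the counter use relaxed ordering, while the read-modify-write uses acquire-release so the thread dropping the last reference synchronizes with the one that frees the buffer:

```c
#include <stdint.h>

struct mbuf_sketch {
	uint16_t refcnt;
};

/* Relaxed load/store: the counter value itself needs no ordering. */
static inline uint16_t
refcnt_read(const struct mbuf_sketch *m)
{
	return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
}

static inline void
refcnt_set(struct mbuf_sketch *m, uint16_t v)
{
	__atomic_store_n(&m->refcnt, v, __ATOMIC_RELAXED);
}

/* Acquire-release RMW: ordering is required when a reference is
 * dropped, so the freeing thread sees all prior writes. */
static inline uint16_t
refcnt_update(struct mbuf_sketch *m, int16_t v)
{
	return __atomic_add_fetch(&m->refcnt, (uint16_t)v, __ATOMIC_ACQ_REL);
}
```

On aarch64 this avoids the full barriers the old rte_atomic16 ops implied.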
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index daa4631..9fea822 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -52,6 +52,10 @@
[suppress_type]
type_kind = struct
name = rte_epoll_event
+; Ignore updates of rte_mbuf_ext_shared_info refcnt_atomic
+[suppress_type]
+ name = rte_mbuf_ext_shared_info
+ has_data_member_inserted_between = {offset_of(refcnt_atomic), offset_of(refcnt_atomic)}
;;;;;;;;;;;;;;;;;;;;;;
; Temporary exceptions till DPDK 20.11
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index ae91ae2..8a456e5 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f8e492e..7259575 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -37,7 +37,6 @@
#include <rte_config.h>
#include <rte_mempool.h>
#include <rte_memory.h>
-#include <rte_atomic.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
#include <rte_byteorder.h>
@@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
static inline uint16_t
rte_mbuf_refcnt_read(const struct rte_mbuf *m)
{
- return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
+ return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
static inline void
rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
{
- rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
}
/* internal */
static inline uint16_t
__rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
{
- return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
+ return __atomic_add_fetch(&m->refcnt, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/**
@@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
static inline uint16_t
rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
{
- return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
+ return __atomic_load_n(&shinfo->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -481,7 +481,7 @@ static inline void
rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
uint16_t new_value)
{
- rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&shinfo->refcnt, new_value, __ATOMIC_RELAXED);
}
/**
@@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
return (uint16_t)value;
}
- return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
+ return __atomic_add_fetch(&shinfo->refcnt, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/** Mbuf prefetch */
@@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
* Direct usage of add primitive to avoid
* duplication of comparing with one.
*/
- if (likely(rte_atomic16_add_return
- (&shinfo->refcnt_atomic, -1)))
+ if (likely(__atomic_add_fetch(&shinfo->refcnt, (uint16_t)-1,
+ __ATOMIC_ACQ_REL)))
return 1;
/* Reinitialize counter before mbuf freeing. */
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 16600f1..8cd7137 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -679,7 +679,11 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
struct rte_mbuf_ext_shared_info {
rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
void *fcb_opaque; /**< Free callback argument */
- rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ RTE_STD_C11
+ union {
+ rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ uint16_t refcnt;
+ };
};
/**< Maximum number of nb_segs allowed. */
--
2.7.4
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-16 4:16 0% ` Phil Yang
@ 2020-07-16 11:30 4% ` David Marchand
0 siblings, 0 replies; 200+ results
From: David Marchand @ 2020-07-16 11:30 UTC (permalink / raw)
To: Phil Yang, Olivier Matz, Dodji Seketeli, Ray Kinsella
Cc: dev, Stephen Hemminger, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang, nd, Aaron Conole
On Thu, Jul 16, 2020 at 6:16 AM Phil Yang <Phil.Yang@arm.com> wrote:
>
> David Marchand <david.marchand@redhat.com> writes:
>
> > Subject: Re: [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt
> > operations
> >
> > On Thu, Jul 9, 2020 at 5:59 PM Phil Yang <phil.yang@arm.com> wrote:
> > >
> > > Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> > > ops which enforce unnecessary barriers on aarch64.
> > >
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > > v4:
> > > 1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
> > > to avoid ABI breakage. (Olivier)
> > > 2. Add notice of refcnt_atomic deprecation. (Honnappa)
> >
> > v4 does not pass the checks (in both my env, and Travis).
> > https://travis-ci.com/github/ovsrobot/dpdk/jobs/359393389#L2405
>
> I think we need an exception in 'libabigail.abignore' for this change.
> Is that OK with you?
Testing the series with libabigail 1.7.0:
Functions changes summary: 0 Removed, 1 Changed (6 filtered out), 0
Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
1 function with some indirect sub-type change:
[C]'function unsigned int rte_reorder_drain(rte_reorder_buffer*,
rte_mbuf**, unsigned int)' at rte_reorder.c:367:1 has some indirect
sub-type changes:
parameter 2 of type 'rte_mbuf**' has sub-type changes:
in pointed to type 'rte_mbuf*':
in pointed to type 'struct rte_mbuf' at rte_mbuf_core.h:469:1:
type size hasn't changed
1 data member changes (1 filtered):
type of 'rte_mbuf_ext_shared_info* rte_mbuf::shinfo' changed:
in pointed to type 'struct rte_mbuf_ext_shared_info' at
rte_mbuf_core.h:679:1:
type size hasn't changed
1 data member change:
data member rte_atomic16_t
rte_mbuf_ext_shared_info::refcnt_atomic at offset 128 (in bits) became
anonymous data member 'union {rte_atomic16_t refcnt_atomic; uint16_t
refcnt;}'
Error: ABI issue reported for 'abidiff --suppr
/home/dmarchan/dpdk/devtools/../devtools/libabigail.abignore
--no-added-syms --headers-dir1
/home/dmarchan/abi/v20.05/build-gcc-static/usr/local/include
--headers-dir2 /home/dmarchan/builds/build-gcc-static/install/usr/local/include
/home/dmarchan/abi/v20.05/build-gcc-static/dump/librte_reorder.dump
/home/dmarchan/builds/build-gcc-static/install/dump/librte_reorder.dump'
ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged
this as a potential issue).
We will have no other update on mbuf for 20.08, so the following rule
can do the job for 20.08 and we will remove it in 20.11.
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index daa4631bf..b35f91257 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -52,6 +52,10 @@
[suppress_type]
type_kind = struct
name = rte_epoll_event
+; Ignore updates of rte_mbuf_ext_shared_info
+[suppress_type]
+ type_kind = struct
+ name = rte_mbuf_ext_shared_info
;;;;;;;;;;;;;;;;;;;;;;
; Temporary exceptions till DPDK 20.11
Olivier, Dodji, Ray?
--
David Marchand
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [RFC PATCH 0/2] Enable dynamic configuration of subport bandwidth profile
2020-07-15 18:27 3% [dpdk-dev] [RFC PATCH 0/2] Enable dynamic configuration of subport bandwidth profile Savinay Dharmappa
@ 2020-07-16 8:14 0% ` Singh, Jasvinder
0 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2020-07-16 8:14 UTC (permalink / raw)
To: Dharmappa, Savinay, dev, Dumitrescu, Cristian
> -----Original Message-----
> From: Dharmappa, Savinay <savinay.dharmappa@intel.com>
> Sent: Wednesday, July 15, 2020 7:28 PM
> To: Dharmappa, Savinay <savinay.dharmappa@intel.com>; Singh, Jasvinder
> <jasvinder.singh@intel.com>; dev@dpdk.org
> Subject: [RFC PATCH 0/2] Enable dynamic configuration of subport
> bandwidth profile
>
> The DPDK sched library allows runtime configuration of the pipe profiles of
> the pipes of a subport once the scheduler hierarchy is constructed. However,
> to change the subport level bandwidth, the existing hierarchy needs to be
> dismantled and the whole process of building the hierarchy under the subport
> nodes needs to be repeated, which might result in router downtime. Furthermore,
> due to the lack of dynamic configuration of the subport bandwidth profile
> (shaper and traffic class rates), the user application is unable
> to dynamically re-distribute the excess bandwidth of one subport among
> other subports in the scheduler hierarchy. Therefore, it is also not possible
> to adjust the subport bandwidth profile in sync with dynamic changes in the
> pipe profiles of subscribers who want to consume higher bandwidth
> opportunistically.
>
> This RFC proposes dynamic configuration of the subport bandwidth profile to
> overcome the runtime situation when a group of subscribers are not using the
> allotted bandwidth and dynamic bandwidth re-distribution is needed,
> without making any structural changes in the hierarchy.
>
> The implementation work includes refactoring the existing data structures
> defined at the port and subport levels, and new APIs for adding subport level
> bandwidth profiles that can be used at runtime, which causes an API/ABI change.
> Therefore, a deprecation notice will be sent out soon.
>
> Savinay Dharmappa (2):
> sched: add dynamic config of subport bandwidth profile
> example/qos_sched: subport bandwidth dynamic conf
>
> examples/qos_sched/cfg_file.c | 158 ++++++-----
> examples/qos_sched/cfg_file.h | 4 +
> examples/qos_sched/init.c | 24 +-
> examples/qos_sched/main.h | 1 +
> examples/qos_sched/profile.cfg | 3 +
> lib/librte_sched/rte_sched.c | 486 ++++++++++++++++++++++++---------
> lib/librte_sched/rte_sched.h | 82 +++++-
> lib/librte_sched/rte_sched_version.map | 2 +
> 8 files changed, 544 insertions(+), 216 deletions(-)
>
> --
> 2.7.4
+ Cristian
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] devtools: give some hints for ABI errors
2020-07-15 12:48 4% ` Aaron Conole
@ 2020-07-16 7:29 4% ` David Marchand
0 siblings, 0 replies; 200+ results
From: David Marchand @ 2020-07-16 7:29 UTC (permalink / raw)
To: David Marchand
Cc: dev, Thomas Monjalon, Ray Kinsella, Neil Horman, Dodji Seketeli,
Aaron Conole
On Wed, Jul 15, 2020 at 2:49 PM Aaron Conole <aconole@redhat.com> wrote:
> David Marchand <david.marchand@redhat.com> writes:
>
> > abidiff can provide some more information about the ABI difference it
> > detected.
> > In all cases, a discussion on the mailing must happen but we can give
> > some hints to know if this is a problem with the script calling abidiff,
> > a potential ABI breakage or an unambiguous ABI breakage.
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > Acked-by: Ray Kinsella <mdr@ashroe.eu>
> > Acked-by: Neil Horman <nhorman@tuxdriver.com>
> Acked-by: Aaron Conole <aconole@redhat.com>
Applied.
--
David Marchand
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-15 12:29 0% ` David Marchand
2020-07-15 12:49 0% ` Aaron Conole
2020-07-15 16:29 0% ` Phil Yang
@ 2020-07-16 4:16 0% ` Phil Yang
2020-07-16 11:30 4% ` David Marchand
2 siblings, 1 reply; 200+ results
From: Phil Yang @ 2020-07-16 4:16 UTC (permalink / raw)
To: David Marchand, Olivier Matz
Cc: dev, Stephen Hemminger, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang, nd, Dodji Seketeli, Aaron Conole, nd
David Marchand <david.marchand@redhat.com> writes:
> Subject: Re: [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt
> operations
>
> On Thu, Jul 9, 2020 at 5:59 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> > ops which enforce unnecessary barriers on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > v4:
> > 1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
> > to avoid ABI breakage. (Olivier)
> > 2. Add notice of refcnt_atomic deprecation. (Honnappa)
>
> v4 does not pass the checks (in both my env, and Travis).
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/359393389#L2405
I think we need an exception in 'libabigail.abignore' for this change.
Is that OK with you?
Thanks,
Phil
>
> It seems the robot had a hiccup as I can't see a report in the test-report ml.
>
>
> --
> David Marchand
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v5 9/9] doc: replace references to blacklist/whitelist
@ 2020-07-15 23:02 1% ` Stephen Hemminger
0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-15 23:02 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Luca Boccassi
The terms blacklist and whitelist are no longer used.
Most of this was automatic replacement, but in a couple of
places the language was awkward before and I have tried to improve
the readability.
The blacklist/whitelist changes to the API will not be a breaking
change for applications in this release, but it is worth adding a note
to encourage migration.
Update examples to the new config options.
Replace -w with -i and -b with -x to reflect new usage.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
---
doc/guides/cryptodevs/dpaa2_sec.rst | 4 ++--
doc/guides/cryptodevs/dpaa_sec.rst | 4 ++--
doc/guides/cryptodevs/qat.rst | 6 ++---
doc/guides/eventdevs/octeontx2.rst | 20 ++++++++---------
doc/guides/freebsd_gsg/build_sample_apps.rst | 2 +-
doc/guides/linux_gsg/build_sample_apps.rst | 2 +-
doc/guides/linux_gsg/eal_args.include.rst | 14 ++++++------
doc/guides/linux_gsg/linux_drivers.rst | 4 ++--
doc/guides/mempool/octeontx2.rst | 4 ++--
doc/guides/nics/bnxt.rst | 6 ++---
doc/guides/nics/cxgbe.rst | 12 +++++-----
doc/guides/nics/dpaa.rst | 4 ++--
doc/guides/nics/dpaa2.rst | 4 ++--
doc/guides/nics/enic.rst | 12 +++++-----
doc/guides/nics/fail_safe.rst | 22 +++++++++----------
doc/guides/nics/features.rst | 2 +-
doc/guides/nics/i40e.rst | 12 +++++-----
doc/guides/nics/ice.rst | 18 +++++++--------
doc/guides/nics/mlx4.rst | 16 +++++++-------
doc/guides/nics/mlx5.rst | 12 +++++-----
doc/guides/nics/octeontx2.rst | 22 +++++++++----------
doc/guides/nics/sfc_efx.rst | 2 +-
doc/guides/nics/tap.rst | 10 ++++-----
doc/guides/nics/thunderx.rst | 4 ++--
.../prog_guide/env_abstraction_layer.rst | 7 +++---
doc/guides/prog_guide/multi_proc_support.rst | 4 ++--
doc/guides/rel_notes/known_issues.rst | 4 ++--
doc/guides/rel_notes/release_20_08.rst | 6 +++++
doc/guides/rel_notes/release_2_1.rst | 2 +-
doc/guides/sample_app_ug/bbdev_app.rst | 6 ++---
doc/guides/sample_app_ug/ipsec_secgw.rst | 6 ++---
doc/guides/sample_app_ug/l3_forward.rst | 2 +-
.../sample_app_ug/l3_forward_access_ctrl.rst | 2 +-
.../sample_app_ug/l3_forward_power_man.rst | 2 +-
doc/guides/sample_app_ug/vdpa.rst | 2 +-
doc/guides/tools/cryptoperf.rst | 6 ++---
doc/guides/tools/flow-perf.rst | 2 +-
37 files changed, 137 insertions(+), 132 deletions(-)
diff --git a/doc/guides/cryptodevs/dpaa2_sec.rst b/doc/guides/cryptodevs/dpaa2_sec.rst
index 3053636b8295..363c52f0422f 100644
--- a/doc/guides/cryptodevs/dpaa2_sec.rst
+++ b/doc/guides/cryptodevs/dpaa2_sec.rst
@@ -134,10 +134,10 @@ Supported DPAA2 SoCs
* LS2088A/LS2048A
* LS1088A/LS1048A
-Whitelisting & Blacklisting
+Allowlisting & Blocklisting
---------------------------
-For blacklisting a DPAA2 SEC device, following commands can be used.
+The DPAA2 SEC device can be blocked with the following:
.. code-block:: console
diff --git a/doc/guides/cryptodevs/dpaa_sec.rst b/doc/guides/cryptodevs/dpaa_sec.rst
index db3c8e918945..295164523d22 100644
--- a/doc/guides/cryptodevs/dpaa_sec.rst
+++ b/doc/guides/cryptodevs/dpaa_sec.rst
@@ -82,10 +82,10 @@ Supported DPAA SoCs
* LS1046A/LS1026A
* LS1043A/LS1023A
-Whitelisting & Blacklisting
+Allowlisting & Blocklisting
---------------------------
-For blacklisting a DPAA device, following commands can be used.
+For blocking a DPAA device, following commands can be used.
.. code-block:: console
diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index 7f4036c3210e..38e5b0a96206 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -126,7 +126,7 @@ Limitations
optimisations in the GEN3 device. And if a GCM session is initialised on a
GEN3 device, then attached to an op sent to a GEN1/GEN2 device, it will not be
enqueued to the device and will be marked as failed. The simplest way to
- mitigate this is to use the bdf whitelist to avoid mixing devices of different
+ mitigate this is to use the bdf allowlist to avoid mixing devices of different
generations in the same process if planning to use for GCM.
* The mixed algo feature on GEN2 is not supported by all kernel drivers. Check
the notes under the Available Kernel Drivers table below for specific details.
@@ -261,7 +261,7 @@ adjusted to the number of VFs which the QAT common code will need to handle.
QAT VF may expose two crypto devices, sym and asym, it may happen that the
number of devices will be bigger than MAX_DEVS and the process will show an error
during PMD initialisation. To avoid this problem CONFIG_RTE_CRYPTO_MAX_DEVS may be
- increased or -w, pci-whitelist domain:bus:devid:func option may be used.
+ increased or -w, pci-allowlist domain:bus:devid:func option may be used.
QAT compression PMD needs intermediate buffers to support Deflate compression
@@ -299,7 +299,7 @@ return 0 (thereby avoiding an MMIO) if the device is congested and number of pac
possible to enqueue is smaller.
To use this feature the user must set the parameter on process start as a device additional parameter::
- -w 03:01.1,qat_sym_enq_threshold=32,qat_comp_enq_threshold=16
+ -i 03:01.1,qat_sym_enq_threshold=32,qat_comp_enq_threshold=16
All parameters can be used with the same device regardless of order. Parameters are separated
by comma. When the same parameter is used more than once first occurrence of the parameter
diff --git a/doc/guides/eventdevs/octeontx2.rst b/doc/guides/eventdevs/octeontx2.rst
index 6502f6415fb4..470ea5450432 100644
--- a/doc/guides/eventdevs/octeontx2.rst
+++ b/doc/guides/eventdevs/octeontx2.rst
@@ -66,7 +66,7 @@ Runtime Config Options
upper limit for in-flight events.
For example::
- -w 0002:0e:00.0,xae_cnt=16384
+ -i 0002:0e:00.0,xae_cnt=16384
- ``Force legacy mode``
@@ -74,7 +74,7 @@ Runtime Config Options
single workslot mode in SSO and disable the default dual workslot mode.
For example::
- -w 0002:0e:00.0,single_ws=1
+ -i 0002:0e:00.0,single_ws=1
- ``Event Group QoS support``
@@ -89,7 +89,7 @@ Runtime Config Options
default.
For example::
- -w 0002:0e:00.0,qos=[1-50-50-50]
+ -i 0002:0e:00.0,qos=[1-50-50-50]
- ``Selftest``
@@ -98,7 +98,7 @@ Runtime Config Options
The tests are run once the vdev creation is successfully complete.
For example::
- -w 0002:0e:00.0,selftest=1
+ -i 0002:0e:00.0,selftest=1
- ``TIM disable NPA``
@@ -107,7 +107,7 @@ Runtime Config Options
parameter disables NPA and uses software mempool to manage chunks
For example::
- -w 0002:0e:00.0,tim_disable_npa=1
+ -i 0002:0e:00.0,tim_disable_npa=1
- ``TIM modify chunk slots``
@@ -118,7 +118,7 @@ Runtime Config Options
to SSO. The default value is 255 and the max value is 4095.
For example::
- -w 0002:0e:00.0,tim_chnk_slots=1023
+ -i 0002:0e:00.0,tim_chnk_slots=1023
- ``TIM enable arm/cancel statistics``
@@ -126,7 +126,7 @@ Runtime Config Options
event timer adapter.
For example::
- -w 0002:0e:00.0,tim_stats_ena=1
+ -i 0002:0e:00.0,tim_stats_ena=1
- ``TIM limit max rings reserved``
@@ -136,7 +136,7 @@ Runtime Config Options
rings.
For example::
- -w 0002:0e:00.0,tim_rings_lmt=5
+ -i 0002:0e:00.0,tim_rings_lmt=5
- ``TIM ring control internal parameters``
@@ -146,7 +146,7 @@ Runtime Config Options
default values.
For Example::
- -w 0002:0e:00.0,tim_ring_ctl=[2-1023-1-0]
+ -i 0002:0e:00.0,tim_ring_ctl=[2-1023-1-0]
- ``Lock NPA contexts in NDC``
@@ -156,7 +156,7 @@ Runtime Config Options
For example::
- -w 0002:0e:00.0,npa_lock_mask=0xf
+ -i 0002:0e:00.0,npa_lock_mask=0xf
Debugging Options
~~~~~~~~~~~~~~~~~
diff --git a/doc/guides/freebsd_gsg/build_sample_apps.rst b/doc/guides/freebsd_gsg/build_sample_apps.rst
index 2a68f5fc3820..4fba671e4f5b 100644
--- a/doc/guides/freebsd_gsg/build_sample_apps.rst
+++ b/doc/guides/freebsd_gsg/build_sample_apps.rst
@@ -67,7 +67,7 @@ DPDK application. Some of the EAL options for FreeBSD are as follows:
is a list of cores to use instead of a core mask.
* ``-b <domain:bus:devid.func>``:
- Blacklisting of ports; prevent EAL from using specified PCI device
+ Blocklisting of ports; prevent EAL from using specified PCI device
(multiple ``-b`` options are allowed).
* ``--use-device``:
diff --git a/doc/guides/linux_gsg/build_sample_apps.rst b/doc/guides/linux_gsg/build_sample_apps.rst
index 2f606535c374..ebc6e3e02d74 100644
--- a/doc/guides/linux_gsg/build_sample_apps.rst
+++ b/doc/guides/linux_gsg/build_sample_apps.rst
@@ -102,7 +102,7 @@ The EAL options are as follows:
Number of memory channels per processor socket.
* ``-b <domain:bus:devid.func>``:
- Blacklisting of ports; prevent EAL from using specified PCI device
+ Blocklisting of ports; prevent EAL from using specified PCI device
(multiple ``-b`` options are allowed).
* ``--use-device``:
diff --git a/doc/guides/linux_gsg/eal_args.include.rst b/doc/guides/linux_gsg/eal_args.include.rst
index 0fe44579689b..41f399ccd608 100644
--- a/doc/guides/linux_gsg/eal_args.include.rst
+++ b/doc/guides/linux_gsg/eal_args.include.rst
@@ -44,20 +44,20 @@ Lcore-related options
Device-related options
~~~~~~~~~~~~~~~~~~~~~~
-* ``-b, --pci-blacklist <[domain:]bus:devid.func>``
+* ``-b, --pci-skip-probe <[domain:]bus:devid.func>``
- Blacklist a PCI device to prevent EAL from using it. Multiple -b options are
- allowed.
+ Skip probing a PCI device to prevent EAL from using it.
+ Multiple -b options are allowed.
.. Note::
- PCI blacklist cannot be used with ``-w`` option.
+ PCI skip probe cannot be used with the only list ``-w`` option.
-* ``-w, --pci-whitelist <[domain:]bus:devid.func>``
+* ``-w, --pci-only-list <[domain:]bus:devid.func>``
- Add a PCI device in white list.
+ Add a PCI device in to the list of probed devices.
.. Note::
- PCI whitelist cannot be used with ``-b`` option.
+ PCI only list cannot be used with the skip probe ``-b`` option.
* ``--vdev <device arguments>``
diff --git a/doc/guides/linux_gsg/linux_drivers.rst b/doc/guides/linux_gsg/linux_drivers.rst
index 4eda3d5bf4fe..0c6f9f8572ee 100644
--- a/doc/guides/linux_gsg/linux_drivers.rst
+++ b/doc/guides/linux_gsg/linux_drivers.rst
@@ -104,11 +104,11 @@ parameter ``--vfio-vf-token``.
3. echo 2 > /sys/bus/pci/devices/0000:86:00.0/sriov_numvfs
4. Start the PF:
- ./x86_64-native-linux-gcc/app/testpmd -l 22-25 -n 4 -w 86:00.0 \
+ ./x86_64-native-linux-gcc/app/testpmd -l 22-25 -n 4 -i 86:00.0 \
--vfio-vf-token=14d63f20-8445-11ea-8900-1f9ce7d5650d --file-prefix=pf -- -i
5. Start the VF:
- ./x86_64-native-linux-gcc/app/testpmd -l 26-29 -n 4 -w 86:02.0 \
+ ./x86_64-native-linux-gcc/app/testpmd -l 26-29 -n 4 -i 86:02.0 \
--vfio-vf-token=14d63f20-8445-11ea-8900-1f9ce7d5650d --file-prefix=vf0 -- -i
Also, to use VFIO, both kernel and BIOS must support and be configured to use IO virtualization (such as Intel® VT-d).
diff --git a/doc/guides/mempool/octeontx2.rst b/doc/guides/mempool/octeontx2.rst
index 49b45a04e8ec..507591d809c6 100644
--- a/doc/guides/mempool/octeontx2.rst
+++ b/doc/guides/mempool/octeontx2.rst
@@ -50,7 +50,7 @@ Runtime Config Options
for the application.
For example::
- -w 0002:02:00.0,max_pools=512
+ -i 0002:02:00.0,max_pools=512
With the above configuration, the driver will set up only 512 mempools for
the given application to save HW resources.
@@ -69,7 +69,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,npa_lock_mask=0xf
+ -i 0002:02:00.0,npa_lock_mask=0xf
Debugging Options
~~~~~~~~~~~~~~~~~
diff --git a/doc/guides/nics/bnxt.rst b/doc/guides/nics/bnxt.rst
index 6ff75d0a25e9..716d02beba3c 100644
--- a/doc/guides/nics/bnxt.rst
+++ b/doc/guides/nics/bnxt.rst
@@ -259,7 +259,7 @@ Unicast MAC Filter
^^^^^^^^^^^^^^^^^^
The application adds (or removes) MAC addresses to enable (or disable)
-whitelist filtering to accept packets.
+allowlist filtering to accept packets.
.. code-block:: console
@@ -270,7 +270,7 @@ Multicast MAC Filter
^^^^^^^^^^^^^^^^^^^^
Application adds (or removes) Multicast addresses to enable (or disable)
-whitelist filtering to accept packets.
+allowlist filtering to accept packets.
.. code-block:: console
@@ -278,7 +278,7 @@ whitelist filtering to accept packets.
testpmd> mcast_addr (add|remove) (port_id) (XX:XX:XX:XX:XX:XX)
Application adds (or removes) Multicast addresses to enable (or disable)
-whitelist filtering to accept packets.
+allowlist filtering to accept packets.
Note that the BNXT PMD supports up to 16 MC MAC filters. if the user adds more
than 16 MC MACs, the BNXT PMD puts the port into the Allmulticast mode.
diff --git a/doc/guides/nics/cxgbe.rst b/doc/guides/nics/cxgbe.rst
index 54a4c138998c..870904cfd9b0 100644
--- a/doc/guides/nics/cxgbe.rst
+++ b/doc/guides/nics/cxgbe.rst
@@ -40,8 +40,8 @@ expose a single PCI bus address, thus, librte_pmd_cxgbe registers
itself as a PCI driver that allocates one Ethernet device per detected
port.
-For this reason, one cannot whitelist/blacklist a single port without
-whitelisting/blacklisting the other ports on the same device.
+For this reason, one cannot allowlist/blocklist a single port without
+allowlisting/blocklisting the other ports on the same device.
.. _t5-nics:
@@ -112,7 +112,7 @@ be passed as part of EAL arguments. For example,
.. code-block:: console
- testpmd -w 02:00.4,keep_ovlan=1 -- -i
+ testpmd -i 02:00.4,keep_ovlan=1 -- -i
Common Runtime Options
^^^^^^^^^^^^^^^^^^^^^^
@@ -317,7 +317,7 @@ CXGBE PF Only Runtime Options
.. code-block:: console
- testpmd -w 02:00.4,filtermode=0x88 -- -i
+ testpmd -i 02:00.4,filtermode=0x88 -- -i
- ``filtermask`` (default **0**)
@@ -344,7 +344,7 @@ CXGBE PF Only Runtime Options
.. code-block:: console
- testpmd -w 02:00.4,filtermode=0x88,filtermask=0x80 -- -i
+ testpmd -i 02:00.4,filtermode=0x88,filtermask=0x80 -- -i
.. _driver-compilation:
@@ -776,7 +776,7 @@ devices managed by librte_pmd_cxgbe in FreeBSD operating system.
.. code-block:: console
- ./x86_64-native-freebsd-clang/app/testpmd -l 0-3 -n 4 -w 0000:02:00.4 -- -i
+ ./x86_64-native-freebsd-clang/app/testpmd -l 0-3 -n 4 -i 0000:02:00.4 -- -i
Example output:
diff --git a/doc/guides/nics/dpaa.rst b/doc/guides/nics/dpaa.rst
index 17839a920e60..efcbb7207734 100644
--- a/doc/guides/nics/dpaa.rst
+++ b/doc/guides/nics/dpaa.rst
@@ -162,10 +162,10 @@ Manager.
this pool.
-Whitelisting & Blacklisting
+Allowlisting & Blocklisting
---------------------------
-For blacklisting a DPAA device, following commands can be used.
+For blocking a DPAA device, following commands can be used.
.. code-block:: console
diff --git a/doc/guides/nics/dpaa2.rst b/doc/guides/nics/dpaa2.rst
index fdfa6fdd5aea..91b5c59f8c0f 100644
--- a/doc/guides/nics/dpaa2.rst
+++ b/doc/guides/nics/dpaa2.rst
@@ -527,10 +527,10 @@ which are lower than logging ``level``.
Using ``pmd.net.dpaa2`` as log matching criteria, all PMD logs can be enabled
which are lower than logging ``level``.
-Whitelisting & Blacklisting
+Allowlisting & Blocklisting
---------------------------
-For blacklisting a DPAA2 device, following commands can be used.
+For blocking a DPAA2 device, following commands can be used.
.. code-block:: console
diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index a28a7f4e477a..a67f169a87a8 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -187,14 +187,14 @@ or ``vfio`` in non-IOMMU mode.
In the VM, the kernel enic driver may be automatically bound to the VF during
boot. Unbinding it currently hangs due to a known issue with the driver. To
-work around the issue, blacklist the enic module as follows.
+work around the issue, blocklist the enic module as follows.
Please see :ref:`Limitations <enic_limitations>` for limitations in
the use of SR-IOV.
.. code-block:: console
# cat /etc/modprobe.d/enic.conf
- blacklist enic
+ blocklist enic
# dracut --force
@@ -312,7 +312,7 @@ enables overlay offload, it prints the following message on the console.
By default, PMD enables overlay offload if hardware supports it. To disable
it, set ``devargs`` parameter ``disable-overlay=1``. For example::
- -w 12:00.0,disable-overlay=1
+ -i 12:00.0,disable-overlay=1
By default, the NIC uses 4789 as the VXLAN port. The user may change
it through ``rte_eth_dev_udp_tunnel_port_{add,delete}``. However, as
@@ -378,7 +378,7 @@ vectorized handler, take the following steps.
PMD consider the vectorized handler when selecting the receive handler.
For example::
- -w 12:00.0,enable-avx2-rx=1
+ -i 12:00.0,enable-avx2-rx=1
As the current implementation is intended for field trials, by default, the
vectorized handler is not considered (``enable-avx2-rx=0``).
@@ -427,7 +427,7 @@ DPDK as untagged packets. In this case mbuf->vlan_tci and the PKT_RX_VLAN and
PKT_RX_VLAN_STRIPPED mbuf flags would not be set. This mode is enabled with the
``devargs`` parameter ``ig-vlan-rewrite=untag``. For example::
- -w 12:00.0,ig-vlan-rewrite=untag
+ -i 12:00.0,ig-vlan-rewrite=untag
- **SR-IOV**
@@ -437,7 +437,7 @@ PKT_RX_VLAN_STRIPPED mbuf flags would not be set. This mode is enabled with the
- VF devices are not usable directly from the host. They can only be used
as assigned devices on VM instances.
- Currently, unbind of the ENIC kernel mode driver 'enic.ko' on the VM
- instance may hang. As a workaround, enic.ko should be blacklisted or removed
+ instance may hang. As a workaround, enic.ko should be blocklisted or removed
from the boot process.
- pci_generic cannot be used as the uio module in the VM. igb_uio or
vfio in non-IOMMU mode can be used.
diff --git a/doc/guides/nics/fail_safe.rst b/doc/guides/nics/fail_safe.rst
index b4a92f663b17..46d1224b048f 100644
--- a/doc/guides/nics/fail_safe.rst
+++ b/doc/guides/nics/fail_safe.rst
@@ -68,11 +68,11 @@ Fail-safe command line parameters
.. note::
- In case where the sub-device is also used as a whitelist device, using ``-w``
+ In case where the sub-device is also used as a allowlist device, using ``-w``
on the EAL command line, the fail-safe PMD will use the device with the
options provided to the EAL instead of its own parameters.
- When trying to use a PCI device automatically probed by the blacklist mode,
+ When trying to use a PCI device automatically probed by the blocklist mode,
the name for the fail-safe sub-device must be the full PCI id:
Domain:Bus:Device.Function, *i.e.* ``00:00:00.0`` instead of ``00:00.0``,
as the second form is historically accepted by the DPDK.
@@ -123,28 +123,28 @@ This section shows some example of using **testpmd** with a fail-safe PMD.
#. To build a PMD and configure DPDK, refer to the document
:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`.
-#. Start testpmd. The sub-device ``84:00.0`` should be blacklisted from normal EAL
- operations to avoid probing it twice, as the PCI bus is in blacklist mode.
+#. Start testpmd. The sub-device ``84:00.0`` should be blocklisted from normal EAL
+ operations to avoid probing it twice, as the PCI bus is in blocklist mode.
.. code-block:: console
$RTE_TARGET/build/app/testpmd -c 0xff -n 4 \
--vdev 'net_failsafe0,mac=de:ad:be:ef:01:02,dev(84:00.0),dev(net_ring0)' \
- -b 84:00.0 -b 00:04.0 -- -i
+ -x 84:00.0 -x 00:04.0 -- -i
- If the sub-device ``84:00.0`` is not blacklisted, it will be probed by the
+ If the sub-device ``84:00.0`` is not blocklisted, it will be probed by the
EAL first. When the fail-safe then tries to initialize it the probe operation
fails.
- Note that PCI blacklist mode is the default PCI operating mode.
+ Note that PCI blocklist mode is the default PCI operating mode.
-#. Alternatively, it can be used alongside any other device in whitelist mode.
+#. Alternatively, it can be used alongside any other device in allowlist mode.
.. code-block:: console
$RTE_TARGET/build/app/testpmd -c 0xff -n 4 \
--vdev 'net_failsafe0,mac=de:ad:be:ef:01:02,dev(84:00.0),dev(net_ring0)' \
- -w 81:00.0 -- -i
+ -i 81:00.0 -- -i
#. Start testpmd using a flexible device definition
@@ -155,9 +155,9 @@ This section shows some example of using **testpmd** with a fail-safe PMD.
#. Start testpmd, automatically probing the device 84:00.0 and using it with
the fail-safe.
-
+
.. code-block:: console
-
+
$RTE_TARGET/build/app/testpmd -c 0xff -n 4 \
--vdev 'net_failsafe0,dev(0000:84:00.0),dev(net_ring0)' -- -i
diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index edd21c4d8e9d..6aecead6e019 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -247,7 +247,7 @@ Supports enabling/disabling receiving multicast frames.
Unicast MAC filter
------------------
-Supports adding MAC addresses to enable whitelist filtering to accept packets.
+Supports adding MAC addresses to enable allowlist filtering to accept packets.
* **[implements] eth_dev_ops**: ``mac_addr_set``, ``mac_addr_add``, ``mac_addr_remove``.
* **[implements] rte_eth_dev_data**: ``mac_addrs``.
diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index cf1ae2d0b043..4f4072b3bf6f 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -203,7 +203,7 @@ Runtime Config Options
Adapter with both Linux kernel and DPDK PMD. To fix this issue, ``devargs``
parameter ``support-multi-driver`` is introduced, for example::
- -w 84:00.0,support-multi-driver=1
+ -i 84:00.0,support-multi-driver=1
With the above configuration, DPDK PMD will not change global registers, and
will switch PF interrupt from IntN to Int0 to avoid interrupt conflict between
@@ -230,7 +230,7 @@ Runtime Config Options
since it can get better perf in some real work loading cases. So ``devargs`` param
``use-latest-supported-vec`` is introduced, for example::
- -w 84:00.0,use-latest-supported-vec=1
+ -i 84:00.0,use-latest-supported-vec=1
- ``Enable validation for VF message`` (default ``not enabled``)
@@ -240,7 +240,7 @@ Runtime Config Options
Format -- "maximal-message@period-seconds:ignore-seconds"
For example::
- -w 84:00.0,vf_msg_cfg=80@120:180
+ -i 84:00.0,vf_msg_cfg=80@120:180
Vector RX Pre-conditions
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -475,7 +475,7 @@ no physical uplink on the associated NIC port.
To enable this feature, the user should pass a ``devargs`` parameter to the
EAL, for example::
- -w 84:00.0,enable_floating_veb=1
+ -i 84:00.0,enable_floating_veb=1
In this configuration the PMD will use the floating VEB feature for all the
VFs created by this PF device.
@@ -483,7 +483,7 @@ VFs created by this PF device.
Alternatively, the user can specify which VFs need to connect to this floating
VEB using the ``floating_veb_list`` argument::
- -w 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4
+ -i 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4
In this example ``VF1``, ``VF3`` and ``VF4`` connect to the floating VEB,
while other VFs connect to the normal VEB.
@@ -809,7 +809,7 @@ See :numref:`figure_intel_perf_test_setup` for the performance test setup.
7. The command line of running l3fwd would be something like the following::
- ./l3fwd -l 18-21 -n 4 -w 82:00.0 -w 85:00.0 \
+ ./l3fwd -l 18-21 -n 4 -i 82:00.0 -i 85:00.0 \
-- -p 0x3 --config '(0,0,18),(0,1,19),(1,0,20),(1,1,21)'
This means that the application uses core 18 for port 0, queue pair 0 forwarding, core 19 for port 0, queue pair 1 forwarding,
diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index 9a9f4a6bb093..93eb2f0c2264 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
@@ -50,7 +50,7 @@ Runtime Config Options
But if user intend to use the device without OS package, user can take ``devargs``
parameter ``safe-mode-support``, for example::
- -w 80:00.0,safe-mode-support=1
+ -i 80:00.0,safe-mode-support=1
Then the driver will be initialized successfully and the device will enter Safe Mode.
NOTE: In Safe mode, only very limited features are available, features like RSS,
@@ -61,7 +61,7 @@ Runtime Config Options
In pipeline mode, a flow can be set at one specific stage by setting parameter
``priority``. Currently, we support two stages: priority = 0 or !0. Flows with
priority 0 located at the first pipeline stage which typically be used as a firewall
- to drop the packet on a blacklist(we called it permission stage). At this stage,
+ to drop the packet on a blocklist(we called it permission stage). At this stage,
flow rules are created for the device's exact match engine: switch. Flows with priority
!0 located at the second stage, typically packets are classified here and be steered to
specific queue or queue group (we called it distribution stage), At this stage, flow
@@ -73,7 +73,7 @@ Runtime Config Options
use pipeline mode by setting ``devargs`` parameter ``pipeline-mode-support``,
for example::
- -w 80:00.0,pipeline-mode-support=1
+ -i 80:00.0,pipeline-mode-support=1
- ``Flow Mark Support`` (default ``0``)
@@ -85,7 +85,7 @@ Runtime Config Options
2) a new offload like RTE_DEV_RX_OFFLOAD_FLOW_MARK be introduced as a standard way to hint.
Example::
- -w 80:00.0,flow-mark-support=1
+ -i 80:00.0,flow-mark-support=1
- ``Protocol extraction for per queue``
@@ -94,8 +94,8 @@ Runtime Config Options
The argument format is::
- -w 18:00.0,proto_xtr=<queues:protocol>[<queues:protocol>...]
- -w 18:00.0,proto_xtr=<protocol>
+ -i 18:00.0,proto_xtr=<queues:protocol>[<queues:protocol>...]
+ -i 18:00.0,proto_xtr=<protocol>
Queues are grouped by ``(`` and ``)`` within the group. The ``-`` character
is used as a range separator and ``,`` is used as a single number separator.
@@ -106,14 +106,14 @@ Runtime Config Options
.. code-block:: console
- testpmd -w 18:00.0,proto_xtr='[(1,2-3,8-9):tcp,10-13:vlan]'
+ testpmd -i 18:00.0,proto_xtr='[(1,2-3,8-9):tcp,10-13:vlan]'
This setting means queues 1, 2-3, 8-9 are TCP extraction, queues 10-13 are
VLAN extraction, other queues run with no protocol extraction.
.. code-block:: console
- testpmd -w 18:00.0,proto_xtr=vlan,proto_xtr='[(1,2-3,8-9):tcp,10-23:ipv6]'
+ testpmd -i 18:00.0,proto_xtr=vlan,proto_xtr='[(1,2-3,8-9):tcp,10-23:ipv6]'
This setting means queues 1, 2-3, 8-9 are TCP extraction, queues 10-23 are
IPv6 extraction, other queues use the default VLAN extraction.
@@ -253,7 +253,7 @@ responses for the same from PF.
#. Bind the VF0, and run testpmd with 'cap=dcf' devarg::
- testpmd -l 22-25 -n 4 -w 18:01.0,cap=dcf -- -i
+ testpmd -l 22-25 -n 4 -i 18:01.0,cap=dcf -- -i
#. Monitor the VF2 interface network traffic::
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 1f1e2f6c7767..f445dd51a65f 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -29,8 +29,8 @@ Most Mellanox ConnectX-3 devices provide two ports but expose a single PCI
bus address, thus unlike most drivers, librte_pmd_mlx4 registers itself as a
PCI driver that allocates one Ethernet device per detected port.
-For this reason, one cannot white/blacklist a single port without also
-white/blacklisting the others on the same device.
+For this reason, one cannot white/blocklist a single port without also
+white/blocklisting the others on the same device.
Besides its dependency on libibverbs (that implies libmlx4 and associated
kernel support), librte_pmd_mlx4 relies heavily on system calls for control
@@ -422,7 +422,7 @@ devices managed by librte_pmd_mlx4.
eth4
eth5
-#. Optionally, retrieve their PCI bus addresses for whitelisting::
+#. Optionally, retrieve their PCI bus addresses for allowlisting::
{
for intf in eth2 eth3 eth4 eth5;
@@ -434,10 +434,10 @@ devices managed by librte_pmd_mlx4.
Example output::
- -w 0000:83:00.0
- -w 0000:83:00.0
- -w 0000:84:00.0
- -w 0000:84:00.0
+ -i 0000:83:00.0
+ -i 0000:83:00.0
+ -i 0000:84:00.0
+ -i 0000:84:00.0
.. note::
@@ -450,7 +450,7 @@ devices managed by librte_pmd_mlx4.
#. Start testpmd with basic parameters::
- testpmd -l 8-15 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i
+ testpmd -l 8-15 -n 4 -i 0000:83:00.0 -i 0000:84:00.0 -- --rxq=2 --txq=2 -i
Example output::
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 4b6d8fb4d55b..bafabba518b5 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1454,7 +1454,7 @@ ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices managed by librte_pmd_mlx5.
eth32
eth33
-#. Optionally, retrieve their PCI bus addresses for whitelisting::
+#. Optionally, retrieve their PCI bus addresses for allowlisting::
{
for intf in eth2 eth3 eth4 eth5;
@@ -1466,10 +1466,10 @@ ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices managed by librte_pmd_mlx5.
Example output::
- -w 0000:05:00.1
- -w 0000:06:00.0
- -w 0000:06:00.1
- -w 0000:05:00.0
+ -i 0000:05:00.1
+ -i 0000:06:00.0
+ -i 0000:06:00.1
+ -i 0000:05:00.0
#. Request huge pages::
@@ -1477,7 +1477,7 @@ ConnectX-4/ConnectX-5/ConnectX-6/BlueField devices managed by librte_pmd_mlx5.
#. Start testpmd with basic parameters::
- testpmd -l 8-15 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
+ testpmd -l 8-15 -n 4 -i 05:00.0 -i 05:00.1 -i 06:00.0 -i 06:00.1 -- --rxq=2 --txq=2 -i
Example output::
diff --git a/doc/guides/nics/octeontx2.rst b/doc/guides/nics/octeontx2.rst
index bb591a8b7e65..3d382446d1d1 100644
--- a/doc/guides/nics/octeontx2.rst
+++ b/doc/guides/nics/octeontx2.rst
@@ -74,7 +74,7 @@ use arm64-octeontx2-linux-gcc as target.
.. code-block:: console
- ./build/app/testpmd -c 0x300 -w 0002:02:00.0 -- --portmask=0x1 --nb-cores=1 --port-topology=loop --rxq=1 --txq=1
+ ./build/app/testpmd -c 0x300 -i 0002:02:00.0 -- --portmask=0x1 --nb-cores=1 --port-topology=loop --rxq=1 --txq=1
EAL: Detected 24 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
@@ -127,7 +127,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,reta_size=256
+ -i 0002:02:00.0,reta_size=256
With the above configuration, reta table of size 256 is populated.
@@ -138,7 +138,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,flow_max_priority=10
+ -i 0002:02:00.0,flow_max_priority=10
With the above configuration, priority level was set to 10 (0-9). Max
priority level supported is 32.
@@ -150,7 +150,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,flow_prealloc_size=4
+ -i 0002:02:00.0,flow_prealloc_size=4
With the above configuration, pre alloc size was set to 4. Max pre alloc
size supported is 32.
@@ -162,7 +162,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,max_sqb_count=64
+ -i 0002:02:00.0,max_sqb_count=64
With the above configuration, each send queue's descriptor buffer count is
limited to a maximum of 64 buffers.
@@ -174,7 +174,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,switch_header="higig2"
+ -i 0002:02:00.0,switch_header="higig2"
With the above configuration, higig2 will be enabled on that port and the
traffic on this port should be higig2 traffic only. Supported switch header
@@ -196,7 +196,7 @@ Runtime Config Options
For example to select the legacy mode(RSS tag adder as XOR)::
- -w 0002:02:00.0,tag_as_xor=1
+ -i 0002:02:00.0,tag_as_xor=1
- ``Max SPI for inbound inline IPsec`` (default ``1``)
@@ -205,7 +205,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,ipsec_in_max_spi=128
+ -i 0002:02:00.0,ipsec_in_max_spi=128
With the above configuration, application can enable inline IPsec processing
on 128 SAs (SPI 0-127).
@@ -216,7 +216,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,lock_rx_ctx=1
+ -i 0002:02:00.0,lock_rx_ctx=1
- ``Lock Tx contexts in NDC cache``
@@ -224,7 +224,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,lock_tx_ctx=1
+ -i 0002:02:00.0,lock_tx_ctx=1
.. note::
@@ -240,7 +240,7 @@ Runtime Config Options
For example::
- -w 0002:02:00.0,npa_lock_mask=0xf
+ -i 0002:02:00.0,npa_lock_mask=0xf
.. _otx2_tmapi:
diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst
index be1c2fe1d67e..44115a666a94 100644
--- a/doc/guides/nics/sfc_efx.rst
+++ b/doc/guides/nics/sfc_efx.rst
@@ -290,7 +290,7 @@ Per-Device Parameters
~~~~~~~~~~~~~~~~~~~~~
The following per-device parameters can be passed via EAL PCI device
-whitelist option like "-w 02:00.0,arg1=value1,...".
+allowlist option like "-w 02:00.0,arg1=value1,...".
Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
boolean parameters value.
diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
index 7e44f846206c..b5a0f51988aa 100644
--- a/doc/guides/nics/tap.rst
+++ b/doc/guides/nics/tap.rst
@@ -183,15 +183,15 @@ following::
sudo ./app/app/x86_64-native-linux-gcc/app/pktgen -l 1-5 -n 4 \
--proc-type auto --log-level debug --socket-mem 512,512 --file-prefix pg \
- --vdev=net_tap0 --vdev=net_tap1 -b 05:00.0 -b 05:00.1 \
- -b 04:00.0 -b 04:00.1 -b 04:00.2 -b 04:00.3 \
- -b 81:00.0 -b 81:00.1 -b 81:00.2 -b 81:00.3 \
- -b 82:00.0 -b 83:00.0 -- -T -P -m [2:3].0 -m [4:5].1 \
+ --vdev=net_tap0 --vdev=net_tap1 -x 05:00.0 -x 05:00.1 \
+ -x 04:00.0 -x 04:00.1 -x 04:00.2 -x 04:00.3 \
+ -x 81:00.0 -x 81:00.1 -x 81:00.2 -x 81:00.3 \
+ -x 82:00.0 -x 83:00.0 -- -T -P -m [2:3].0 -m [4:5].1 \
-f themes/black-yellow.theme
.. Note:
- Change the ``-b`` options to blacklist all of your physical ports. The
+ Change the ``-x`` options to exclude all of your physical ports. The
following command line is all one line.
Also, ``-f themes/black-yellow.theme`` is optional if the default colors
diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst
index f42133e5464d..f1b27a3f269c 100644
--- a/doc/guides/nics/thunderx.rst
+++ b/doc/guides/nics/thunderx.rst
@@ -178,7 +178,7 @@ This section provides instructions to configure SR-IOV with Linux OS.
.. code-block:: console
- ./arm64-thunderx-linux-gcc/app/testpmd -l 0-3 -n 4 -w 0002:01:00.2 \
+ ./arm64-thunderx-linux-gcc/app/testpmd -l 0-3 -n 4 -i 0002:01:00.2 \
-- -i --no-flush-rx \
--port-topology=loop
@@ -398,7 +398,7 @@ This scheme is useful when application would like to insert vlan header without
Example:
.. code-block:: console
- -w 0002:01:00.2,skip_data_bytes=8
+ -i 0002:01:00.2,skip_data_bytes=8
Limitations
-----------
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index f64ae953d106..5965c15baa43 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -407,12 +407,11 @@ device having emitted a Device Removal Event. In such case, calling
callback. Care must be taken not to close the device from the interrupt handler
context. It is necessary to reschedule such closing operation.
-Blacklisting
+Blocklisting
~~~~~~~~~~~~
-The EAL PCI device blacklist functionality can be used to mark certain NIC ports as blacklisted,
-so they are ignored by the DPDK.
-The ports to be blacklisted are identified using the PCIe* description (Domain:Bus:Device.Function).
+The EAL PCI device blocklist functionality can be used to mark certain NIC ports as unavailable, so they are ignored by the DPDK.
+The ports to be blocklisted are identified using the PCIe* description (Domain:Bus:Device.Function).
Misc Functions
~~~~~~~~~~~~~~
diff --git a/doc/guides/prog_guide/multi_proc_support.rst b/doc/guides/prog_guide/multi_proc_support.rst
index a84083b96c8a..14cb6db85661 100644
--- a/doc/guides/prog_guide/multi_proc_support.rst
+++ b/doc/guides/prog_guide/multi_proc_support.rst
@@ -30,7 +30,7 @@ after a primary process has already configured the hugepage shared memory for th
Secondary processes should run alongside primary process with same DPDK version.
Secondary processes which requires access to physical devices in Primary process, must
- be passed with the same whitelist and blacklist options.
+ be passed with the same allowlist and blocklist options.
To support these two process types, and other multi-process setups described later,
two additional command-line parameters are available to the EAL:
@@ -131,7 +131,7 @@ can use).
.. note::
Independent DPDK instances running side-by-side on a single machine cannot share any network ports.
- Any network ports being used by one process should be blacklisted in every other process.
+ Any network ports being used by one process should be blocklisted in every other process.
Running Multiple Independent Groups of DPDK Applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/doc/guides/rel_notes/known_issues.rst b/doc/guides/rel_notes/known_issues.rst
index de0782136d3c..83a3b38c0ae0 100644
--- a/doc/guides/rel_notes/known_issues.rst
+++ b/doc/guides/rel_notes/known_issues.rst
@@ -523,8 +523,8 @@ Devices bound to igb_uio with VT-d enabled do not work on Linux kernel 3.15-3.17
DMAR:[fault reason 02] Present bit in context entry is clear
**Resolution/Workaround**:
- Use earlier or later kernel versions, or avoid driver binding on boot by blacklisting the driver modules.
- I.e., in the case of ``ixgbe``, we can pass the kernel command line option: ``modprobe.blacklist=ixgbe``.
+ Use earlier or later kernel versions, or avoid driver binding on boot by blocklisting the driver modules.
+ I.e., in the case of ``ixgbe``, we can pass the kernel command line option: ``modprobe.blocklist=ixgbe``.
This way we do not need to unbind the device to bind it to igb_uio.
**Affected Environment/Platform**:
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index f19b748728e4..b9509f657b30 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -261,6 +261,12 @@ API Changes
* vhost: The API of ``rte_vhost_host_notifier_ctrl`` was changed to be per
queue and not per device, a qid parameter was added to the arguments list.
+* eal: The definitions related to including and excluding devices
+  have been changed from blacklist/whitelist to include/exclude.
+  There are compatibility macros and a command line mapping to accept
+  the old values, but applications and scripts are strongly encouraged
+  to migrate to the new names.
+
ABI Changes
-----------
diff --git a/doc/guides/rel_notes/release_2_1.rst b/doc/guides/rel_notes/release_2_1.rst
index beadc51ba438..6339172c64fa 100644
--- a/doc/guides/rel_notes/release_2_1.rst
+++ b/doc/guides/rel_notes/release_2_1.rst
@@ -472,7 +472,7 @@ Resolved Issues
* **devargs: Fix crash on failure.**
- This problem occurred when passing an invalid PCI id to the blacklist API in
+ This problem occurred when passing an invalid PCI id to the blocklist API in
devargs.
diff --git a/doc/guides/sample_app_ug/bbdev_app.rst b/doc/guides/sample_app_ug/bbdev_app.rst
index 405e706a46e4..b722d0263772 100644
--- a/doc/guides/sample_app_ug/bbdev_app.rst
+++ b/doc/guides/sample_app_ug/bbdev_app.rst
@@ -79,7 +79,7 @@ This means that HW baseband device/s must be bound to a DPDK driver or
a SW baseband device/s (virtual BBdev) must be created (using --vdev).
To run the application in linux environment with the turbo_sw baseband device
-using the whitelisted port running on 1 encoding lcore and 1 decoding lcore
+using the allowlisted port running on 1 encoding lcore and 1 decoding lcore
issue the command:
.. code-block:: console
@@ -90,7 +90,7 @@ issue the command:
where, NIC0PCIADDR is the PCI address of the Rx port
This command creates one virtual bbdev devices ``baseband_turbo_sw`` where the
-device gets linked to a corresponding ethernet port as whitelisted by
+device gets linked to a corresponding ethernet port as allowlisted by
the parameter -w.
3 cores are allocated to the application, and assigned as:
@@ -111,7 +111,7 @@ Using Packet Generator with baseband device sample application
To allow the bbdev sample app to do the loopback, an influx of traffic is required.
This can be done by using DPDK Pktgen to burst traffic on two ethernet ports, and
it will print the transmitted along with the looped-back traffic on Rx ports.
-Executing the command below will generate traffic on the two whitelisted ethernet
+Executing the command below will generate traffic on the two allowlisted ethernet
ports.
.. code-block:: console
diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst
index 81c5d4360615..fac75819a762 100644
--- a/doc/guides/sample_app_ug/ipsec_secgw.rst
+++ b/doc/guides/sample_app_ug/ipsec_secgw.rst
@@ -329,15 +329,15 @@ This means that if the application is using a single core and both hardware
and software crypto devices are detected, hardware devices will be used.
A way to achieve the case where you want to force the use of virtual crypto
-devices is to whitelist the Ethernet devices needed and therefore implicitly
-blacklisting all hardware crypto devices.
+devices is to allowlist the Ethernet devices needed and therefore implicitly
+blocklisting all hardware crypto devices.
For example, something like the following command line:
.. code-block:: console
./build/ipsec-secgw -l 20,21 -n 4 --socket-mem 0,2048 \
- -w 81:00.0 -w 81:00.1 -w 81:00.2 -w 81:00.3 \
+ -i 81:00.0 -i 81:00.1 -i 81:00.2 -i 81:00.3 \
--vdev "crypto_aesni_mb" --vdev "crypto_null" \
-- \
-p 0xf -P -u 0x3 --config="(0,0,20),(1,0,20),(2,0,21),(3,0,21)" \
diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst
index 07c8d44936d6..69a29ab1314e 100644
--- a/doc/guides/sample_app_ug/l3_forward.rst
+++ b/doc/guides/sample_app_ug/l3_forward.rst
@@ -148,7 +148,7 @@ or
In this command:
-* -w option whitelist the event device supported by platform. Way to pass this device may vary based on platform.
+* -w option allowlist the event device supported by platform. Way to pass this device may vary based on platform.
* The --mode option defines PMD to be used for packet I/O.
diff --git a/doc/guides/sample_app_ug/l3_forward_access_ctrl.rst b/doc/guides/sample_app_ug/l3_forward_access_ctrl.rst
index a44fbcd52c3a..473326275e49 100644
--- a/doc/guides/sample_app_ug/l3_forward_access_ctrl.rst
+++ b/doc/guides/sample_app_ug/l3_forward_access_ctrl.rst
@@ -18,7 +18,7 @@ The application loads two types of rules at initialization:
* Route information rules, which are used for L3 forwarding
-* Access Control List (ACL) rules that blacklist (or block) packets with a specific characteristic
+* Access Control List (ACL) rules that blocklist (or block) packets with a specific characteristic
When packets are received from a port,
the application extracts the necessary information from the TCP/IP header of the received packet and
diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 0cc6f2e62e75..4cc55004cca8 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -378,7 +378,7 @@ See :doc:`Power Management<../prog_guide/power_man>` chapter in the DPDK Program
.. code-block:: console
- ./l3fwd-power -l xxx -n 4 -w 0000:xx:00.0 -w 0000:xx:00.1 -- -p 0x3 -P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1
+ ./l3fwd-power -l xxx -n 4 -i 0000:xx:00.0 -i 0000:xx:00.1 -- -p 0x3 -P --config="(0,0,xx),(1,0,xx)" --empty-poll="0,0,0" -l 14 -m 9 -h 1
Where,
diff --git a/doc/guides/sample_app_ug/vdpa.rst b/doc/guides/sample_app_ug/vdpa.rst
index d66a724827af..e388c738a1e3 100644
--- a/doc/guides/sample_app_ug/vdpa.rst
+++ b/doc/guides/sample_app_ug/vdpa.rst
@@ -52,7 +52,7 @@ Take IFCVF driver for example:
.. code-block:: console
./vdpa -c 0x2 -n 4 --socket-mem 1024,1024 \
- -w 0000:06:00.3,vdpa=1 -w 0000:06:00.4,vdpa=1 \
+ -i 0000:06:00.3,vdpa=1 -i 0000:06:00.4,vdpa=1 \
-- --interactive
.. note::
diff --git a/doc/guides/tools/cryptoperf.rst b/doc/guides/tools/cryptoperf.rst
index 28b729dbda8b..334a4f558abd 100644
--- a/doc/guides/tools/cryptoperf.rst
+++ b/doc/guides/tools/cryptoperf.rst
@@ -417,7 +417,7 @@ Call application for performance throughput test of single Aesni MB PMD
for cipher encryption aes-cbc and auth generation sha1-hmac,
one million operations, burst size 32, packet size 64::
- dpdk-test-crypto-perf -l 6-7 --vdev crypto_aesni_mb -w 0000:00:00.0 --
+ dpdk-test-crypto-perf -l 6-7 --vdev crypto_aesni_mb -i 0000:00:00.0 --
--ptest throughput --devtype crypto_aesni_mb --optype cipher-then-auth
--cipher-algo aes-cbc --cipher-op encrypt --cipher-key-sz 16 --auth-algo
sha1-hmac --auth-op generate --auth-key-sz 64 --digest-sz 12
@@ -427,7 +427,7 @@ Call application for performance latency test of two Aesni MB PMD executed
on two cores for cipher encryption aes-cbc, ten operations in silent mode::
dpdk-test-crypto-perf -l 4-7 --vdev crypto_aesni_mb1
- --vdev crypto_aesni_mb2 -w 0000:00:00.0 -- --devtype crypto_aesni_mb
+ --vdev crypto_aesni_mb2 -i 0000:00:00.0 -- --devtype crypto_aesni_mb
--cipher-algo aes-cbc --cipher-key-sz 16 --cipher-iv-sz 16
--cipher-op encrypt --optype cipher-only --silent
--ptest latency --total-ops 10
@@ -437,7 +437,7 @@ for cipher encryption aes-gcm and auth generation aes-gcm,ten operations
in silent mode, test vector provide in file "test_aes_gcm.data"
with packet verification::
- dpdk-test-crypto-perf -l 4-7 --vdev crypto_openssl -w 0000:00:00.0 --
+ dpdk-test-crypto-perf -l 4-7 --vdev crypto_openssl -i 0000:00:00.0 --
--devtype crypto_openssl --aead-algo aes-gcm --aead-key-sz 16
--aead-iv-sz 16 --aead-op encrypt --aead-aad-sz 16 --digest-sz 16
--optype aead --silent --ptest verify --total-ops 10
diff --git a/doc/guides/tools/flow-perf.rst b/doc/guides/tools/flow-perf.rst
index cdedaf9a97d4..c03681525e60 100644
--- a/doc/guides/tools/flow-perf.rst
+++ b/doc/guides/tools/flow-perf.rst
@@ -61,7 +61,7 @@ with a ``--`` separator:
.. code-block:: console
- sudo ./dpdk-test-flow_perf -n 4 -w 08:00.0 -- --ingress --ether --ipv4 --queue --flows-count=1000000
+ sudo ./dpdk-test-flow_perf -n 4 -i 08:00.0 -- --ingress --ether --ipv4 --queue --flows-count=1000000
The command line options are:
--
2.27.0
^ permalink raw reply [relevance 1%]
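The release notes hunk in the patch above mentions compatibility macros and a command-line mapping that accepts the legacy option letters while steering users toward the new ones. A minimal illustrative sketch of such a mapping follows; the function name and option letters used here mirror the patch's `-w`/`-b` to `-i`/`-x` rename but are not the actual EAL parser code.

```c
#include <assert.h>

/* Illustrative sketch of the command-line mapping described in the
 * release note: legacy -w/-b option letters are still accepted and
 * translated to the new -i/-x ones. Not the actual EAL parser. */
static char
map_legacy_opt(char opt)
{
	switch (opt) {
	case 'w': return 'i'; /* whitelist -> include */
	case 'b': return 'x'; /* blacklist -> exclude */
	default:  return opt; /* already a new-style option */
	}
}
```

A parser built this way can warn on the legacy letters while behaving identically, which is what lets existing scripts keep working during the migration window.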
* [dpdk-dev] [RFC PATCH 0/2] Enable dynamic configuration of subport bandwidth profile
@ 2020-07-15 18:27 3% Savinay Dharmappa
2020-07-16 8:14 0% ` Singh, Jasvinder
0 siblings, 1 reply; 200+ results
From: Savinay Dharmappa @ 2020-07-15 18:27 UTC (permalink / raw)
To: savinay.dharmappa, jasvinder.singh, dev
The DPDK sched library allows runtime configuration of the pipe profiles for the
pipes of a subport once the scheduler hierarchy is constructed. However, to
change the subport-level bandwidth, the existing hierarchy needs to be dismantled
and the whole process of building the hierarchy under the subport nodes needs to
be repeated, which might result in router downtime. Furthermore, due to the lack
of dynamic configuration of the subport bandwidth profile (shaper and traffic
class rates), the user application is unable to dynamically re-distribute the
excess bandwidth of one subport among other subports in the scheduler hierarchy.
Therefore, it is also not possible to adjust the subport bandwidth profile in
sync with dynamic changes in the pipe profiles of subscribers who want to
consume higher bandwidth opportunistically.
This RFC proposes dynamic configuration of the subport bandwidth profile to
overcome the runtime situation where a group of subscribers is not using its
allotted bandwidth and dynamic bandwidth re-distribution is needed, without
making any structural changes in the hierarchy.
The implementation work includes refactoring the existing data structures
defined at the port and subport level, and new APIs for adding subport-level
bandwidth profiles that can be used at runtime, which causes an API/ABI change.
Therefore, a deprecation notice will be sent out soon.
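The idea in this cover letter can be sketched as a small standalone model: bandwidth profiles live in a port-level table, and a subport can be re-pointed at a different profile at runtime without tearing down the hierarchy. All names, fields, and limits below are illustrative assumptions for this sketch, not the API the patches actually add.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of runtime subport bandwidth profiles.
 * Names and fields are illustrative, not the patch's API. */
#define MAX_PROFILES 8
#define MAX_SUBPORTS 4

struct subport_bw_profile {
	uint64_t tb_rate;      /* subport shaper rate, bytes/s */
	uint64_t tc_rate[13];  /* per-traffic-class rates */
};

struct sched_port {
	struct subport_bw_profile profiles[MAX_PROFILES];
	uint32_t n_profiles;
	uint32_t subport_profile_id[MAX_SUBPORTS];
};

/* Register a new bandwidth profile; on success returns 0 and its id. */
static int
subport_profile_add(struct sched_port *port,
		const struct subport_bw_profile *params, uint32_t *profile_id)
{
	if (port->n_profiles >= MAX_PROFILES)
		return -1;
	port->profiles[port->n_profiles] = *params;
	*profile_id = port->n_profiles++;
	return 0;
}

/* Switch a subport to another profile at runtime: no hierarchy rebuild. */
static int
subport_profile_update(struct sched_port *port, uint32_t subport_id,
		uint32_t profile_id)
{
	if (subport_id >= MAX_SUBPORTS || profile_id >= port->n_profiles)
		return -1;
	port->subport_profile_id[subport_id] = profile_id;
	return 0;
}
```

Because only an index changes, excess bandwidth can be re-distributed among subports by switching profiles while traffic keeps flowing, which is the downtime-avoidance argument made above.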
Savinay Dharmappa (2):
sched: add dynamic config of subport bandwidth profile
example/qos_sched: subport bandwidth dynamic conf
examples/qos_sched/cfg_file.c | 158 ++++++-----
examples/qos_sched/cfg_file.h | 4 +
examples/qos_sched/init.c | 24 +-
examples/qos_sched/main.h | 1 +
examples/qos_sched/profile.cfg | 3 +
lib/librte_sched/rte_sched.c | 486 ++++++++++++++++++++++++---------
lib/librte_sched/rte_sched.h | 82 +++++-
lib/librte_sched/rte_sched_version.map | 2 +
8 files changed, 544 insertions(+), 216 deletions(-)
--
2.7.4
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-15 12:29 0% ` David Marchand
2020-07-15 12:49 0% ` Aaron Conole
@ 2020-07-15 16:29 0% ` Phil Yang
2020-07-16 4:16 0% ` Phil Yang
2 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-15 16:29 UTC (permalink / raw)
To: David Marchand
Cc: Olivier Matz, dev, Stephen Hemminger, David Christensen,
Honnappa Nagarahalli, Ruifeng Wang, nd, Dodji Seketeli,
Aaron Conole, nd
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Wednesday, July 15, 2020 8:29 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: Olivier Matz <olivier.matz@6wind.com>; dev <dev@dpdk.org>; Stephen
> Hemminger <stephen@networkplumber.org>; David Christensen
> <drc@linux.vnet.ibm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; Dodji Seketeli
> <dodji@redhat.com>; Aaron Conole <aconole@redhat.com>
> Subject: Re: [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt
> operations
>
> On Thu, Jul 9, 2020 at 5:59 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> > ops which enforce unnecessary barriers on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > v4:
> > 1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
> > to avoid ABI breakage. (Olivier)
> > 2. Add notice of refcnt_atomic deprecation. (Honnappa)
>
> v4 does not pass the checks (in both my env, and Travis).
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/359393389#L2405
I had tested with test-meson-builds in my env and it didn't give any error message. The reference version is v20.05.
$ DPDK_ABI_REF_DIR=$PWD/reference DPDK_ABI_REF_VERSION=v20.05 ./devtools/test-meson-builds.sh
It seems to be a problem with my test environment.
I will fix this problem as soon as possible.
>
> It seems the robot had a hiccup as I can't see a report in the test-report ml.
>
>
> --
> David Marchand
^ permalink raw reply [relevance 0%]
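The change under discussion in this thread, replacing `rte_atomic` ops with C11-style atomic built-ins for the mbuf reference count, can be sketched as follows. This is a minimal standalone illustration using the GCC/Clang `__atomic` built-ins, not the DPDK mbuf code itself; the ordering choices (relaxed reads, acquire-release updates when the result may decide whether the object is freed) follow the rationale in the commit message, which notes that the old ops enforce unnecessary barriers on aarch64.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: a 16-bit refcnt updated with C11-style __atomic
 * built-ins instead of rte_atomic16 ops. Relaxed ordering suffices
 * for plain reads; acquire-release is used on updates whose result
 * may determine whether the object can be freed. */
static inline uint16_t
refcnt_read(const uint16_t *refcnt)
{
	return __atomic_load_n(refcnt, __ATOMIC_RELAXED);
}

static inline uint16_t
refcnt_update(uint16_t *refcnt, int16_t value)
{
	/* Unsigned wraparound makes a negative delta a decrement. */
	return __atomic_add_fetch(refcnt, (uint16_t)value, __ATOMIC_ACQ_REL);
}
```

On strongly ordered architectures this compiles to the same instructions as the old code; on aarch64 the explicit, weaker orderings avoid full barriers around every counter update.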
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-15 12:29 0% ` David Marchand
@ 2020-07-15 12:49 0% ` Aaron Conole
2020-07-15 16:29 0% ` Phil Yang
2020-07-16 4:16 0% ` Phil Yang
2 siblings, 0 replies; 200+ results
From: Aaron Conole @ 2020-07-15 12:49 UTC (permalink / raw)
To: David Marchand
Cc: Phil Yang, Olivier Matz, dev, Stephen Hemminger,
David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli
David Marchand <david.marchand@redhat.com> writes:
> On Thu, Jul 9, 2020 at 5:59 PM Phil Yang <phil.yang@arm.com> wrote:
>>
>> Use C11 atomic built-ins with explicit ordering instead of rte_atomic
>> ops which enforce unnecessary barriers on aarch64.
>>
>> Signed-off-by: Phil Yang <phil.yang@arm.com>
>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>> ---
>> v4:
>> 1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
>> to avoid ABI breakage. (Olivier)
>> 2. Add notice of refcnt_atomic deprecation. (Honnappa)
>
> v4 does not pass the checks (in both my env, and Travis).
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/359393389#L2405
>
> It seems the robot had a hiccup as I can't see a report in the test-report ml.
Hrrm... that has been happening quite a bit. I'll investigate.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] devtools: give some hints for ABI errors
2020-07-15 12:15 25% ` [dpdk-dev] [PATCH v2] " David Marchand
@ 2020-07-15 12:48 4% ` Aaron Conole
2020-07-16 7:29 4% ` David Marchand
0 siblings, 1 reply; 200+ results
From: Aaron Conole @ 2020-07-15 12:48 UTC (permalink / raw)
To: David Marchand; +Cc: dev, thomas, mdr, nhorman, dodji
David Marchand <david.marchand@redhat.com> writes:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing list must happen, but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> ---
> Changes since v1:
> - used arithmetic test,
> - updated error message for generic errors,
>
Acked-by: Aaron Conole <aconole@redhat.com>
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 15:58 4% ` [dpdk-dev] [PATCH v4 1/2] " Phil Yang
@ 2020-07-15 12:29 0% ` David Marchand
2020-07-15 12:49 0% ` Aaron Conole
` (2 more replies)
2020-07-17 4:36 4% ` [dpdk-dev] [PATCH v5 1/2] mbuf: use C11 atomic builtins " Phil Yang
1 sibling, 3 replies; 200+ results
From: David Marchand @ 2020-07-15 12:29 UTC (permalink / raw)
To: Phil Yang
Cc: Olivier Matz, dev, Stephen Hemminger, David Christensen,
Honnappa Nagarahalli, Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli, Aaron Conole
On Thu, Jul 9, 2020 at 5:59 PM Phil Yang <phil.yang@arm.com> wrote:
>
> Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> ops which enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> v4:
> 1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
> to avoid ABI breakage. (Olivier)
> 2. Add notice of refcnt_atomic deprecation. (Honnappa)
v4 does not pass the checks (in both my env, and Travis).
https://travis-ci.com/github/ovsrobot/dpdk/jobs/359393389#L2405
It seems the robot had a hiccup as I can't see a report in the test-report ml.
--
David Marchand
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v2] devtools: give some hints for ABI errors
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
` (3 preceding siblings ...)
2020-07-10 10:58 4% ` Neil Horman
@ 2020-07-15 12:15 25% ` David Marchand
2020-07-15 12:48 4% ` Aaron Conole
4 siblings, 1 reply; 200+ results
From: David Marchand @ 2020-07-15 12:15 UTC (permalink / raw)
To: dev; +Cc: thomas, mdr, nhorman, dodji, aconole
abidiff can provide some more information about the ABI difference it
detected.
In all cases, a discussion on the mailing list must happen but we can give
some hints to know if this is a problem with the script calling abidiff,
a potential ABI breakage or an unambiguous ABI breakage.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
Changes since v1:
- used arithmetic test,
- updated error message for generic errors,
---
devtools/check-abi.sh | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
index e17fedbd9f..172e934382 100755
--- a/devtools/check-abi.sh
+++ b/devtools/check-abi.sh
@@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
error=1
continue
fi
- if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
+ abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
+ abiret=$?
echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
error=1
- fi
+ echo
+ if [ $(($abiret & 3)) -ne 0 ]; then
+ echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, this could be a script or environment issue."
+ fi
+ if [ $(($abiret & 4)) -ne 0 ]; then
+ echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
+ fi
+ if [ $(($abiret & 8)) -ne 0 ]; then
+ echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
+ fi
+ echo
+ }
done
[ -z "$error" ] || [ -n "$warnonly" ]
--
2.23.0
^ permalink raw reply [relevance 25%]
* [dpdk-dev] [PATCH v4 09/11] doc: add note about blacklist/whitelist changes
@ 2020-07-14 5:39 4% ` Stephen Hemminger
0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-14 5:39 UTC (permalink / raw)
To: dev; +Cc: Stephen Hemminger, Luca Boccassi
The blacklist/whitelist changes to the API will not be a breaking
change for applications in this release, but it is worth adding a note
to encourage migration.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
---
doc/guides/rel_notes/release_20_08.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index f19b748728e4..b9509f657b30 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -261,6 +261,12 @@ API Changes
* vhost: The API of ``rte_vhost_host_notifier_ctrl`` was changed to be per
queue and not per device, a qid parameter was added to the arguments list.
+* eal: The definitions related to including and excluding devices
+ have been changed from blacklist/whitelist to include/exclude.
+ There are compatibility macros and command line mapping to accept
+ the old values but applications and scripts are strongly encouraged
+ to migrate to the new names.
+
ABI Changes
-----------
--
2.26.2
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v3 00/10] rename blacklist/whitelist to block/allow
2020-07-10 15:06 3% ` David Marchand
@ 2020-07-14 4:43 0% ` Stephen Hemminger
0 siblings, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-14 4:43 UTC (permalink / raw)
To: David Marchand; +Cc: dev, techboard, Luca Boccassi, Mcnamara, John
On Fri, 10 Jul 2020 17:06:11 +0200
David Marchand <david.marchand@redhat.com> wrote:
> On Sat, Jun 13, 2020 at 2:01 AM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > The terms blacklist and whitelist are often seen as reminders
> > of the divisions in society. Instead, use more exact terms for
> > handling of which devices are used in DPDK.
> >
> > This is a proposed change for DPDK 20.08 to replace the names
> > blacklist and whitelist in API and command lines.
> >
> > The first three patches fix some other unnecessary use of
> > blacklist/whitelist and have no user visible impact.
> >
> > The rest change the PCI blacklist to be blocklist and
> > whitelist to be allowlist.
>
> Thanks for working on this.
>
> I agree, the first patches can go in right now.
>
> But I have some concerns about the rest.
>
> New options in EAL are not consistent with "allow"/"block" list:
> + "b:" /* pci-skip-probe */
> + "w:" /* pci-only-probe */
> +#define OPT_PCI_SKIP_PROBE "pci-skip-probe"
> + OPT_PCI_SKIP_PROBE_NUM = 'b',
> +#define OPT_PCI_ONLY_PROBE "pci-only-probe"
> + OPT_PCI_ONLY_PROBE_NUM = 'w',
>
> The CI flagged the series as failing, because the unit test for EAL
> flags is unaligned:
> +#define pci_allowlist "--pci-allowlist"
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/348668299#L5657
>
>
> The ABI check complains about the enum update:
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/348668301#L2400
> Either we deal with this, or we need a libabigail exception rule.
>
>
> About deprecating existing API/EAL flags in this release, this should
> go through the standard deprecation process.
> I would go with introducing new options + full compatibility + a
> deprecation notice in the 20.08 release.
> The actual deprecation/API flagging will go in 20.11.
> Removal will come later.
>
>
The next version will use different flags, and the old flags will cause
a runtime deprecation warning.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] mempool/ring: add support for new ring sync modes
2020-07-13 15:00 3% ` Olivier Matz
@ 2020-07-13 16:29 0% ` Ananyev, Konstantin
0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2020-07-13 16:29 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev, arybchenko, jielong.zjl, Eads, Gage
> On Mon, Jul 13, 2020 at 02:46:35PM +0000, Ananyev, Konstantin wrote:
> > Hi Olivier,
> >
> > > Hi Konstantin,
> > >
> > > On Fri, Jul 10, 2020 at 03:20:12PM +0000, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > >
> > > > > Hi Olivier,
> > > > >
> > > > > > Hi Konstantin,
> > > > > >
> > > > > > On Thu, Jul 09, 2020 at 05:55:30PM +0000, Ananyev, Konstantin wrote:
> > > > > > > Hi Olivier,
> > > > > > >
> > > > > > > > Hi Konstantin,
> > > > > > > >
> > > > > > > > On Mon, Jun 29, 2020 at 05:10:24PM +0100, Konstantin Ananyev wrote:
> > > > > > > > > v2:
> > > > > > > > > - update Release Notes (as per comments)
> > > > > > > > >
> > > > > > > > > Two new sync modes were introduced into rte_ring:
> > > > > > > > > relaxed tail sync (RTS) and head/tail sync (HTS).
> > > > > > > > > This change provides user with ability to select these
> > > > > > > > > modes for ring based mempool via mempool ops API.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > > > > > > > > Acked-by: Gage Eads <gage.eads@intel.com>
> > > > > > > > > ---
> > > > > > > > > doc/guides/rel_notes/release_20_08.rst | 6 ++
> > > > > > > > > drivers/mempool/ring/rte_mempool_ring.c | 97 ++++++++++++++++++++++---
> > > > > > > > > 2 files changed, 94 insertions(+), 9 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > > index eaaf11c37..7bdcf3aac 100644
> > > > > > > > > --- a/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > > +++ b/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > > @@ -84,6 +84,12 @@ New Features
> > > > > > > > > * Dump ``rte_flow`` memory consumption.
> > > > > > > > > * Measure packet per second forwarding.
> > > > > > > > >
> > > > > > > > > +* **Added support for new sync modes into mempool ring driver.**
> > > > > > > > > +
> > > > > > > > > + Added ability to select new ring synchronisation modes:
> > > > > > > > > + ``relaxed tail sync (ring_mt_rts)`` and ``head/tail sync (ring_mt_hts)``
> > > > > > > > > + via mempool ops API.
> > > > > > > > > +
> > > > > > > > >
> > > > > > > > > Removed Items
> > > > > > > > > -------------
> > > > > > > > > diff --git a/drivers/mempool/ring/rte_mempool_ring.c b/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > > index bc123fc52..15ec7dee7 100644
> > > > > > > > > --- a/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > > +++ b/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > > @@ -25,6 +25,22 @@ common_ring_sp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > > obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static int
> > > > > > > > > +rts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > > + unsigned int n)
> > > > > > > > > +{
> > > > > > > > > + return rte_ring_mp_rts_enqueue_bulk(mp->pool_data,
> > > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +static int
> > > > > > > > > +hts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > > + unsigned int n)
> > > > > > > > > +{
> > > > > > > > > + return rte_ring_mp_hts_enqueue_bulk(mp->pool_data,
> > > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > static int
> > > > > > > > > common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
> > > > > > > > > {
> > > > > > > > > @@ -39,17 +55,30 @@ common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
> > > > > > > > > obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static int
> > > > > > > > > +rts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned int n)
> > > > > > > > > +{
> > > > > > > > > + return rte_ring_mc_rts_dequeue_bulk(mp->pool_data,
> > > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +static int
> > > > > > > > > +hts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned int n)
> > > > > > > > > +{
> > > > > > > > > + return rte_ring_mc_hts_dequeue_bulk(mp->pool_data,
> > > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > static unsigned
> > > > > > > > > common_ring_get_count(const struct rte_mempool *mp)
> > > > > > > > > {
> > > > > > > > > return rte_ring_count(mp->pool_data);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > -
> > > > > > > > > static int
> > > > > > > > > -common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > > +ring_alloc(struct rte_mempool *mp, uint32_t rg_flags)
> > > > > > > > > {
> > > > > > > > > - int rg_flags = 0, ret;
> > > > > > > > > + int ret;
> > > > > > > > > char rg_name[RTE_RING_NAMESIZE];
> > > > > > > > > struct rte_ring *r;
> > > > > > > > >
> > > > > > > > > @@ -60,12 +89,6 @@ common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > > return -rte_errno;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > - /* ring flags */
> > > > > > > > > - if (mp->flags & MEMPOOL_F_SP_PUT)
> > > > > > > > > - rg_flags |= RING_F_SP_ENQ;
> > > > > > > > > - if (mp->flags & MEMPOOL_F_SC_GET)
> > > > > > > > > - rg_flags |= RING_F_SC_DEQ;
> > > > > > > > > -
> > > > > > > > > /*
> > > > > > > > > * Allocate the ring that will be used to store objects.
> > > > > > > > > * Ring functions will return appropriate errors if we are
> > > > > > > > > @@ -82,6 +105,40 @@ common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > > return 0;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static int
> > > > > > > > > +common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > > +{
> > > > > > > > > + uint32_t rg_flags;
> > > > > > > > > +
> > > > > > > > > + rg_flags = 0;
> > > > > > > >
> > > > > > > > Maybe it could go on the same line
> > > > > > > >
> > > > > > > > > +
> > > > > > > > > + /* ring flags */
> > > > > > > >
> > > > > > > > Not sure we need to keep this comment
> > > > > > > >
> > > > > > > > > + if (mp->flags & MEMPOOL_F_SP_PUT)
> > > > > > > > > + rg_flags |= RING_F_SP_ENQ;
> > > > > > > > > + if (mp->flags & MEMPOOL_F_SC_GET)
> > > > > > > > > + rg_flags |= RING_F_SC_DEQ;
> > > > > > > > > +
> > > > > > > > > + return ring_alloc(mp, rg_flags);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > +static int
> > > > > > > > > +rts_ring_alloc(struct rte_mempool *mp)
> > > > > > > > > +{
> > > > > > > > > + if ((mp->flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) != 0)
> > > > > > > > > + return -EINVAL;
> > > > > > > >
> > > > > > > > Why do we need this? It is a problem to allow sc/sp in this mode (even
> > > > > > > > if it's not optimal)?
> > > > > > >
> > > > > > > These new sync modes (RTS, HTS) are for MT.
> > > > > > > For SP/SC - there is simply no point to use MT sync modes.
> > > > > > > I suppose there are few choices:
> > > > > > > 1. Make F_SP_PUT/F_SC_GET flags silently override expected ops behaviour
> > > > > > > and create actual ring with ST sync mode for prod/cons.
> > > > > > > 2. Report an error.
> > > > > > > 3. Silently ignore these flags.
> > > > > > >
> > > > > > > As I can see for "ring_mp_mc" ops, we are doing #1,
> > > > > > > while for "stack" we are doing #3.
> > > > > > > For RTS/HTS I chose #2, as it seems cleaner to me.
> > > > > > > Any thoughts from your side what preferable behaviour should be?
> > > > > >
> > > > > > The F_SP_PUT/F_SC_GET are only used in rte_mempool_create() to select
> > > > > > the default ops among (ring_sp_sc, ring_mp_sc, ring_sp_mc,
> > > > > > ring_mp_mc).
> > > > >
> > > > > As I understand, nothing prevents user from doing:
> > > > >
> > > > > mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> > > > > sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> > > >
> > > > Apologies, hit send accidently.
> > > > I meant user can do:
> > > >
> > > > mp = rte_mempool_create_empty(..., F_SP_PUT | F_SC_GET);
> > > > rte_mempool_set_ops_byname(mp, "ring_mp_mc", NULL);
> > > >
> > > > And in that case, he'll get an SP/SC ring underneath.
> > >
> > > It looks it's not the case. Since commit 449c49b93a6b ("mempool: support
> > > handler operations"), the flags SP_PUT/SC_GET are converted into a call
> > > to rte_mempool_set_ops_byname() in rte_mempool_create() only.
> > >
> > > In rte_mempool_create_empty(), these flags are ignored. It is expected
> > > that the user calls rte_mempool_set_ops_byname() by itself.
> >
> > As I understand the code - not exactly.
> > rte_mempool_create_empty() doesn't take any specific actions based on the 'flags' value,
> > but it does store its value inside mp->flags.
> > Later, when mempool_ops_alloc_once() is called, these flags will be used by
> > common_ring_alloc() and might override the ring behaviour selected by the ops.
> >
> > >
> > > I don't think it is a good behavior:
> > >
> > > 1/ The documentation of rte_mempool_create_empty() does not say that the
> > > flags are ignored, and a user can expect that F_SP_PUT | F_SC_GET
> > > sets the default ops like rte_mempool_create().
> > >
> > > 2/ If rte_mempool_set_ops_byname() is not called after
> > > rte_mempool_create_empty() (and it looks it happens in dpdk's code),
> > > the default ops are the ones registered at index 0. This depends on
> > > the link order.
> > >
> > > So I propose to move the following code into
> > > rte_mempool_create_empty().
> > >
> > > if ((flags & MEMPOOL_F_SP_PUT) && (flags & MEMPOOL_F_SC_GET))
> > > ret = rte_mempool_set_ops_byname(mp, "ring_sp_sc", NULL);
> > > else if (flags & MEMPOOL_F_SP_PUT)
> > > ret = rte_mempool_set_ops_byname(mp, "ring_sp_mc", NULL);
> > > else if (flags & MEMPOOL_F_SC_GET)
> > > ret = rte_mempool_set_ops_byname(mp, "ring_mp_sc", NULL);
> > > else
> > > ret = rte_mempool_set_ops_byname(mp, "ring_mp_mc", NULL);
> > >
> > > What do you think?
> >
> > I think it will be a good thing - as in that case we'll always have
> > "ring_mp_mc" selected as default one.
> > As another thought, it probably would be good to deprecate and later remove
> > MEMPOOL_F_SP_PUT and MEMPOOL_F_SC_GET completely.
> > These days a user can select this behaviour via mempool ops, and such dualism
> > just makes things more error-prone and harder to maintain.
> > Especially as we don't have a clear policy on what should be the higher priority
> > for sync mode selection: mempool ops or flags.
> >
>
> I'll tend to agree, however it would mean deprecating rte_mempool_create()
> too, because we wouldn't be able to set ops with it. Or we would have to
> add a 12th (!) argument to the function, to set the ops name.
>
> I don't like having that many arguments to this function, but it seems
> it is widely used, probably because it is just one function call (vs
> create_empty + set_ops + populate). So adding a "ops_name" argument is
> maybe the right thing to do, given we can keep abi compat.
My thought was - just keep the rte_mempool_create()
parameter list as it is, and always set ops to "ring_mp_mc" for it.
Users who'd like some other ops would be forced to use
create_empty+set_ops+populate.
That's pretty much the same as what we have right now;
the only difference will be the ring SP/SC mode.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2] mempool/ring: add support for new ring sync modes
@ 2020-07-13 15:00 3% ` Olivier Matz
2020-07-13 16:29 0% ` Ananyev, Konstantin
0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-13 15:00 UTC (permalink / raw)
To: Ananyev, Konstantin; +Cc: dev, arybchenko, jielong.zjl, Eads, Gage
On Mon, Jul 13, 2020 at 02:46:35PM +0000, Ananyev, Konstantin wrote:
> Hi Olivier,
>
> > Hi Konstantin,
> >
> > On Fri, Jul 10, 2020 at 03:20:12PM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > >
> > > > Hi Olivier,
> > > >
> > > > > Hi Konstantin,
> > > > >
> > > > > On Thu, Jul 09, 2020 at 05:55:30PM +0000, Ananyev, Konstantin wrote:
> > > > > > Hi Olivier,
> > > > > >
> > > > > > > Hi Konstantin,
> > > > > > >
> > > > > > > On Mon, Jun 29, 2020 at 05:10:24PM +0100, Konstantin Ananyev wrote:
> > > > > > > > v2:
> > > > > > > > - update Release Notes (as per comments)
> > > > > > > >
> > > > > > > > Two new sync modes were introduced into rte_ring:
> > > > > > > > relaxed tail sync (RTS) and head/tail sync (HTS).
> > > > > > > > This change provides user with ability to select these
> > > > > > > > modes for ring based mempool via mempool ops API.
> > > > > > > >
> > > > > > > > Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > > > > > > > Acked-by: Gage Eads <gage.eads@intel.com>
> > > > > > > > ---
> > > > > > > > doc/guides/rel_notes/release_20_08.rst | 6 ++
> > > > > > > > drivers/mempool/ring/rte_mempool_ring.c | 97 ++++++++++++++++++++++---
> > > > > > > > 2 files changed, 94 insertions(+), 9 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > index eaaf11c37..7bdcf3aac 100644
> > > > > > > > --- a/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > +++ b/doc/guides/rel_notes/release_20_08.rst
> > > > > > > > @@ -84,6 +84,12 @@ New Features
> > > > > > > > * Dump ``rte_flow`` memory consumption.
> > > > > > > > * Measure packet per second forwarding.
> > > > > > > >
> > > > > > > > +* **Added support for new sync modes into mempool ring driver.**
> > > > > > > > +
> > > > > > > > + Added ability to select new ring synchronisation modes:
> > > > > > > > + ``relaxed tail sync (ring_mt_rts)`` and ``head/tail sync (ring_mt_hts)``
> > > > > > > > + via mempool ops API.
> > > > > > > > +
> > > > > > > >
> > > > > > > > Removed Items
> > > > > > > > -------------
> > > > > > > > diff --git a/drivers/mempool/ring/rte_mempool_ring.c b/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > index bc123fc52..15ec7dee7 100644
> > > > > > > > --- a/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > +++ b/drivers/mempool/ring/rte_mempool_ring.c
> > > > > > > > @@ -25,6 +25,22 @@ common_ring_sp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static int
> > > > > > > > +rts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > + unsigned int n)
> > > > > > > > +{
> > > > > > > > + return rte_ring_mp_rts_enqueue_bulk(mp->pool_data,
> > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static int
> > > > > > > > +hts_ring_mp_enqueue(struct rte_mempool *mp, void * const *obj_table,
> > > > > > > > + unsigned int n)
> > > > > > > > +{
> > > > > > > > + return rte_ring_mp_hts_enqueue_bulk(mp->pool_data,
> > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > static int
> > > > > > > > common_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
> > > > > > > > {
> > > > > > > > @@ -39,17 +55,30 @@ common_ring_sc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned n)
> > > > > > > > obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static int
> > > > > > > > +rts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned int n)
> > > > > > > > +{
> > > > > > > > + return rte_ring_mc_rts_dequeue_bulk(mp->pool_data,
> > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static int
> > > > > > > > +hts_ring_mc_dequeue(struct rte_mempool *mp, void **obj_table, unsigned int n)
> > > > > > > > +{
> > > > > > > > + return rte_ring_mc_hts_dequeue_bulk(mp->pool_data,
> > > > > > > > + obj_table, n, NULL) == 0 ? -ENOBUFS : 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > static unsigned
> > > > > > > > common_ring_get_count(const struct rte_mempool *mp)
> > > > > > > > {
> > > > > > > > return rte_ring_count(mp->pool_data);
> > > > > > > > }
> > > > > > > >
> > > > > > > > -
> > > > > > > > static int
> > > > > > > > -common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > +ring_alloc(struct rte_mempool *mp, uint32_t rg_flags)
> > > > > > > > {
> > > > > > > > - int rg_flags = 0, ret;
> > > > > > > > + int ret;
> > > > > > > > char rg_name[RTE_RING_NAMESIZE];
> > > > > > > > struct rte_ring *r;
> > > > > > > >
> > > > > > > > @@ -60,12 +89,6 @@ common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > return -rte_errno;
> > > > > > > > }
> > > > > > > >
> > > > > > > > - /* ring flags */
> > > > > > > > - if (mp->flags & MEMPOOL_F_SP_PUT)
> > > > > > > > - rg_flags |= RING_F_SP_ENQ;
> > > > > > > > - if (mp->flags & MEMPOOL_F_SC_GET)
> > > > > > > > - rg_flags |= RING_F_SC_DEQ;
> > > > > > > > -
> > > > > > > > /*
> > > > > > > > * Allocate the ring that will be used to store objects.
> > > > > > > > * Ring functions will return appropriate errors if we are
> > > > > > > > @@ -82,6 +105,40 @@ common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > return 0;
> > > > > > > > }
> > > > > > > >
> > > > > > > > +static int
> > > > > > > > +common_ring_alloc(struct rte_mempool *mp)
> > > > > > > > +{
> > > > > > > > + uint32_t rg_flags;
> > > > > > > > +
> > > > > > > > + rg_flags = 0;
> > > > > > >
> > > > > > > Maybe it could go on the same line
> > > > > > >
> > > > > > > > +
> > > > > > > > + /* ring flags */
> > > > > > >
> > > > > > > Not sure we need to keep this comment
> > > > > > >
> > > > > > > > + if (mp->flags & MEMPOOL_F_SP_PUT)
> > > > > > > > + rg_flags |= RING_F_SP_ENQ;
> > > > > > > > + if (mp->flags & MEMPOOL_F_SC_GET)
> > > > > > > > + rg_flags |= RING_F_SC_DEQ;
> > > > > > > > +
> > > > > > > > + return ring_alloc(mp, rg_flags);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static int
> > > > > > > > +rts_ring_alloc(struct rte_mempool *mp)
> > > > > > > > +{
> > > > > > > > + if ((mp->flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) != 0)
> > > > > > > > + return -EINVAL;
> > > > > > >
> > > > > > > Why do we need this? It is a problem to allow sc/sp in this mode (even
> > > > > > > if it's not optimal)?
> > > > > >
> > > > > > These new sync modes (RTS, HTS) are for MT.
> > > > > > For SP/SC - there is simply no point to use MT sync modes.
> > > > > > I suppose there are few choices:
> > > > > > 1. Make F_SP_PUT/F_SC_GET flags silently override expected ops behaviour
> > > > > > and create actual ring with ST sync mode for prod/cons.
> > > > > > 2. Report an error.
> > > > > > 3. Silently ignore these flags.
> > > > > >
> > > > > > As I can see for "ring_mp_mc" ops, we are doing #1,
> > > > > > while for "stack" we are doing #3.
> > > > > > For RTS/HTS I chose #2, as it seems cleaner to me.
> > > > > > Any thoughts from your side what preferable behaviour should be?
> > > > >
> > > > > The F_SP_PUT/F_SC_GET are only used in rte_mempool_create() to select
> > > > > the default ops among (ring_sp_sc, ring_mp_sc, ring_sp_mc,
> > > > > ring_mp_mc).
> > > >
> > > > As I understand, nothing prevents user from doing:
> > > >
> > > > mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
> > > > sizeof(struct rte_pktmbuf_pool_private), socket_id, 0);
> > >
> > > Apologies, hit send accidently.
> > > I meant user can do:
> > >
> > > mp = rte_mempool_create_empty(..., F_SP_PUT | F_SC_GET);
> > > rte_mempool_set_ops_byname(mp, "ring_mp_mc", NULL);
> > >
> > > And in that case, he'll get an SP/SC ring underneath.
> >
> > It looks it's not the case. Since commit 449c49b93a6b ("mempool: support
> > handler operations"), the flags SP_PUT/SC_GET are converted into a call
> > to rte_mempool_set_ops_byname() in rte_mempool_create() only.
> >
> > In rte_mempool_create_empty(), these flags are ignored. It is expected
> > that the user calls rte_mempool_set_ops_byname() by itself.
>
> As I understand the code - not exactly.
> rte_mempool_create_empty() doesn't take any specific actions based on the 'flags' value,
> but it does store its value inside mp->flags.
> Later, when mempool_ops_alloc_once() is called, these flags will be used by
> common_ring_alloc() and might override the ring behaviour selected by the ops.
>
> >
> > I don't think it is a good behavior:
> >
> > 1/ The documentation of rte_mempool_create_empty() does not say that the
> > flags are ignored, and a user can expect that F_SP_PUT | F_SC_GET
> > sets the default ops like rte_mempool_create().
> >
> > 2/ If rte_mempool_set_ops_byname() is not called after
> > rte_mempool_create_empty() (and it looks it happens in dpdk's code),
> > the default ops are the ones registered at index 0. This depends on
> > the link order.
> >
> > So I propose to move the following code into
> > rte_mempool_create_empty().
> >
> > if ((flags & MEMPOOL_F_SP_PUT) && (flags & MEMPOOL_F_SC_GET))
> > ret = rte_mempool_set_ops_byname(mp, "ring_sp_sc", NULL);
> > else if (flags & MEMPOOL_F_SP_PUT)
> > ret = rte_mempool_set_ops_byname(mp, "ring_sp_mc", NULL);
> > else if (flags & MEMPOOL_F_SC_GET)
> > ret = rte_mempool_set_ops_byname(mp, "ring_mp_sc", NULL);
> > else
> > ret = rte_mempool_set_ops_byname(mp, "ring_mp_mc", NULL);
> >
> > What do you think?
>
> I think it will be a good thing - as in that case we'll always have
> "ring_mp_mc" selected as default one.
> As another thought, it probably would be good to deprecate and later remove
> MEMPOOL_F_SP_PUT and MEMPOOL_F_SC_GET completely.
> These days a user can select this behaviour via mempool ops, and such dualism
> just makes things more error-prone and harder to maintain.
> Especially as we don't have a clear policy on what should be the higher priority
> for sync mode selection: mempool ops or flags.
>
I'll tend to agree, however it would mean deprecating rte_mempool_create()
too, because we wouldn't be able to set ops with it. Or we would have to
add a 12th (!) argument to the function, to set the ops name.
I don't like having that many arguments to this function, but it seems
it is widely used, probably because it is just one function call (vs
create_empty + set_ops + populate). So adding a "ops_name" argument is
maybe the right thing to do, given we can keep abi compat.
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
2020-07-13 12:31 5% ` [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs Bruce Richardson
@ 2020-07-13 12:48 5% ` Hemant Agrawal
2020-07-20 11:35 0% ` Ananyev, Konstantin
2020-07-23 1:55 5% ` Xu, Rosen
2 siblings, 0 replies; 200+ results
From: Hemant Agrawal @ 2020-07-13 12:48 UTC (permalink / raw)
To: Bruce Richardson, dev; +Cc: Nipun Gupta
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
-----Original Message-----
From: Bruce Richardson <bruce.richardson@intel.com>
Sent: Monday, July 13, 2020 6:01 PM
To: dev@dpdk.org
Cc: Bruce Richardson <bruce.richardson@intel.com>; Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal <hemant.agrawal@nxp.com>
Subject: [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
Importance: High
Add to the documentation for 20.08 a notice about the changes of rawdev APIs proposed by patchset [1].
[1] http://inbox.dpdk.org/dev/20200709152047.167730-1-bruce.richardson@intel.com/
Cc: Nipun Gupta <nipun.gupta@nxp.com>
Cc: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ead7cbe43..21b00103e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -117,6 +117,13 @@ Deprecation Notices
break the ABI checks, that is why change is planned for 20.11.
The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
+* rawdev: The rawdev APIs which take a device-specific structure as
+ parameter directly, or indirectly via a "private" pointer inside another
+ structure, will be modified to take an additional parameter of the
+ structure size. The affected APIs will include ``rte_rawdev_info_get``,
+ ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
+ ``rte_rawdev_queue_setup``.
+
* traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
ABI stable in the v19.11 release. The TM maintainer and other contributors have
agreed to keep the TM APIs as experimental in expectation of additional spec
--
2.25.1
^ permalink raw reply [relevance 5%]
* [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs
@ 2020-07-13 12:31 5% ` Bruce Richardson
2020-07-13 12:48 5% ` Hemant Agrawal
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: Bruce Richardson @ 2020-07-13 12:31 UTC (permalink / raw)
To: dev; +Cc: Bruce Richardson, Nipun Gupta, Hemant Agrawal
Add to the documentation for 20.08 a notice about the changes of rawdev
APIs proposed by patchset [1].
[1] http://inbox.dpdk.org/dev/20200709152047.167730-1-bruce.richardson@intel.com/
Cc: Nipun Gupta <nipun.gupta@nxp.com>
Cc: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
doc/guides/rel_notes/deprecation.rst | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index ead7cbe43..21b00103e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -117,6 +117,13 @@ Deprecation Notices
break the ABI checks, that is why change is planned for 20.11.
The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
+* rawdev: The rawdev APIs which take a device-specific structure as
+ parameter directly, or indirectly via a "private" pointer inside another
+ structure, will be modified to take an additional parameter of the
+ structure size. The affected APIs will include ``rte_rawdev_info_get``,
+ ``rte_rawdev_configure``, ``rte_rawdev_queue_conf_get`` and
+ ``rte_rawdev_queue_setup``.
+
* traffic manager: All traffic manager API's in ``rte_tm.h`` were mistakenly made
ABI stable in the v19.11 release. The TM maintainer and other contributors have
agreed to keep the TM APIs as experimental in expectation of additional spec
--
2.25.1
^ permalink raw reply [relevance 5%]
* [dpdk-dev] The mbuf API needs some cleaning up
@ 2020-07-13 9:57 3% Morten Brørup
2020-07-31 15:24 0% ` Olivier Matz
0 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2020-07-13 9:57 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev
The MBUF library exposes some macros and constants without the RTE_ prefix. I propose cleaning up these, so better names get into the coming LTS release.
The worst is:
#define MBUF_INVALID_PORT UINT16_MAX
I say it's the worst because when we were looking for the official "invalid" port value for our application, we didn't find this one. (Probably because its documentation is wrong.)
MBUF_INVALID_PORT is defined in rte_mbuf_core.h without any description, and in rte_mbuf.h, where it is injected between the rte_pktmbuf_reset() function and its description, so the API documentation shows the function's description for the constant, and no description for the function.
I propose keeping it at a sensible location in rte_mbuf_core.h only, adding a description, and renaming it to:
#define RTE_PORT_INVALID UINT16_MAX
For backwards compatibility, we could add:
/* this old name is deprecated */
#define MBUF_INVALID_PORT RTE_PORT_INVALID
I also wonder why there are no compiler warnings about the double definition?
There are also the data buffer location constants:
#define EXT_ATTACHED_MBUF (1ULL << 61)
and
#define IND_ATTACHED_MBUF (1ULL << 62)
There are already macros (with good names) for reading these, so simply adding the RTE_ prefix to these two constants suffices.
And all the packet offload flags, such as:
#define PKT_RX_VLAN (1ULL << 0)
They are supposed to be used by applications, so I guess we should keep them unchanged for ABI stability reasons.
And the local macro:
#define MBUF_RAW_ALLOC_CHECK(m) do { \
This might as well be an internal inline function:
/* internal */
static inline void
__rte_mbuf_raw_alloc_check(const struct rte_mbuf *m)
Or we could keep it a macro and move it next to
__rte_mbuf_sanity_check(), keeping it clear that it is only relevant when RTE_LIBRTE_MBUF_DEBUG is set. But rename it to lower case, similar to the __rte_mbuf_sanity_check() macro.
Med venlig hilsen / kind regards
- Morten Brørup
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
2020-07-12 14:28 0% ` Bing Zhao
@ 2020-07-12 14:43 0% ` Olivier Matz
0 siblings, 0 replies; 200+ results
From: Olivier Matz @ 2020-07-12 14:43 UTC (permalink / raw)
To: Bing Zhao
Cc: Ori Kam, john.mcnamara, marko.kovacevic, Thomas Monjalon,
ferruh.yigit, arybchenko, akhil.goyal, dev, wenzhuo.lu,
beilei.xing, bernard.iremonger
On Sun, Jul 12, 2020 at 02:28:03PM +0000, Bing Zhao wrote:
> Hi Olivier,
> Thanks
>
> BR. Bing
>
> > -----Original Message-----
> > From: Olivier Matz <olivier.matz@6wind.com>
> > Sent: Sunday, July 12, 2020 9:18 PM
> > To: Bing Zhao <bingz@mellanox.com>
> > Cc: Ori Kam <orika@mellanox.com>; john.mcnamara@intel.com;
> > marko.kovacevic@intel.com; Thomas Monjalon
> > <thomas@monjalon.net>; ferruh.yigit@intel.com;
> > arybchenko@solarflare.com; akhil.goyal@nxp.com; dev@dpdk.org;
> > wenzhuo.lu@intel.com; beilei.xing@intel.com;
> > bernard.iremonger@intel.com
> > Subject: Re: [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
> >
> > Hi Bing,
> >
> > On Sat, Jul 11, 2020 at 04:25:49AM +0000, Bing Zhao wrote:
> > > Hi Olivier,
> > > Many thanks for your comments.
> >
> > [...]
> >
> > > > > +/**
> > > > > + * eCPRI Common Header
> > > > > + */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_common_hdr {
> > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > + uint32_t size:16; /**< Payload Size */
> > > > > + uint32_t type:8; /**< Message Type */
> > > > > + uint32_t c:1; /**< Concatenation Indicator
> > > > */
> > > > > + uint32_t res:3; /**< Reserved */
> > > > > + uint32_t revision:4; /**< Protocol Revision */
> > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > + uint32_t revision:4; /**< Protocol Revision */
> > > > > + uint32_t res:3; /**< Reserved */
> > > > > + uint32_t c:1; /**< Concatenation Indicator
> > > > */
> > > > > + uint32_t type:8; /**< Message Type */
> > > > > + uint32_t size:16; /**< Payload Size */
> > > > > +#endif
> > > > > +} __rte_packed;
> > > >
> > > > Does it really need to be packed? Why next types do not need it?
> > > > It looks only those which have bitfields are.
> > > >
> > >
> > > Nice catch, thanks. For the common header, there is no need to use
> > the
> > > packed attribute, because it is a u32 and the bitfields will be
> > > aligned.
> > > I checked all the definitions again. Only " Type #4: Remote Memory
> > Access"
> > > needs to use the packed attribute.
> > > For other sub-types, "sub-header" part of the message payload will
> > get
> > > aligned by nature. For example, u16 after u16, u8 after u16, these
> > > should be OK.
> > > But in type #4, the address is 48bits wide, with 16bits MSB and 32bits
> > > LSB (no detailed description in the specification, correct me if
> > > anything wrong.) Usually, the 48bits address will be divided as this
> > > in a system. And there is no 48-bits type at all. So we need to define
> > two parts for it: 32b LSB follows 16b MSB.
> > > u32 after u16 should be with packed attribute. Thanks
> >
> > What about using a bitfield into a uint64_t ? I mean:
> >
> > struct rte_ecpri_msg_rm_access {
> > if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > ...
> > uint64_t length:16; /**< number of bytes
> > */
> > uint64_t addr:48; /**< address */
> > #else
> > ...
> > uint64_t addr:48; /**< address */
> > uint64_t length:16; /**< number of bytes
> > */
> > #endif
> > };
> >
>
> Thanks for your suggestion.
> https://stackoverflow.com/questions/10906238/warning-when-using-bitfield-with-unsigned-char
> AFAIK (from this page), bitfields only support bool and int. uint64_t is some type of "long"
> and most of the compilers should support it. But I am not sure if it is a standard implementation.
The uint8_t[6], as in your v6, is a good idea.
> > >
> > > >
> > > > I wonder if the 'dw0' could be in this definition instead of in
> > > > struct rte_ecpri_msg_hdr?
> > > >
> > > > Something like this:
> > > >
> > > > struct rte_ecpri_common_hdr {
> > > > union {
> > > > uint32_t u32;
> > > > struct {
> > > > ...
> > > > };
> > > > };
> > > > };
> > > >
> > > > I see 2 advantages:
> > > >
> > > > - someone that only uses the common_hdr struct can use the .u32
> > > > in its software
> > > > - when using it in messages, it looks clearer to me:
> > > > msg.common_hdr.u32 = value;
> > > > instead of:
> > > > msg.dw0 = value;
> > > >
> > > > What do you think?
> > >
> > > Thanks for the suggestion, this is much better, I will change it.
> > > Indeed, in my original version, no DW(u32) is defined for the header.
> > > After that, I noticed that it would not be easy for the static casting
> > > to a u32 from bitfield(the compiler will complain), and it would not
> > > be clear to swap the endian if the user wants to use this header. I
> > > added this DW(u32) to simplify the usage of this header. But yes, if I
> > > do not add it here, it would be not easy or clear for users who just
> > use this header structure.
> > > I will change it. Is it OK if I use the name "dw0"?
> >
> > In my opinion, u32 is more usual than dw0.
> >
>
> I sent patch set v6 with this change, thanks.
>
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #0: IQ Data */ struct
> > > > > +rte_ecpri_msg_iq_data {
> > > > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #1: Bit Sequence */ struct
> > > > > +rte_ecpri_msg_bit_seq {
> > > > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #2: Real-Time Control Data */
> > > > struct
> > > > > +rte_ecpri_msg_rtc_ctrl {
> > > > > + rte_be16_t rtc_id; /**< Real-Time Control Data ID
> > > > */
> > > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #3: Generic Data Transfer */
> > > > struct
> > > > > +rte_ecpri_msg_gen_data {
> > > > > + rte_be32_t pc_id; /**< Physical channel ID */
> > > > > + rte_be32_t seq_id; /**< Sequence ID */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #4: Remote Memory Access
> > */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_msg_rm_access {
> > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > + uint32_t ele_id:16; /**< Element ID */
> > > > > + uint32_t rr:4; /**< Req/Resp */
> > > > > + uint32_t rw:4; /**< Read/Write */
> > > > > + uint32_t rma_id:8; /**< Remote Memory Access
> > > > ID */
> > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > + uint32_t rma_id:8; /**< Remote Memory Access
> > > > ID */
> > > > > + uint32_t rw:4; /**< Read/Write */
> > > > > + uint32_t rr:4; /**< Req/Resp */
> > > > > + uint32_t ele_id:16; /**< Element ID */
> > > > > +#endif
> > > > > + rte_be16_t addr_m; /**< 48-bits address (16 MSB)
> > > > */
> > > > > + rte_be32_t addr_l; /**< 48-bits address (32 LSB) */
> > > > > + rte_be16_t length; /**< number of bytes */
> > > > > +} __rte_packed;
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #5: One-Way Delay
> > Measurement
> > > > */
> > > > > +struct rte_ecpri_msg_delay_measure {
> > > > > + uint8_t msr_id; /**< Measurement ID */
> > > > > + uint8_t act_type; /**< Action Type */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #6: Remote Reset */ struct
> > > > > +rte_ecpri_msg_remote_reset {
> > > > > + rte_be16_t rst_id; /**< Reset ID */
> > > > > + uint8_t rst_op; /**< Reset Code Op */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header of Type #7: Event Indication */ struct
> > > > > +rte_ecpri_msg_event_ind {
> > > > > + uint8_t evt_id; /**< Event ID */
> > > > > + uint8_t evt_type; /**< Event Type */
> > > > > + uint8_t seq; /**< Sequence Number */
> > > > > + uint8_t number; /**< Number of
> > > > Faults/Notif */
> > > > > +};
> > > > > +
> > > > > +/**
> > > > > + * eCPRI Message Header Format: Common Header + Message
> > > > Types */
> > > > > +RTE_STD_C11
> > > > > +struct rte_ecpri_msg_hdr {
> > > > > + union {
> > > > > + struct rte_ecpri_common_hdr common;
> > > > > + uint32_t dw0;
> > > > > + };
> > > > > + union {
> > > > > + struct rte_ecpri_msg_iq_data type0;
> > > > > + struct rte_ecpri_msg_bit_seq type1;
> > > > > + struct rte_ecpri_msg_rtc_ctrl type2;
> > > > > + struct rte_ecpri_msg_bit_seq type3;
> > > > > + struct rte_ecpri_msg_rm_access type4;
> > > > > + struct rte_ecpri_msg_delay_measure type5;
> > > > > + struct rte_ecpri_msg_remote_reset type6;
> > > > > + struct rte_ecpri_msg_event_ind type7;
> > > > > + uint32_t dummy[3];
> > > > > + };
> > > > > +};
> > > >
> > > > What is the point in having this struct?
> > > >
> > > > From a software point of view, I think it is a bit risky, because
> > > > its size is the size of the largest message. This is probably what
> > > > you want in your case, but when a software will rx or tx such
> > > > packet, I think they shouldn't use this one. My understanding is
> > > > that you only need this structure for the mask in rte_flow.
> > > >
> > > > Also, I'm not sure to understand the purpose of dummy[3], even
> > after
> > > > reading your answer to Akhil's question.
> > > >
> > >
> > > Basically YES and no. To my understanding, the eCPRI message
> > format is
> > > something like the ICMP packet format. The message (packet) itself
> > > will be parsed into different formats based on the type of the
> > common
> > > header. In the message payload part, there is no distinct definition
> > > of the "sub-header". We can divide them into the sub-header and
> > data parts based on the specification.
> > > E.g. physical channel ID / real-time control ID / Event ID + type are
> > > the parts that datapath forwarding will only care about. The
> > following
> > > timestamp or user data parts are the parts that the higher layer in
> > the application will use.
> > > 1. If an application wants to create some offload flow, or even
> > handle
> > > it in the SW, the common header + first several bytes in the payload
> > > should be enough. BUT YES, it is not good or safe to use it in the
> > higher layer of the application.
> > > 2. A higher layer of the application should have its own definition of
> > > the whole payload of a specific sub-type, including the parsing of the
> > user data part after the "sub-header".
> > > It is better for them just skip the first 4 bytes of the eCPRI message or
> > a known offset.
> > > We do not need to cover the upper layers.
> >
> > Let me explain what is my vision of how an application would use the
> > structures (these are completly dummy examples, as I don't know
> > ecpri protocol at all).
> >
> > Rx:
> >
> > int ecpri_input(struct rte_mbuf *m)
> > {
> > struct rte_ecpri_common_hdr hdr_copy, *hdr;
> > struct rte_ecpri_msg_event_ind event_copy, *event;
> > struct app_specific app_copy, *app;
> >
> > hdr = rte_pktmbuf_read(m, 0, sizeof(*hdr),
> > &hdr_copy);
> > if (unlikely(hdr == NULL))
> > return -1;
> > switch (hdr->type) {
> > ...
> > case RTE_ECPRI_EVT_IND_NTFY_IND:
> > event = rte_pktmbuf_read(m, sizeof(*hdr),
> > sizeof(*event),
> > &event_copy);
> > if (unlikely(event == NULL))
> > return -1;
> > ...
> > app = rte_pktmbuf_read(m, sizeof(*hdr) + sizeof(*event),
> > sizeof(*app),
> > &app_copy);
> > ...
> >
> > Tx:
> >
> > int ecpri_output(void)
> > {
> > struct rte_ecpri_common_hdr *hdr;
> > struct rte_ecpri_msg_event_ind *event;
> > struct app_specific *app;
> >
> > m = rte_pktmbuf_alloc(mp);
> > if (unlikely(m == NULL))
> > return -1;
> >
> > app = rte_pktmbuf_append(m, sizeof(*app));
> > if (app == NULL)
> > ...
> > app->... = ...;
> > ...
> > event = rte_pktmbuf_prepend(m, sizeof(*event));
> > if (event == NULL)
> > ...
> > event->... = ...;
> > ...
> > hdr = rte_pktmbuf_prepend(m, sizeof(*hdr));
> > if (hdr == NULL)
> > ...
> > hdr->... = ...;
> >
> > return packet_send(m);
> > }
> >
> > In these 2 examples, we never need the unioned structure (struct
> > rte_ecpri_msg_hdr).
> >
> > Using it does not look possible to me, because its size is corresponds to
> > the largest message, not to the one we are parsing/building.
> >
>
> Yes, in the cases, we do not need the unioned structures at all.
> Since the common header is always 4 bytes, an application could use the
> sub-type structures starting at an offset of 4 into the eCPRI layer, as in your example.
> This is in the datapath. My original purpose is for some "control path", typically
> the flow (not sure if any other new lib implementation) API, then the union
> could be used there w/o treating the common header and message payload
> header in a separate way and then combine them together. In this case, only
> the first several bytes will be accessed and checked, there will be no change
> of message itself, and then just passing it to datapath for further handling as
> in your example.
>
> > > I think some comments could be added here, is it OK?
> > > 3. Regarding this structure, I add it because I do not want to
> > > introduce a lot of new items in the rte_flow: new items with
> > > structures, new enum types. I prefer one single structure will cover
> > most of the cases (subtypes). What do you think?
> > > 4. About the *dummy* u32, I calculated all the "subheaders" and
> > choose
> > > the maximal value of the length. Two purposes (same as the u32 in
> > the common header):
> > > a. easy to swap the endianness, but not quite useful. Because some
> > parts are u16 and u8,
> > > and should not be swapped in a u32. (some physical channel ID
> > and address LSB have 32bits width)
> > > But if some HW parsed the header u32 by u32, then it would be
> > helpful.
> > > b. easy for checking in flow API, if the user wants to insert a flow.
> > Some checking should
> > > be done to confirm if it is wildcard flow (all eCPRI messages or
> > eCPRI message in some specific type),
> > > or some precise flow (to match the pc id or rtc id, for example).
> > With these fields, 3 DW
> > > of the masks only need to be check before continuing. Or else, the
> > code needs to check the type
> > > and a lot of switch-case conditions and go through all different
> > members of each header.
> >
> > Thanks for the clarification.
> >
> > I'll tend to say that if the rte_ecpri_msg_hdr structure is only useful for
> > rte_flow, it should be defined inside rte_flow.
> >
>
> Right now, yes it will be used in rte_flow. But I checked the current implementations
> of each item, almost all the headers are defined in their own protocol files. So in v6,
> I change the name of it, in order not to confuse the users of this API, would it be OK?
> Thanks
OK
>
> > However, I have some fears about the dummy[3]. You said it could be
> > enlarged later: I think it is dangerous to change the size of a structure
> > that may be used to parse data (and this would be an ABI change).
> > Also, it seems dummy[3] cannot be used easily to swap endianness, so
> > is it really useful?
> >
>
> It might be enlarged, but not for now, until a new revision of this specification. For
> all the subtypes it has right now, the specification will keep them the same as today.
> Only new types will be added then. So after several years, we may consider changing it
> in the LTS. Is it OK?
OK, I think in this case it may even be another structure
> In most cases, the endianness swap could be easy, we will swap it in each DW / u32. To me
> the exception is that some field crosses the u32 boundary, like the address in type#4, we may
> treat it separately. And the most useful case is for the mask checking, it could simplify the
> checking to at most 3 (u32==0?) without going through each member of different types.
>
> And v6 already sent, I changed the code based on your suggestion. Would you please
> help to give some comments also?
>
> Appreciate your help and suggestion.
>
> >
> > Thanks,
> > Olivier
> >
> >
> > > > > +
> > > > > +#ifdef __cplusplus
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > > +#endif /* _RTE_ECPRI_H_ */
> > > > > diff --git a/lib/librte_net/rte_ether.h
> > > > > b/lib/librte_net/rte_ether.h index 0ae4e75..184a3f9 100644
> > > > > --- a/lib/librte_net/rte_ether.h
> > > > > +++ b/lib/librte_net/rte_ether.h
> > > > > @@ -304,6 +304,7 @@ struct rte_vlan_hdr { #define
> > > > RTE_ETHER_TYPE_LLDP
> > > > > 0x88CC /**< LLDP Protocol. */ #define RTE_ETHER_TYPE_MPLS
> > > > 0x8847 /**<
> > > > > MPLS ethertype. */ #define RTE_ETHER_TYPE_MPLSM 0x8848
> > /**<
> > > > MPLS
> > > > > multicast ethertype. */
> > > > > +#define RTE_ETHER_TYPE_ECPRI 0xAEFE /**< eCPRI ethertype
> > (.1Q
> > > > > +supported). */
> > > > >
> > > > > /**
> > > > > * Extract VLAN tag information into mbuf
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > >
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
2020-07-12 13:17 3% ` Olivier Matz
@ 2020-07-12 14:28 0% ` Bing Zhao
2020-07-12 14:43 0% ` Olivier Matz
0 siblings, 1 reply; 200+ results
From: Bing Zhao @ 2020-07-12 14:28 UTC (permalink / raw)
To: Olivier Matz
Cc: Ori Kam, john.mcnamara, marko.kovacevic, Thomas Monjalon,
ferruh.yigit, arybchenko, akhil.goyal, dev, wenzhuo.lu,
beilei.xing, bernard.iremonger
Hi Olivier,
Thanks
BR. Bing
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Sunday, July 12, 2020 9:18 PM
> To: Bing Zhao <bingz@mellanox.com>
> Cc: Ori Kam <orika@mellanox.com>; john.mcnamara@intel.com;
> marko.kovacevic@intel.com; Thomas Monjalon
> <thomas@monjalon.net>; ferruh.yigit@intel.com;
> arybchenko@solarflare.com; akhil.goyal@nxp.com; dev@dpdk.org;
> wenzhuo.lu@intel.com; beilei.xing@intel.com;
> bernard.iremonger@intel.com
> Subject: Re: [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
>
> Hi Bing,
>
> On Sat, Jul 11, 2020 at 04:25:49AM +0000, Bing Zhao wrote:
> > Hi Olivier,
> > Many thanks for your comments.
>
> [...]
>
> > > > +/**
> > > > + * eCPRI Common Header
> > > > + */
> > > > +RTE_STD_C11
> > > > +struct rte_ecpri_common_hdr {
> > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > + uint32_t size:16; /**< Payload Size */
> > > > + uint32_t type:8; /**< Message Type */
> > > > + uint32_t c:1; /**< Concatenation Indicator
> > > */
> > > > + uint32_t res:3; /**< Reserved */
> > > > + uint32_t revision:4; /**< Protocol Revision */
> > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > + uint32_t revision:4; /**< Protocol Revision */
> > > > + uint32_t res:3; /**< Reserved */
> > > > + uint32_t c:1; /**< Concatenation Indicator
> > > */
> > > > + uint32_t type:8; /**< Message Type */
> > > > + uint32_t size:16; /**< Payload Size */
> > > > +#endif
> > > > +} __rte_packed;
> > >
> > > Does it really need to be packed? Why next types do not need it?
> > > It looks only those which have bitfields are.
> > >
> >
> > Nice catch, thanks. For the common header, there is no need to use
> the
> > packed attribute, because it is a u32 and the bitfields will be
> > aligned.
> > I checked all the definitions again. Only " Type #4: Remote Memory
> Access"
> > needs to use the packed attribute.
> > For other sub-types, "sub-header" part of the message payload will
> get
> > aligned by nature. For example, u16 after u16, u8 after u16, these
> > should be OK.
> > But in type #4, the address is 48bits wide, with 16bits MSB and 32bits
> > LSB (no detailed description in the specification, correct me if
> > anything wrong.) Usually, the 48bits address will be divided as this
> > in a system. And there is no 48-bits type at all. So we need to define
> two parts for it: 32b LSB follows 16b MSB.
> > u32 after u16 should be with packed attribute. Thanks
>
> What about using a bitfield into a uint64_t ? I mean:
>
> struct rte_ecpri_msg_rm_access {
> if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> ...
> uint64_t length:16; /**< number of bytes
> */
> uint64_t addr:48; /**< address */
> #else
> ...
> uint64_t addr:48; /**< address */
> uint64_t length:16; /**< number of bytes
> */
> #endif
> };
>
Thanks for your suggestion.
https://stackoverflow.com/questions/10906238/warning-when-using-bitfield-with-unsigned-char
AFAIK (from this page), bitfields only support bool and int. uint64_t is some type of "long"
and most of the compilers should support it. But I am not sure if it is a standard implementation.
>
> >
> > >
> > > I wonder if the 'dw0' could be in this definition instead of in
> > > struct rte_ecpri_msg_hdr?
> > >
> > > Something like this:
> > >
> > > struct rte_ecpri_common_hdr {
> > > union {
> > > uint32_t u32;
> > > struct {
> > > ...
> > > };
> > > };
> > > };
> > >
> > > I see 2 advantages:
> > >
> > > - someone that only uses the common_hdr struct can use the .u32
> > > in its software
> > > - when using it in messages, it looks clearer to me:
> > > msg.common_hdr.u32 = value;
> > > instead of:
> > > msg.dw0 = value;
> > >
> > > What do you think?
> >
> > Thanks for the suggestion, this is much better, I will change it.
> > Indeed, in my original version, no DW(u32) is defined for the header.
> > After that, I noticed that it would not be easy for the static casting
> > to a u32 from bitfield(the compiler will complain), and it would not
> > be clear to swap the endian if the user wants to use this header. I
> > added this DW(u32) to simplify the usage of this header. But yes, if I
> > do not add it here, it would be not easy or clear for users who just
> use this header structure.
> > I will change it. Is it OK if I use the name "dw0"?
>
> In my opinion, u32 is more usual than dw0.
>
I sent patch set v6 with this change, thanks.
> >
> > >
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #0: IQ Data */ struct
> > > > +rte_ecpri_msg_iq_data {
> > > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #1: Bit Sequence */ struct
> > > > +rte_ecpri_msg_bit_seq {
> > > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #2: Real-Time Control Data */
> > > struct
> > > > +rte_ecpri_msg_rtc_ctrl {
> > > > + rte_be16_t rtc_id; /**< Real-Time Control Data ID
> > > */
> > > > + rte_be16_t seq_id; /**< Sequence ID */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #3: Generic Data Transfer */
> > > struct
> > > > +rte_ecpri_msg_gen_data {
> > > > + rte_be32_t pc_id; /**< Physical channel ID */
> > > > + rte_be32_t seq_id; /**< Sequence ID */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #4: Remote Memory Access
> */
> > > > +RTE_STD_C11
> > > > +struct rte_ecpri_msg_rm_access {
> > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > + uint32_t ele_id:16; /**< Element ID */
> > > > + uint32_t rr:4; /**< Req/Resp */
> > > > + uint32_t rw:4; /**< Read/Write */
> > > > + uint32_t rma_id:8; /**< Remote Memory Access
> > > ID */
> > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > + uint32_t rma_id:8; /**< Remote Memory Access
> > > ID */
> > > > + uint32_t rw:4; /**< Read/Write */
> > > > + uint32_t rr:4; /**< Req/Resp */
> > > > + uint32_t ele_id:16; /**< Element ID */
> > > > +#endif
> > > > + rte_be16_t addr_m; /**< 48-bits address (16 MSB)
> > > */
> > > > + rte_be32_t addr_l; /**< 48-bits address (32 LSB) */
> > > > + rte_be16_t length; /**< number of bytes */
> > > > +} __rte_packed;
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #5: One-Way Delay
> Measurement
> > > */
> > > > +struct rte_ecpri_msg_delay_measure {
> > > > + uint8_t msr_id; /**< Measurement ID */
> > > > + uint8_t act_type; /**< Action Type */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #6: Remote Reset */ struct
> > > > +rte_ecpri_msg_remote_reset {
> > > > + rte_be16_t rst_id; /**< Reset ID */
> > > > + uint8_t rst_op; /**< Reset Code Op */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header of Type #7: Event Indication */ struct
> > > > +rte_ecpri_msg_event_ind {
> > > > + uint8_t evt_id; /**< Event ID */
> > > > + uint8_t evt_type; /**< Event Type */
> > > > + uint8_t seq; /**< Sequence Number */
> > > > + uint8_t number; /**< Number of
> > > Faults/Notif */
> > > > +};
> > > > +
> > > > +/**
> > > > + * eCPRI Message Header Format: Common Header + Message
> > > Types */
> > > > +RTE_STD_C11
> > > > +struct rte_ecpri_msg_hdr {
> > > > + union {
> > > > + struct rte_ecpri_common_hdr common;
> > > > + uint32_t dw0;
> > > > + };
> > > > + union {
> > > > + struct rte_ecpri_msg_iq_data type0;
> > > > + struct rte_ecpri_msg_bit_seq type1;
> > > > + struct rte_ecpri_msg_rtc_ctrl type2;
> > > > + struct rte_ecpri_msg_bit_seq type3;
> > > > + struct rte_ecpri_msg_rm_access type4;
> > > > + struct rte_ecpri_msg_delay_measure type5;
> > > > + struct rte_ecpri_msg_remote_reset type6;
> > > > + struct rte_ecpri_msg_event_ind type7;
> > > > + uint32_t dummy[3];
> > > > + };
> > > > +};
> > >
> > > What is the point in having this struct?
> > >
> > > From a software point of view, I think it is a bit risky, because
> > > its size is the size of the largest message. This is probably what
> > > you want in your case, but when a software will rx or tx such
> > > packet, I think they shouldn't use this one. My understanding is
> > > that you only need this structure for the mask in rte_flow.
> > >
> > > Also, I'm not sure to understand the purpose of dummy[3], even
> after
> > > reading your answer to Akhil's question.
> > >
> >
> > Basically YES and no. To my understanding, the eCPRI message
> format is
> > something like the ICMP packet format. The message (packet) itself
> > will be parsed into different formats based on the type of the
> common
> > header. In the message payload part, there is no distinct definition
> > of the "sub-header". We can divide them into the sub-header and
> data parts based on the specification.
> > E.g. physical channel ID / real-time control ID / Event ID + type are
> > the parts that datapath forwarding will only care about. The
> following
> > timestamp or user data parts are the parts that the higher layer in
> the application will use.
> > 1. If an application wants to create some offload flow, or even
> handle
> > it in the SW, the common header + first several bytes in the payload
> > should be enough. BUT YES, it is not good or safe to use it in the
> higher layer of the application.
> > 2. A higher layer of the application should have its own definition of
> > the whole payload of a specific sub-type, including the parsing of the
> user data part after the "sub-header".
> > It is better for them just skip the first 4 bytes of the eCPRI message or
> a known offset.
> > We do not need to cover the upper layers.
>
> Let me explain what is my vision of how an application would use the
> structures (these are completly dummy examples, as I don't know
> ecpri protocol at all).
>
> Rx:
>
> int ecpri_input(struct rte_mbuf *m)
> {
> struct rte_ecpri_common_hdr hdr_copy, *hdr;
> struct rte_ecpri_msg_event_ind event_copy, *event;
> struct app_specific app_copy, *app;
>
> hdr = rte_pktmbuf_read(m, 0, sizeof(*hdr),
> &hdr_copy);
> if (unlikely(hdr == NULL))
> return -1;
> switch (hdr->type) {
> ...
> case RTE_ECPRI_EVT_IND_NTFY_IND:
> event = rte_pktmbuf_read(m, sizeof(*hdr),
> sizeof(*event),
> &event_copy);
> if (unlikely(event == NULL))
> return -1;
> ...
> app = rte_pktmbuf_read(m, sizeof(*hdr) + sizeof(*event),
> sizeof(*app),
> &app_copy);
> ...
>
> Tx:
>
> int ecpri_output(void)
> {
> struct rte_ecpri_common_hdr *hdr;
> struct rte_ecpri_msg_event_ind *event;
> struct app_specific *app;
>
> m = rte_pktmbuf_alloc(mp);
> if (unlikely(m == NULL))
> return -1;
>
> app = rte_pktmbuf_append(m, sizeof(*data));
> if (app == NULL)
> ...
> app->... = ...;
> ...
> event = rte_pktmbuf_prepend(m, sizeof(*event));
> if (event == NULL)
> ...
> event->... = ...;
> ...
> hdr = rte_pktmbuf_prepend(m, sizeof(*hdr));
> if (hdr == NULL)
> ...
> hdr->... = ...;
>
> return packet_send(m);
> }
>
> In these 2 examples, we never need the unioned structure (struct
> rte_ecpri_msg_hdr).
>
> Using it does not look possible to me, because its size is corresponds to
> the largest message, not to the one we are parsing/building.
>
Yes, in these cases we do not need the unioned structures at all.
Since the common header is always 4 bytes, an application could use the
sub-type structures starting at an offset of 4 into the eCPRI layer, as in your example.
This is the datapath. My original purpose was for some "control path", typically
the flow API (not sure about any other new lib implementation), where the union
could be used without treating the common header and the message payload
header separately and then combining them. In this case, only the first several
bytes will be accessed and checked; there will be no change to the message
itself, which is then passed to the datapath for further handling, as
in your example.
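To illustrate that control-path idea, here is a minimal self-contained sketch: the struct layout mirrors the common header in the patch under discussion, but the check function, the type value macro, and all names are hypothetical, and a little-endian host is assumed for brevity:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the 4-byte eCPRI common header (little-endian
 * host layout only; field names follow the patch under review). */
struct ecpri_common_hdr {
	uint32_t size:16;     /* payload size */
	uint32_t type:8;      /* message type */
	uint32_t c:1;         /* concatenation indicator */
	uint32_t res:3;       /* reserved */
	uint32_t revision:4;  /* protocol revision */
};

#define ECPRI_MSG_TYPE_RTC_CTRL 2 /* real-time control data */

/* Control path: inspect only the common header, leave the payload
 * untouched, then hand the message to the datapath. */
static int
ecpri_ctrl_check(const void *msg, size_t len)
{
	struct ecpri_common_hdr hdr;

	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, msg, sizeof(hdr)); /* avoid alignment assumptions */
	if (hdr.revision != 1)
		return -1;
	/* only a subset of types is offloadable in this sketch */
	return hdr.type == ECPRI_MSG_TYPE_RTC_CTRL ? 0 : -1;
}
```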
> > I think some comments could be added here, is it OK?
> > 3. Regarding this structure, I add it because I do not want to
> > introduce a lot of new items in the rte_flow: new items with
> > structures, new enum types. I prefer one single structure will cover
> most of the cases (subtypes). What do you think?
> > 4. About the *dummy* u32, I calculated all the "subheaders" and
> choose
> > the maximal value of the length. Two purposes (same as the u32 in
> the common header):
> > a. easy to swap the endianness, but not quite useful. Because some
> parts are u16 and u8,
> > and should not be swapped in a u32. (some physical channel ID
> and address LSB have 32bits width)
> > But if some HW parsed the header u32 by u32, then it would be
> helpful.
> > b. easy for checking in flow API, if the user wants to insert a flow.
> Some checking should
> > be done to confirm if it is wildcard flow (all eCPRI messages or
> eCPRI message in some specific type),
> > or some precise flow (to match the pc id or rtc id, for example).
> With these fields, 3 DW
> > of the masks only need to be check before continuing. Or else, the
> code needs to check the type
> > and a lot of switch-case conditions and go through all different
> members of each header.
>
> Thanks for the clarification.
>
> I'll tend to say that if the rte_ecpri_msg_hdr structure is only useful for
> rte_flow, it should be defined inside rte_flow.
>
Right now, yes, it will be used in rte_flow. But I checked the current implementations
of each item: almost all the headers are defined in their own protocol files. So in v6
I changed its name, in order not to confuse the users of this API; would that be OK?
Thanks
> However, I have some fears about the dummy[3]. You said it could be
> enlarged later: I think it is dangerous to change the size of a structure
> that may be used to parse data (and this would be an ABI change).
> Also, it seems dummy[3] cannot be used easily to swap endianness, so
> is it really useful?
>
It might be enlarged, but not for now, not until a new revision of this specification. For
all the sub-types it has right now, the specification will keep them the same as today.
Only new types will be added. So after several years we may consider changing it
in an LTS release. Is it OK?
In most cases the endianness swap could be easy: we swap each DW / u32. To me,
the exception is when some field crosses the u32 boundary, like the address in type #4; we may
treat that separately. And the most useful case is the mask checking: it could simplify the
check to at most 3 (u32 == 0?) comparisons without going through each member of the different types.
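As a sketch of that mask check, something like the following; the struct below is only a hypothetical flattened view mirroring the dummy[3] u32 array mentioned earlier, not the real item layout:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of the payload-header part of an eCPRI
 * flow-item mask, mirroring the dummy[3] u32 overlay discussed above. */
struct ecpri_msg_mask {
	uint32_t dummy[3];
};

/* Wildcard check: at most 3 (u32 == 0?) compares, with no switch-case
 * over the members of each sub-type. */
static int
ecpri_msg_mask_is_wildcard(const struct ecpri_msg_mask *m)
{
	return m->dummy[0] == 0 && m->dummy[1] == 0 && m->dummy[2] == 0;
}
```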
And v6 has already been sent; I changed the code based on your suggestion. Would you please
help review it and give some comments as well?
I appreciate your help and suggestions.
>
> Thanks,
> Olivier
>
>
> > > > +
> > > > +#ifdef __cplusplus
> > > > +}
> > > > +#endif
> > > > +
> > > > +#endif /* _RTE_ECPRI_H_ */
> > > > diff --git a/lib/librte_net/rte_ether.h
> > > > b/lib/librte_net/rte_ether.h index 0ae4e75..184a3f9 100644
> > > > --- a/lib/librte_net/rte_ether.h
> > > > +++ b/lib/librte_net/rte_ether.h
> > > > @@ -304,6 +304,7 @@ struct rte_vlan_hdr { #define
> > > RTE_ETHER_TYPE_LLDP
> > > > 0x88CC /**< LLDP Protocol. */ #define RTE_ETHER_TYPE_MPLS
> > > 0x8847 /**<
> > > > MPLS ethertype. */ #define RTE_ETHER_TYPE_MPLSM 0x8848
> /**<
> > > MPLS
> > > > multicast ethertype. */
> > > > +#define RTE_ETHER_TYPE_ECPRI 0xAEFE /**< eCPRI ethertype
> (.1Q
> > > > +supported). */
> > > >
> > > > /**
> > > > * Extract VLAN tag information into mbuf
> > > > --
> > > > 1.8.3.1
> > > >
> >
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn Bruce Richardson
@ 2020-07-12 14:13 0% ` Xu, Rosen
0 siblings, 0 replies; 200+ results
From: Xu, Rosen @ 2020-07-12 14:13 UTC (permalink / raw)
To: Richardson, Bruce, Nipun Gupta, Hemant Agrawal
Cc: dev, Zhang, Tianfei, Li, Xiaoyun, Wu, Jingjing, Satha Rao,
Mahipal Challa, Jerin Jacob
Hi,
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> -----Original Message-----
> From: Richardson, Bruce <bruce.richardson@intel.com>
> Sent: Thursday, July 09, 2020 23:21
> To: Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>
> Cc: dev@dpdk.org; Xu, Rosen <rosen.xu@intel.com>; Zhang, Tianfei
> <tianfei.zhang@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Satha Rao <skoteshwar@marvell.com>; Mahipal
> Challa <mchalla@marvell.com>; Jerin Jacob <jerinj@marvell.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [PATCH 20.11 3/5] rawdev: add private data length parameter to
> config fn
>
> Currently with the rawdev API there is no way to check that the structure
> passed in via the dev_private pointer in the structure passed to configure API
> is of the correct type - it's just checked that it is non-NULL. Adding in the
> length of the expected structure provides a measure of typechecking, and
> can also be used for ABI compatibility in future, since ABI changes involving
> structs almost always involve a change in size.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> drivers/raw/ifpga/ifpga_rawdev.c | 3 ++-
> drivers/raw/ioat/ioat_rawdev.c | 5 +++--
> drivers/raw/ioat/ioat_rawdev_test.c | 2 +-
> drivers/raw/ntb/ntb.c | 6 +++++-
> drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c | 7 ++++---
> drivers/raw/octeontx2_dma/otx2_dpi_test.c | 3 ++-
> drivers/raw/octeontx2_ep/otx2_ep_rawdev.c | 7 ++++---
> drivers/raw/octeontx2_ep/otx2_ep_test.c | 2 +-
> drivers/raw/skeleton/skeleton_rawdev.c | 5 +++--
> drivers/raw/skeleton/skeleton_rawdev_test.c | 5 +++--
> examples/ioat/ioatfwd.c | 2 +-
> examples/ntb/ntb_fwd.c | 2 +-
> lib/librte_rawdev/rte_rawdev.c | 6 ++++--
> lib/librte_rawdev/rte_rawdev.h | 8 +++++++-
> lib/librte_rawdev/rte_rawdev_pmd.h | 3 ++-
> 15 files changed, 43 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/raw/ifpga/ifpga_rawdev.c
> b/drivers/raw/ifpga/ifpga_rawdev.c
> index 32a2b96c9..a50173264 100644
> --- a/drivers/raw/ifpga/ifpga_rawdev.c
> +++ b/drivers/raw/ifpga/ifpga_rawdev.c
> @@ -684,7 +684,8 @@ ifpga_rawdev_info_get(struct rte_rawdev *dev,
>
> static int
> ifpga_rawdev_configure(const struct rte_rawdev *dev,
> - rte_rawdev_obj_t config)
> + rte_rawdev_obj_t config,
> + size_t config_size __rte_unused)
> {
> IFPGA_RAWDEV_PMD_FUNC_TRACE();
>
> diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c
> index 6a336795d..b29ff983f 100644
> --- a/drivers/raw/ioat/ioat_rawdev.c
> +++ b/drivers/raw/ioat/ioat_rawdev.c
> @@ -41,7 +41,8 @@ RTE_LOG_REGISTER(ioat_pmd_logtype, rawdev.ioat,
> INFO); #define COMPLETION_SZ sizeof(__m128i)
>
> static int
> -ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
> +ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t
> config,
> + size_t config_size)
> {
> struct rte_ioat_rawdev_config *params = config;
> struct rte_ioat_rawdev *ioat = dev->dev_private; @@ -51,7 +52,7
> @@ ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t
> config)
> if (dev->started)
> return -EBUSY;
>
> - if (params == NULL)
> + if (params == NULL || config_size != sizeof(*params))
> return -EINVAL;
>
> if (params->ring_size > 4096 || params->ring_size < 64 || diff --git
> a/drivers/raw/ioat/ioat_rawdev_test.c
> b/drivers/raw/ioat/ioat_rawdev_test.c
> index 90f5974cd..e5b50ae9f 100644
> --- a/drivers/raw/ioat/ioat_rawdev_test.c
> +++ b/drivers/raw/ioat/ioat_rawdev_test.c
> @@ -165,7 +165,7 @@ ioat_rawdev_test(uint16_t dev_id)
> }
>
> p.ring_size = IOAT_TEST_RINGSIZE;
> - if (rte_rawdev_configure(dev_id, &info) != 0) {
> + if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
> printf("Error with rte_rawdev_configure()\n");
> return -1;
> }
> diff --git a/drivers/raw/ntb/ntb.c b/drivers/raw/ntb/ntb.c index
> eaeb67b74..c181094d5 100644
> --- a/drivers/raw/ntb/ntb.c
> +++ b/drivers/raw/ntb/ntb.c
> @@ -837,13 +837,17 @@ ntb_dev_info_get(struct rte_rawdev *dev,
> rte_rawdev_obj_t dev_info, }
>
> static int
> -ntb_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
> +ntb_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t
> config,
> + size_t config_size)
> {
> struct ntb_dev_config *conf = config;
> struct ntb_hw *hw = dev->dev_private;
> uint32_t xstats_num;
> int ret;
>
> + if (conf == NULL || config_size != sizeof(*conf))
> + return -EINVAL;
> +
> hw->queue_pairs = conf->num_queues;
> hw->queue_size = conf->queue_size;
> hw->used_mw_num = conf->mz_num;
> diff --git a/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
> b/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
> index e398abb75..5b496446c 100644
> --- a/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
> +++ b/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
> @@ -294,7 +294,8 @@ otx2_dpi_rawdev_reset(struct rte_rawdev *dev) }
>
> static int
> -otx2_dpi_rawdev_configure(const struct rte_rawdev *dev,
> rte_rawdev_obj_t config)
> +otx2_dpi_rawdev_configure(const struct rte_rawdev *dev,
> rte_rawdev_obj_t config,
> + size_t config_size)
> {
> struct dpi_rawdev_conf_s *conf = config;
> struct dpi_vf_s *dpivf = NULL;
> @@ -302,8 +303,8 @@ otx2_dpi_rawdev_configure(const struct rte_rawdev
> *dev, rte_rawdev_obj_t config)
> uintptr_t pool;
> uint32_t gaura;
>
> - if (conf == NULL) {
> - otx2_dpi_dbg("NULL configuration");
> + if (conf == NULL || config_size != sizeof(*conf)) {
> + otx2_dpi_dbg("NULL or invalid configuration");
> return -EINVAL;
> }
> dpivf = (struct dpi_vf_s *)dev->dev_private; diff --git
> a/drivers/raw/octeontx2_dma/otx2_dpi_test.c
> b/drivers/raw/octeontx2_dma/otx2_dpi_test.c
> index 276658af0..cec6ca91b 100644
> --- a/drivers/raw/octeontx2_dma/otx2_dpi_test.c
> +++ b/drivers/raw/octeontx2_dma/otx2_dpi_test.c
> @@ -182,7 +182,8 @@ test_otx2_dma_rawdev(uint16_t val)
> /* Configure rawdev ports */
> conf.chunk_pool = dpi_create_mempool();
> rdev_info.dev_private = &conf;
> - ret = rte_rawdev_configure(i, (rte_rawdev_obj_t)&rdev_info);
> + ret = rte_rawdev_configure(i, (rte_rawdev_obj_t)&rdev_info,
> + sizeof(conf));
> if (ret) {
> otx2_dpi_dbg("Unable to configure DPIVF %d", i);
> return -ENODEV;
> diff --git a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
> b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
> index 0778603d5..2b78a7941 100644
> --- a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
> +++ b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
> @@ -224,13 +224,14 @@ sdp_rawdev_close(struct rte_rawdev *dev) }
>
> static int
> -sdp_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t
> config)
> +sdp_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t
> config,
> + size_t config_size)
> {
> struct sdp_rawdev_info *app_info = (struct sdp_rawdev_info
> *)config;
> struct sdp_device *sdpvf;
>
> - if (app_info == NULL) {
> - otx2_err("Application config info [NULL]");
> + if (app_info == NULL || config_size != sizeof(*app_info)) {
> + otx2_err("Application config info [NULL] or incorrect size");
> return -EINVAL;
> }
>
> diff --git a/drivers/raw/octeontx2_ep/otx2_ep_test.c
> b/drivers/raw/octeontx2_ep/otx2_ep_test.c
> index 091f1827c..b876275f7 100644
> --- a/drivers/raw/octeontx2_ep/otx2_ep_test.c
> +++ b/drivers/raw/octeontx2_ep/otx2_ep_test.c
> @@ -108,7 +108,7 @@ sdp_rawdev_selftest(uint16_t dev_id)
>
> dev_info.dev_private = &app_info;
>
> - ret = rte_rawdev_configure(dev_id, &dev_info);
> + ret = rte_rawdev_configure(dev_id, &dev_info, sizeof(app_info));
> if (ret) {
> otx2_err("Unable to configure SDP_VF %d", dev_id);
> rte_mempool_free(ioq_mpool);
> diff --git a/drivers/raw/skeleton/skeleton_rawdev.c
> b/drivers/raw/skeleton/skeleton_rawdev.c
> index dce300c35..531d0450c 100644
> --- a/drivers/raw/skeleton/skeleton_rawdev.c
> +++ b/drivers/raw/skeleton/skeleton_rawdev.c
> @@ -68,7 +68,8 @@ static int skeleton_rawdev_info_get(struct rte_rawdev
> *dev, }
>
> static int skeleton_rawdev_configure(const struct rte_rawdev *dev,
> - rte_rawdev_obj_t config)
> + rte_rawdev_obj_t config,
> + size_t config_size)
> {
> struct skeleton_rawdev *skeldev;
> struct skeleton_rawdev_conf *skeldev_conf; @@ -77,7 +78,7 @@
> static int skeleton_rawdev_configure(const struct rte_rawdev *dev,
>
> RTE_FUNC_PTR_OR_ERR_RET(dev, -EINVAL);
>
> - if (!config) {
> + if (config == NULL || config_size != sizeof(*skeldev_conf)) {
> SKELETON_PMD_ERR("Invalid configuration");
> return -EINVAL;
> }
> diff --git a/drivers/raw/skeleton/skeleton_rawdev_test.c
> b/drivers/raw/skeleton/skeleton_rawdev_test.c
> index 9b8390dfb..7dc7c7684 100644
> --- a/drivers/raw/skeleton/skeleton_rawdev_test.c
> +++ b/drivers/raw/skeleton/skeleton_rawdev_test.c
> @@ -126,7 +126,7 @@ test_rawdev_configure(void)
> struct skeleton_rawdev_conf rdev_conf_get = {0};
>
> /* Check invalid configuration */
> - ret = rte_rawdev_configure(test_dev_id, NULL);
> + ret = rte_rawdev_configure(test_dev_id, NULL, 0);
> RTE_TEST_ASSERT(ret == -EINVAL,
> "Null configure; Expected -EINVAL, got %d", ret);
>
> @@ -137,7 +137,8 @@ test_rawdev_configure(void)
>
> rdev_info.dev_private = &rdev_conf_set;
> ret = rte_rawdev_configure(test_dev_id,
> - (rte_rawdev_obj_t)&rdev_info);
> + (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_set));
> RTE_TEST_ASSERT_SUCCESS(ret, "Failed to configure rawdev (%d)",
> ret);
>
> rdev_info.dev_private = &rdev_conf_get; diff --git
> a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c index
> 5c631da1b..8e9513e44 100644
> --- a/examples/ioat/ioatfwd.c
> +++ b/examples/ioat/ioatfwd.c
> @@ -734,7 +734,7 @@ configure_rawdev_queue(uint32_t dev_id)
> struct rte_ioat_rawdev_config dev_config = { .ring_size = ring_size };
> struct rte_rawdev_info info = { .dev_private = &dev_config };
>
> - if (rte_rawdev_configure(dev_id, &info) != 0) {
> + if (rte_rawdev_configure(dev_id, &info, sizeof(dev_config)) != 0) {
> rte_exit(EXIT_FAILURE,
> "Error with rte_rawdev_configure()\n");
> }
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c index
> 11e224451..656f73659 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -1401,7 +1401,7 @@ main(int argc, char **argv)
> ntb_conf.num_queues = num_queues;
> ntb_conf.queue_size = nb_desc;
> ntb_rawdev_conf.dev_private = (rte_rawdev_obj_t)(&ntb_conf);
> - ret = rte_rawdev_configure(dev_id, &ntb_rawdev_conf);
> + ret = rte_rawdev_configure(dev_id, &ntb_rawdev_conf,
> +sizeof(ntb_conf));
> if (ret)
> rte_exit(EXIT_FAILURE, "Can't config ntb dev: err=%d, "
> "port=%u\n", ret, dev_id);
> diff --git a/lib/librte_rawdev/rte_rawdev.c b/lib/librte_rawdev/rte_rawdev.c
> index bde33763e..6c4d783cc 100644
> --- a/lib/librte_rawdev/rte_rawdev.c
> +++ b/lib/librte_rawdev/rte_rawdev.c
> @@ -104,7 +104,8 @@ rte_rawdev_info_get(uint16_t dev_id, struct
> rte_rawdev_info *dev_info, }
>
> int
> -rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf)
> +rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
> + size_t dev_private_size)
> {
> struct rte_rawdev *dev;
> int diag;
> @@ -123,7 +124,8 @@ rte_rawdev_configure(uint16_t dev_id, struct
> rte_rawdev_info *dev_conf)
> }
>
> /* Configure the device */
> - diag = (*dev->dev_ops->dev_configure)(dev, dev_conf-
> >dev_private);
> + diag = (*dev->dev_ops->dev_configure)(dev, dev_conf-
> >dev_private,
> + dev_private_size);
> if (diag != 0)
> RTE_RDEV_ERR("dev%d dev_configure = %d", dev_id, diag);
> else
> diff --git a/lib/librte_rawdev/rte_rawdev.h
> b/lib/librte_rawdev/rte_rawdev.h index cf6acfd26..73e3bd5ae 100644
> --- a/lib/librte_rawdev/rte_rawdev.h
> +++ b/lib/librte_rawdev/rte_rawdev.h
> @@ -116,13 +116,19 @@ rte_rawdev_info_get(uint16_t dev_id, struct
> rte_rawdev_info *dev_info,
> * driver/implementation can use to configure the device. It is also assumed
> * that once the configuration is done, a `queue_id` type field can be used
> * to refer to some arbitrary internal representation of a queue.
> + * @dev_private_size
> + * The length of the memory space pointed to by dev_private in dev_info.
> + * This should be set to the size of the expected private structure to be
> + * used by the driver, and may be checked by drivers to ensure the
> expected
> + * struct type is provided.
> *
> * @return
> * - 0: Success, device configured.
> * - <0: Error code returned by the driver configuration function.
> */
> int
> -rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf);
> +rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
> + size_t dev_private_size);
>
>
> /**
> diff --git a/lib/librte_rawdev/rte_rawdev_pmd.h
> b/lib/librte_rawdev/rte_rawdev_pmd.h
> index 89e46412a..050f8b029 100644
> --- a/lib/librte_rawdev/rte_rawdev_pmd.h
> +++ b/lib/librte_rawdev/rte_rawdev_pmd.h
> @@ -160,7 +160,8 @@ typedef int (*rawdev_info_get_t)(struct rte_rawdev
> *dev,
> * Returns 0 on success
> */
> typedef int (*rawdev_configure_t)(const struct rte_rawdev *dev,
> - rte_rawdev_obj_t config);
> + rte_rawdev_obj_t config,
> + size_t config_size);
>
> /**
> * Start a configured device.
> --
> 2.25.1
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn Bruce Richardson
@ 2020-07-12 14:13 0% ` Xu, Rosen
0 siblings, 0 replies; 200+ results
From: Xu, Rosen @ 2020-07-12 14:13 UTC (permalink / raw)
To: Richardson, Bruce, Nipun Gupta, Hemant Agrawal
Cc: dev, Zhang, Tianfei, Li, Xiaoyun, Wu, Jingjing, Satha Rao,
Mahipal Challa, Jerin Jacob
Hi,
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> -----Original Message-----
> From: Richardson, Bruce <bruce.richardson@intel.com>
> Sent: Thursday, July 09, 2020 23:21
> To: Nipun Gupta <nipun.gupta@nxp.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>
> Cc: dev@dpdk.org; Xu, Rosen <rosen.xu@intel.com>; Zhang, Tianfei
> <tianfei.zhang@intel.com>; Li, Xiaoyun <xiaoyun.li@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Satha Rao <skoteshwar@marvell.com>; Mahipal
> Challa <mchalla@marvell.com>; Jerin Jacob <jerinj@marvell.com>;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [PATCH 20.11 1/5] rawdev: add private data length parameter to info
> fn
>
> Currently with the rawdev API there is no way to check that the structure
> passed in via the dev_private pointer in the dev_info structure is of the
> correct type - it's just checked that it is non-NULL. Adding in the length of the
> expected structure provides a measure of typechecking, and can also be
> used for ABI compatibility in future, since ABI changes involving structs
> almost always involve a change in size.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> ---
> drivers/bus/ifpga/ifpga_bus.c | 2 +-
> drivers/raw/ifpga/ifpga_rawdev.c | 5 +++--
> drivers/raw/ioat/ioat_rawdev.c | 5 +++--
> drivers/raw/ioat/ioat_rawdev_test.c | 4 ++--
> drivers/raw/ntb/ntb.c | 8 +++++++-
> drivers/raw/skeleton/skeleton_rawdev.c | 5 +++--
> drivers/raw/skeleton/skeleton_rawdev_test.c | 19 ++++++++++++-------
> examples/ioat/ioatfwd.c | 2 +-
> examples/ntb/ntb_fwd.c | 2 +-
> lib/librte_rawdev/rte_rawdev.c | 6 ++++--
> lib/librte_rawdev/rte_rawdev.h | 9 ++++++++-
> lib/librte_rawdev/rte_rawdev_pmd.h | 5 ++++-
> 12 files changed, 49 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
> index 6b16a20bb..bb8b3dcfb 100644
> --- a/drivers/bus/ifpga/ifpga_bus.c
> +++ b/drivers/bus/ifpga/ifpga_bus.c
> @@ -162,7 +162,7 @@ ifpga_scan_one(struct rte_rawdev *rawdev,
> afu_dev->id.port = afu_pr_conf.afu_id.port;
>
> if (rawdev->dev_ops && rawdev->dev_ops->dev_info_get)
> - rawdev->dev_ops->dev_info_get(rawdev, afu_dev);
> + rawdev->dev_ops->dev_info_get(rawdev, afu_dev,
> sizeof(*afu_dev));
>
> if (rawdev->dev_ops &&
> rawdev->dev_ops->dev_start &&
> diff --git a/drivers/raw/ifpga/ifpga_rawdev.c
> b/drivers/raw/ifpga/ifpga_rawdev.c
> index cc25c662b..47cfa3877 100644
> --- a/drivers/raw/ifpga/ifpga_rawdev.c
> +++ b/drivers/raw/ifpga/ifpga_rawdev.c
> @@ -605,7 +605,8 @@ ifpga_fill_afu_dev(struct opae_accelerator *acc,
>
> static void
> ifpga_rawdev_info_get(struct rte_rawdev *dev,
> - rte_rawdev_obj_t dev_info)
> + rte_rawdev_obj_t dev_info,
> + size_t dev_info_size)
> {
> struct opae_adapter *adapter;
> struct opae_accelerator *acc;
> @@ -617,7 +618,7 @@ ifpga_rawdev_info_get(struct rte_rawdev *dev,
>
> IFPGA_RAWDEV_PMD_FUNC_TRACE();
>
> - if (!dev_info) {
> + if (!dev_info || dev_info_size != sizeof(*afu_dev)) {
> IFPGA_RAWDEV_PMD_ERR("Invalid request");
> return;
> }
> diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c
> index f876ffc3f..8dd856c55 100644
> --- a/drivers/raw/ioat/ioat_rawdev.c
> +++ b/drivers/raw/ioat/ioat_rawdev.c
> @@ -113,12 +113,13 @@ ioat_dev_stop(struct rte_rawdev *dev) }
>
> static void
> -ioat_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info)
> +ioat_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
> + size_t dev_info_size)
> {
> struct rte_ioat_rawdev_config *cfg = dev_info;
> struct rte_ioat_rawdev *ioat = dev->dev_private;
>
> - if (cfg != NULL)
> + if (cfg != NULL && dev_info_size == sizeof(*cfg))
> cfg->ring_size = ioat->ring_size;
> }
>
> diff --git a/drivers/raw/ioat/ioat_rawdev_test.c
> b/drivers/raw/ioat/ioat_rawdev_test.c
> index d99f1bd6b..90f5974cd 100644
> --- a/drivers/raw/ioat/ioat_rawdev_test.c
> +++ b/drivers/raw/ioat/ioat_rawdev_test.c
> @@ -157,7 +157,7 @@ ioat_rawdev_test(uint16_t dev_id)
> return TEST_SKIPPED;
> }
>
> - rte_rawdev_info_get(dev_id, &info);
> + rte_rawdev_info_get(dev_id, &info, sizeof(p));
> if (p.ring_size != expected_ring_size[dev_id]) {
> printf("Error, initial ring size is not as expected (Actual: %d,
> Expected: %d)\n",
> (int)p.ring_size, expected_ring_size[dev_id]);
> @@ -169,7 +169,7 @@ ioat_rawdev_test(uint16_t dev_id)
> printf("Error with rte_rawdev_configure()\n");
> return -1;
> }
> - rte_rawdev_info_get(dev_id, &info);
> + rte_rawdev_info_get(dev_id, &info, sizeof(p));
> if (p.ring_size != IOAT_TEST_RINGSIZE) {
> printf("Error, ring size is not %d (%d)\n",
> IOAT_TEST_RINGSIZE, (int)p.ring_size); diff --
> git a/drivers/raw/ntb/ntb.c b/drivers/raw/ntb/ntb.c index
> e40412bb7..4676c6f8f 100644
> --- a/drivers/raw/ntb/ntb.c
> +++ b/drivers/raw/ntb/ntb.c
> @@ -801,11 +801,17 @@ ntb_dequeue_bufs(struct rte_rawdev *dev, }
>
> static void
> -ntb_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info)
> +ntb_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
> + size_t dev_info_size)
> {
> struct ntb_hw *hw = dev->dev_private;
> struct ntb_dev_info *info = dev_info;
>
> + if (dev_info_size != sizeof(*info)){
> + NTB_LOG(ERR, "Invalid size parameter to %s", __func__);
> + return;
> + }
> +
> info->mw_cnt = hw->mw_cnt;
> info->mw_size = hw->mw_size;
>
> diff --git a/drivers/raw/skeleton/skeleton_rawdev.c
> b/drivers/raw/skeleton/skeleton_rawdev.c
> index 72ece887a..dc05f3ecf 100644
> --- a/drivers/raw/skeleton/skeleton_rawdev.c
> +++ b/drivers/raw/skeleton/skeleton_rawdev.c
> @@ -42,14 +42,15 @@ static struct queue_buffers
> queue_buf[SKELETON_MAX_QUEUES] = {}; static void clear_queue_bufs(int
> queue_id);
>
> static void skeleton_rawdev_info_get(struct rte_rawdev *dev,
> - rte_rawdev_obj_t dev_info)
> + rte_rawdev_obj_t dev_info,
> + size_t dev_info_size)
> {
> struct skeleton_rawdev *skeldev;
> struct skeleton_rawdev_conf *skeldev_conf;
>
> SKELETON_PMD_FUNC_TRACE();
>
> - if (!dev_info) {
> + if (!dev_info || dev_info_size != sizeof(*skeldev_conf)) {
> SKELETON_PMD_ERR("Invalid request");
> return;
> }
> diff --git a/drivers/raw/skeleton/skeleton_rawdev_test.c
> b/drivers/raw/skeleton/skeleton_rawdev_test.c
> index 9ecfdee81..9b8390dfb 100644
> --- a/drivers/raw/skeleton/skeleton_rawdev_test.c
> +++ b/drivers/raw/skeleton/skeleton_rawdev_test.c
> @@ -106,12 +106,12 @@ test_rawdev_info_get(void)
> struct rte_rawdev_info rdev_info = {0};
> struct skeleton_rawdev_conf skel_conf = {0};
>
> - ret = rte_rawdev_info_get(test_dev_id, NULL);
> + ret = rte_rawdev_info_get(test_dev_id, NULL, 0);
> RTE_TEST_ASSERT(ret == -EINVAL, "Expected -EINVAL, %d", ret);
>
> rdev_info.dev_private = &skel_conf;
>
> - ret = rte_rawdev_info_get(test_dev_id, &rdev_info);
> + ret = rte_rawdev_info_get(test_dev_id, &rdev_info,
> sizeof(skel_conf));
> RTE_TEST_ASSERT_SUCCESS(ret, "Failed to get raw dev info");
>
> return TEST_SUCCESS;
> @@ -142,7 +142,8 @@ test_rawdev_configure(void)
>
> rdev_info.dev_private = &rdev_conf_get;
> ret = rte_rawdev_info_get(test_dev_id,
> - (rte_rawdev_obj_t)&rdev_info);
> + (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_get));
> RTE_TEST_ASSERT_SUCCESS(ret,
> "Failed to obtain rawdev configuration (%d)",
> ret);
> @@ -170,7 +171,8 @@ test_rawdev_queue_default_conf_get(void)
> /* Get the current configuration */
> rdev_info.dev_private = &rdev_conf_get;
> ret = rte_rawdev_info_get(test_dev_id,
> - (rte_rawdev_obj_t)&rdev_info);
> + (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_get));
> RTE_TEST_ASSERT_SUCCESS(ret, "Failed to obtain rawdev
> configuration (%d)",
> ret);
>
> @@ -218,7 +220,8 @@ test_rawdev_queue_setup(void)
> /* Get the current configuration */
> rdev_info.dev_private = &rdev_conf_get;
> ret = rte_rawdev_info_get(test_dev_id,
> - (rte_rawdev_obj_t)&rdev_info);
> + (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_get));
> RTE_TEST_ASSERT_SUCCESS(ret,
> "Failed to obtain rawdev configuration (%d)",
> ret);
> @@ -327,7 +330,8 @@ test_rawdev_start_stop(void)
> dummy_firmware = NULL;
>
> rte_rawdev_start(test_dev_id);
> - ret = rte_rawdev_info_get(test_dev_id,
> (rte_rawdev_obj_t)&rdev_info);
> + ret = rte_rawdev_info_get(test_dev_id,
> (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_get));
> RTE_TEST_ASSERT_SUCCESS(ret,
> "Failed to obtain rawdev configuration (%d)",
> ret);
> @@ -336,7 +340,8 @@ test_rawdev_start_stop(void)
> rdev_conf_get.device_state);
>
> rte_rawdev_stop(test_dev_id);
> - ret = rte_rawdev_info_get(test_dev_id,
> (rte_rawdev_obj_t)&rdev_info);
> + ret = rte_rawdev_info_get(test_dev_id,
> (rte_rawdev_obj_t)&rdev_info,
> + sizeof(rdev_conf_get));
> RTE_TEST_ASSERT_SUCCESS(ret,
> "Failed to obtain rawdev configuration (%d)",
> ret);
> diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c index
> b66ee73bc..5c631da1b 100644
> --- a/examples/ioat/ioatfwd.c
> +++ b/examples/ioat/ioatfwd.c
> @@ -757,7 +757,7 @@ assign_rawdevs(void)
> do {
> if (rdev_id == rte_rawdev_count())
> goto end;
> - rte_rawdev_info_get(rdev_id++,
> &rdev_info);
> + rte_rawdev_info_get(rdev_id++, &rdev_info,
> 0);
> } while (rdev_info.driver_name == NULL ||
> strcmp(rdev_info.driver_name,
>
> IOAT_PMD_RAWDEV_NAME_STR) != 0);
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c index
> eba8ebf9f..11e224451 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -1389,7 +1389,7 @@ main(int argc, char **argv)
> rte_rawdev_set_attr(dev_id, NTB_QUEUE_NUM_NAME,
> num_queues);
> printf("Set queue number as %u.\n", num_queues);
> ntb_rawdev_info.dev_private = (rte_rawdev_obj_t)(&ntb_info);
> - rte_rawdev_info_get(dev_id, &ntb_rawdev_info);
> + rte_rawdev_info_get(dev_id, &ntb_rawdev_info, sizeof(ntb_info));
>
> nb_mbuf = nb_desc * num_queues * 2 * 2 + rte_lcore_count() *
> MEMPOOL_CACHE_SIZE;
> diff --git a/lib/librte_rawdev/rte_rawdev.c b/lib/librte_rawdev/rte_rawdev.c
> index 8f84d0b22..a57689035 100644
> --- a/lib/librte_rawdev/rte_rawdev.c
> +++ b/lib/librte_rawdev/rte_rawdev.c
> @@ -78,7 +78,8 @@ rte_rawdev_socket_id(uint16_t dev_id) }
>
> int
> -rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info)
> +rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
> + size_t dev_private_size)
> {
> struct rte_rawdev *rawdev;
>
> @@ -89,7 +90,8 @@ rte_rawdev_info_get(uint16_t dev_id, struct
> rte_rawdev_info *dev_info)
>
> if (dev_info->dev_private != NULL) {
> RTE_FUNC_PTR_OR_ERR_RET(*rawdev->dev_ops-
> >dev_info_get, -ENOTSUP);
> - (*rawdev->dev_ops->dev_info_get)(rawdev, dev_info-
> >dev_private);
> + (*rawdev->dev_ops->dev_info_get)(rawdev, dev_info-
> >dev_private,
> + dev_private_size);
> }
>
> dev_info->driver_name = rawdev->driver_name; diff --git
> a/lib/librte_rawdev/rte_rawdev.h b/lib/librte_rawdev/rte_rawdev.h index
> 32f6b8bb0..cf6acfd26 100644
> --- a/lib/librte_rawdev/rte_rawdev.h
> +++ b/lib/librte_rawdev/rte_rawdev.h
> @@ -82,13 +82,20 @@ struct rte_rawdev_info;
> * will be returned. This can be used to safely query the type of a rawdev
> * instance without needing to know the size of the private data to return.
> *
> + * @param dev_private_size
> + * The length of the memory space pointed to by dev_private in dev_info.
> + * This should be set to the size of the expected private structure to be
> + * returned, and may be checked by drivers to ensure the expected struct
> + * type is provided.
> + *
> * @return
> * - 0: Success, driver updates the contextual information of the raw device
> * - <0: Error code returned by the driver info get function.
> *
> */
> int
> -rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info);
> +rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
> + size_t dev_private_size);
>
> /**
> * Configure a raw device.
> diff --git a/lib/librte_rawdev/rte_rawdev_pmd.h
> b/lib/librte_rawdev/rte_rawdev_pmd.h
> index 4395a2182..0e72a9205 100644
> --- a/lib/librte_rawdev/rte_rawdev_pmd.h
> +++ b/lib/librte_rawdev/rte_rawdev_pmd.h
> @@ -138,12 +138,15 @@ rte_rawdev_pmd_is_valid_dev(uint8_t dev_id)
> * Raw device pointer
> * @param dev_info
> * Raw device information structure
> + * @param dev_private_size
> + * The size of the structure pointed to by dev_info->dev_private
> *
> * @return
> * Returns 0 on success
> */
> typedef void (*rawdev_info_get_t)(struct rte_rawdev *dev,
> - rte_rawdev_obj_t dev_info);
> + rte_rawdev_obj_t dev_info,
> + size_t dev_private_size);
>
> /**
> * Configure a device.
> --
> 2.25.1
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API
@ 2020-07-12 13:17 3% ` Olivier Matz
2020-07-12 14:28 0% ` Bing Zhao
0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-12 13:17 UTC (permalink / raw)
To: Bing Zhao
Cc: Ori Kam, john.mcnamara, marko.kovacevic, Thomas Monjalon,
ferruh.yigit, arybchenko, akhil.goyal, dev, wenzhuo.lu,
beilei.xing, bernard.iremonger
Hi Bing,
On Sat, Jul 11, 2020 at 04:25:49AM +0000, Bing Zhao wrote:
> Hi Olivier,
> Many thanks for your comments.
[...]
> > > +/**
> > > + * eCPRI Common Header
> > > + */
> > > +RTE_STD_C11
> > > +struct rte_ecpri_common_hdr {
> > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > + uint32_t size:16; /**< Payload Size */
> > > + uint32_t type:8; /**< Message Type */
> > > + uint32_t c:1; /**< Concatenation Indicator
> > */
> > > + uint32_t res:3; /**< Reserved */
> > > + uint32_t revision:4; /**< Protocol Revision */
> > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > + uint32_t revision:4; /**< Protocol Revision */
> > > + uint32_t res:3; /**< Reserved */
> > > + uint32_t c:1; /**< Concatenation Indicator
> > */
> > > + uint32_t type:8; /**< Message Type */
> > > + uint32_t size:16; /**< Payload Size */
> > > +#endif
> > > +} __rte_packed;
> >
> > Does it really need to be packed? Why next types do not need it?
> > It looks only those which have bitfields are.
> >
>
> Nice catch, thanks. For the common header, there is no need to use
> the packed attribute, because it is a u32 and the bitfields will be
> aligned.
> I checked all the definitions again. Only " Type #4: Remote Memory Access"
> needs to use the packed attribute.
> For other sub-types, "sub-header" part of the message payload will get
> aligned by nature. For example, u16 after u16, u8 after u16, these should
> be OK.
> But in type #4, the address is 48bits wide, with 16bits MSB and 32bits LSB (no
> detailed description in the specification, correct me if anything wrong.) Usually,
> the 48bits address will be divided as this in a system. And there is no 48-bits
> type at all. So we need to define two parts for it: 32b LSB follows 16b MSB.
> u32 after u16 should be with packed attribute. Thanks
What about using a bitfield into a uint64_t ? I mean:
struct rte_ecpri_msg_rm_access {
#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
...
uint64_t length:16; /**< number of bytes */
uint64_t addr:48; /**< address */
#else
...
uint64_t addr:48; /**< address */
uint64_t length:16; /**< number of bytes */
#endif
};
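For illustration, here is a self-contained sketch contrasting the two ways of carrying the 48-bit address — the names rm_access_sketch and addr_from_halves are hypothetical, not the proposed rte_ecpri definitions, and the bitfield variant is shown for one byte order only:

```c
#include <stdint.h>

/* Variant A: explicit 16-bit MSB + 32-bit LSB halves, combined into a
 * 64-bit host value. */
uint64_t addr_from_halves(uint16_t addr_m, uint32_t addr_l)
{
	return ((uint64_t)addr_m << 32) | addr_l;
}

/* Variant B: a 48-bit bitfield inside a uint64_t, as suggested above.
 * The whole struct stays 8 bytes and needs no __rte_packed. (Host
 * field order only here; a real header would guard both byte orders.) */
struct rm_access_sketch {
	uint64_t length:16; /* number of bytes */
	uint64_t addr:48;   /* element address */
};
```

The bitfield variant avoids the unaligned u32-after-u16 member that forced the packed attribute in the first place.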
>
> >
> > I wonder if the 'dw0' could be in this definition instead of in struct
> > rte_ecpri_msg_hdr?
> >
> > Something like this:
> >
> > struct rte_ecpri_common_hdr {
> > union {
> > uint32_t u32;
> > struct {
> > ...
> > };
> > };
> > };
> >
> > I see 2 advantages:
> >
> > - someone that only uses the common_hdr struct can use the .u32
> > in its software
> > - when using it in messages, it looks clearer to me:
> > msg.common_hdr.u32 = value;
> > instead of:
> > msg.dw0 = value;
> >
> > What do you think?
>
> Thanks for the suggestion, this is much better, I will change it.
> Indeed, in my original version, no DW(u32) is defined for the header.
> After that, I noticed that it would not be easy to statically cast the
> bitfield struct to a u32 (the compiler will complain), and it would not be clear how to
> swap the endian if the user wants to use this header. I added this DW(u32)
> to simplify the usage of this header. But yes, if I do not add it here, it would
> be not easy or clear for users who just use this header structure.
> I will change it. Is it OK if I use the name "dw0"?
In my opinion, u32 is more usual than dw0.
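A self-contained sketch of the suggested union — common_hdr_sketch is an illustrative name, with little-endian field order only; the real definition would guard both byte orders as in the patch:

```c
#include <stdint.h>

/* The whole common header is accessible either field by field or as
 * one 32-bit word (h.u32), which makes whole-word assignment and
 * endianness handling explicit at the call site. */
struct common_hdr_sketch {
	union {
		uint32_t u32;
		struct {
			uint32_t size:16;    /* payload size */
			uint32_t type:8;     /* message type */
			uint32_t c:1;        /* concatenation indicator */
			uint32_t res:3;      /* reserved */
			uint32_t revision:4; /* protocol revision */
		};
	};
};
```

With this layout, `hdr.u32 = value;` replaces the less obvious `msg.dw0 = value;` while field access stays unchanged.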
>
> >
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #0: IQ Data */ struct
> > > +rte_ecpri_msg_iq_data {
> > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > + rte_be16_t seq_id; /**< Sequence ID */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #1: Bit Sequence */ struct
> > > +rte_ecpri_msg_bit_seq {
> > > + rte_be16_t pc_id; /**< Physical channel ID */
> > > + rte_be16_t seq_id; /**< Sequence ID */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #2: Real-Time Control Data */
> > struct
> > > +rte_ecpri_msg_rtc_ctrl {
> > > + rte_be16_t rtc_id; /**< Real-Time Control Data ID
> > */
> > > + rte_be16_t seq_id; /**< Sequence ID */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #3: Generic Data Transfer */
> > struct
> > > +rte_ecpri_msg_gen_data {
> > > + rte_be32_t pc_id; /**< Physical channel ID */
> > > + rte_be32_t seq_id; /**< Sequence ID */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #4: Remote Memory Access */
> > > +RTE_STD_C11
> > > +struct rte_ecpri_msg_rm_access {
> > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > + uint32_t ele_id:16; /**< Element ID */
> > > + uint32_t rr:4; /**< Req/Resp */
> > > + uint32_t rw:4; /**< Read/Write */
> > > + uint32_t rma_id:8; /**< Remote Memory Access
> > ID */
> > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > + uint32_t rma_id:8; /**< Remote Memory Access
> > ID */
> > > + uint32_t rw:4; /**< Read/Write */
> > > + uint32_t rr:4; /**< Req/Resp */
> > > + uint32_t ele_id:16; /**< Element ID */
> > > +#endif
> > > + rte_be16_t addr_m; /**< 48-bits address (16 MSB)
> > */
> > > + rte_be32_t addr_l; /**< 48-bits address (32 LSB) */
> > > + rte_be16_t length; /**< number of bytes */
> > > +} __rte_packed;
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #5: One-Way Delay Measurement
> > */
> > > +struct rte_ecpri_msg_delay_measure {
> > > + uint8_t msr_id; /**< Measurement ID */
> > > + uint8_t act_type; /**< Action Type */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #6: Remote Reset */ struct
> > > +rte_ecpri_msg_remote_reset {
> > > + rte_be16_t rst_id; /**< Reset ID */
> > > + uint8_t rst_op; /**< Reset Code Op */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header of Type #7: Event Indication */ struct
> > > +rte_ecpri_msg_event_ind {
> > > + uint8_t evt_id; /**< Event ID */
> > > + uint8_t evt_type; /**< Event Type */
> > > + uint8_t seq; /**< Sequence Number */
> > > + uint8_t number; /**< Number of
> > Faults/Notif */
> > > +};
> > > +
> > > +/**
> > > + * eCPRI Message Header Format: Common Header + Message
> > Types */
> > > +RTE_STD_C11
> > > +struct rte_ecpri_msg_hdr {
> > > + union {
> > > + struct rte_ecpri_common_hdr common;
> > > + uint32_t dw0;
> > > + };
> > > + union {
> > > + struct rte_ecpri_msg_iq_data type0;
> > > + struct rte_ecpri_msg_bit_seq type1;
> > > + struct rte_ecpri_msg_rtc_ctrl type2;
> > > + struct rte_ecpri_msg_bit_seq type3;
> > > + struct rte_ecpri_msg_rm_access type4;
> > > + struct rte_ecpri_msg_delay_measure type5;
> > > + struct rte_ecpri_msg_remote_reset type6;
> > > + struct rte_ecpri_msg_event_ind type7;
> > > + uint32_t dummy[3];
> > > + };
> > > +};
> >
> > What is the point in having this struct?
> >
> > From a software point of view, I think it is a bit risky, because its size is
> > the size of the largest message. This is probably what you want in your
> > case, but when a software will rx or tx such packet, I think they
> > shouldn't use this one. My understanding is that you only need this
> > structure for the mask in rte_flow.
> >
> > Also, I'm not sure to understand the purpose of dummy[3], even after
> > reading your answer to Akhil's question.
> >
>
> Basically YES and no. To my understanding, the eCPRI message format is something
> like the ICMP packet format. The message (packet) itself will be parsed into
> different formats based on the type of the common header. In the message
> payload part, there is no distinct definition of the "sub-header". We can divide
> them into the sub-header and data parts based on the specification.
> E.g. physical channel ID / real-time control ID / Event ID + type are the parts
> that datapath forwarding will only care about. The following timestamp or user data
> parts are the parts that the higher layer in the application will use.
> 1. If an application wants to create some offload flow, or even handle it in the SW, the
> common header + first several bytes in the payload should be enough. BUT YES, it is
> not good or safe to use it in the higher layer of the application.
> 2. A higher layer of the application should have its own definition of the whole payload
> of a specific sub-type, including the parsing of the user data part after the "sub-header".
> It is better for them just skip the first 4 bytes of the eCPRI message or a known offset.
> We do not need to cover the upper layers.
Let me explain my vision of how an application would use the
structures (these are completely dummy examples, as I don't know the ecpri
protocol at all).
Rx:
int ecpri_input(struct rte_mbuf *m)
{
struct rte_ecpri_common_hdr hdr_copy, *hdr;
struct rte_ecpri_msg_event_ind event_copy, *event;
struct app_specific app_copy, *app;
hdr = rte_pktmbuf_read(m, 0, sizeof(*hdr), &hdr_copy);
if (unlikely(hdr == NULL))
return -1;
switch (hdr->type) {
...
case RTE_ECPRI_EVT_IND_NTFY_IND:
event = rte_pktmbuf_read(m, sizeof(*hdr), sizeof(*event),
&event_copy);
if (unlikely(event == NULL))
return -1;
...
app = rte_pktmbuf_read(m, sizeof(*hdr) + sizeof(*event),
sizeof(*app), &app_copy);
...
Tx:
int ecpri_output(void)
{
struct rte_ecpri_common_hdr *hdr;
struct rte_ecpri_msg_event_ind *event;
struct app_specific *app;
m = rte_pktmbuf_alloc(mp);
if (unlikely(m == NULL))
return -1;
app = rte_pktmbuf_append(m, sizeof(*app));
if (app == NULL)
...
app->... = ...;
...
event = rte_pktmbuf_prepend(m, sizeof(*event));
if (event == NULL)
...
event->... = ...;
...
hdr = rte_pktmbuf_prepend(m, sizeof(*hdr));
if (hdr == NULL)
...
hdr->... = ...;
return packet_send(m);
}
In these 2 examples, we never need the unioned structure (struct
rte_ecpri_msg_hdr).
Using it does not look possible to me, because its size corresponds
to the largest message, not to the one we are parsing/building.
> I think some comments could be added here, is it OK?
> 3. Regarding this structure, I add it because I do not want to introduce a lot of new items
> in the rte_flow: new items with structures, new enum types. I prefer one single structure
> will cover most of the cases (subtypes). What do you think?
> 4. About the *dummy* u32, I calculated all the "subheaders" and choose the maximal value
> of the length. Two purposes (same as the u32 in the common header):
> a. easy to swap the endianness, but not quite useful. Because some parts are u16 and u8,
> and should not be swapped in a u32. (some physical channel ID and address LSB have 32bits width)
> But if some HW parsed the header u32 by u32, then it would be helpful.
> b. easy for checking in flow API, if the user wants to insert a flow. Some checking should
> be done to confirm if it is wildcard flow (all eCPRI messages or eCPRI message in some specific type),
> or some precise flow (to match the pc id or rtc id, for example). With these fields, 3 DW
> of the masks only need to be check before continuing. Or else, the code needs to check the type
> and a lot of switch-case conditions and go through all different members of each header.
Thanks for the clarification.
I'll tend to say that if the rte_ecpri_msg_hdr structure is only
useful for rte_flow, it should be defined inside rte_flow.
However, I have some fears about the dummy[3]. You said it could be
enlarged later: I think it is dangerous to change the size of a
structure that may be used to parse data (and this would be an ABI
change). Also, it seems dummy[3] cannot be used easily to swap
endianness, so is it really useful?
Thanks,
Olivier
> > > +
> > > +#ifdef __cplusplus
> > > +}
> > > +#endif
> > > +
> > > +#endif /* _RTE_ECPRI_H_ */
> > > diff --git a/lib/librte_net/rte_ether.h b/lib/librte_net/rte_ether.h
> > > index 0ae4e75..184a3f9 100644
> > > --- a/lib/librte_net/rte_ether.h
> > > +++ b/lib/librte_net/rte_ether.h
> > > @@ -304,6 +304,7 @@ struct rte_vlan_hdr { #define
> > RTE_ETHER_TYPE_LLDP
> > > 0x88CC /**< LLDP Protocol. */ #define RTE_ETHER_TYPE_MPLS
> > 0x8847 /**<
> > > MPLS ethertype. */ #define RTE_ETHER_TYPE_MPLSM 0x8848 /**<
> > MPLS
> > > multicast ethertype. */
> > > +#define RTE_ETHER_TYPE_ECPRI 0xAEFE /**< eCPRI ethertype (.1Q
> > > +supported). */
> > >
> > > /**
> > > * Extract VLAN tag information into mbuf
> > > --
> > > 1.8.3.1
> > >
>
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-10 15:46 0% ` Slava Ovsiienko
@ 2020-07-10 22:07 0% ` Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-10 22:07 UTC (permalink / raw)
To: Slava Ovsiienko, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, arybchenko, Thomas Monjalon
On 7/10/2020 4:46 PM, Slava Ovsiienko wrote:
>
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
>> Sent: Friday, July 10, 2020 15:40
>> To: dev@dpdk.org
>> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
>> <rasland@mellanox.com>; olivier.matz@6wind.com;
>> arybchenko@solarflare.com; Thomas Monjalon <thomas@monjalon.net>;
>> ferruh.yigit@intel.com
>> Subject: [dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx
>> scheduling
>>
>> There is the requirement on some networks for precise traffic timing
>> management. The ability to send (and, generally speaking, receive) the
>> packets at the very precisely specified moment of time provides the
>> opportunity to support the connections with Time Division Multiplexing using
>> the contemporary general purpose NIC without involving an auxiliary
>> hardware. For example, supporting the O-RAN Fronthaul interface is one
>> of the promising use cases for precise time
>> management of the egress packets.
>>
>> The main objective of this patchset is to specify the way how applications
>> can provide the moment of time at what the packet transmission must be
>> started and to describe in preliminary the supporting this feature from mlx5
>> PMD side [1].
>>
>> The new dynamic timestamp field is proposed, it provides some timing
>> information, the units and time references (initial phase) are not explicitly
>> defined but are maintained always the same for a given port.
>> Some devices allow to query rte_eth_read_clock() that will return the current
>> device timestamp. The dynamic timestamp flag tells whether the field
>> contains actual timestamp value. For the packets being sent this value can be
>> used by PMD to schedule packet sending.
>>
>> The device clock is an opaque entity, the units and frequency are vendor specific
>> and might depend on hardware capabilities and configuration. It might (or might
>> not) be synchronized with real time via PTP, and might (or might not) be synchronous
>> with the CPU clock (for example if NIC and CPU share the same clock source
>> there might be no drift between the NIC and CPU clocks), etc.
>>
>> After PKT_RX_TIMESTAMP flag and fixed timestamp field supposed
>> deprecation and obsoleting, these dynamic flag and field might be used to
>> manage the timestamps on receiving datapath as well. Having the dedicated
>> flags for Rx/Tx timestamps allows applications not to perform explicit flags
>> reset on forwarding and not to promote received timestamps to the
>> transmitting datapath by default.
>> The static PKT_RX_TIMESTAMP is considered as candidate to become the
>> dynamic flag and this move should be discussed.
>>
>> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent it
>> tries to synchronize the time of packet appearing on the wire with the
>> specified packet timestamp. If the specified one is in the past it should be
>> ignored, if one is in the distant future it should be capped with some
>> reasonable value (in range of seconds). These specific cases ("too late" and
>> "distant future") can be optionally reported via device xstats to assist
>> applications to detect the time-related problems.
>>
>> No packet reordering according to timestamps is supposed, neither
>> within a packet burst nor between packets; it is entirely the application's
>> responsibility to generate packets and their timestamps in the desired order. The
>> timestamps can be put only in the first packet in the burst providing the
>> entire burst scheduling.
>>
>> PMD reports the ability to synchronize packet sending on timestamp with
>> new offload flag:
>>
>> This is palliative and might be replaced with new eth_dev API about
>> reporting/managing the supported dynamic flags and its related features.
>> This API would break ABI compatibility and can't be introduced at the
>> moment, so is postponed to 20.11.
>>
>> For testing purposes it is proposed to update testpmd "txonly"
>> forwarding mode routine. With this update testpmd application generates
>> the packets and sets the dynamic timestamps according to specified time
>> pattern if it sees the "rte_dynfield_timestamp" is registered.
>>
>> The new testpmd command is proposed to configure sending pattern:
>>
>> set tx_times <burst_gap>,<intra_gap>
>>
>> <intra_gap> - the delay between the packets within the burst
>> specified in the device clock units. The number
>> of packets in the burst is defined by txburst parameter
>>
>> <burst_gap> - the delay between the bursts in the device clock units
>>
>> As the result the bursts of packet will be transmitted with specific delays
>> between the packets within the burst and specific delay between the bursts.
>> The rte_eth_read_clock is supposed to be engaged to get the current device
>> clock value and provide the reference for the timestamps.
>>
>> [1]
>> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatche
>> s.dpdk.org%2Fpatch%2F73714%2F&data=02%7C01%7Cviacheslavo%40
>> mellanox.com%7C810609c61c3b466e8f5a08d824ce57f8%7Ca652971c7d2e4
>> d9ba6a4d149256f461b%7C0%7C0%7C637299815958194092&sdata=H
>> D7efBGOLuYxHd5KLJYJj7RSbiLRVBNm5jdq%2FJv%2FXfk%3D&reserved=
>> 0
>>
>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>
> Promote Acked-by from previous patch version to maintain patchwork
> status accordingly.
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
For series,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Applied to dpdk-next-net/master, thanks.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH] doc: mark internal symbols in ethdev
2020-07-10 14:20 0% ` Thomas Monjalon
@ 2020-07-10 16:17 0% ` Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-10 16:17 UTC (permalink / raw)
To: Thomas Monjalon
Cc: Neil Horman, John McNamara, Marko Kovacevic, dev, David Marchand,
Andrew Rybchenko, Kinsella, Ray
On 7/10/2020 3:20 PM, Thomas Monjalon wrote:
> 26/06/2020 10:49, Kinsella, Ray:
>> On 23/06/2020 14:49, Ferruh Yigit wrote:
>>> The APIs are marked in the doxygen comment but better to mark the
>>> symbols too. This is planned for v20.11 release.
>>>
>>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
>>> ---
>>> +* ethdev: Some internal APIs for driver usage are exported in the .map file.
>>> + Now DPDK has ``__rte_internal`` marker so we can mark internal APIs and move
>>> + them to the INTERNAL block in .map. Although these APIs are internal it will
>>> + break the ABI checks, that is why change is planned for 20.11.
>>> + The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
>>> +
>>
>> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>>
>> A bunch of other folks have already annotated "internal" APIs, and added entries to
>> libabigail.abignore to suppress warnings. If you are 100% certain these are never used
>> by end applications, you could do likewise.
>>
>> That said, a deprecation notice and completing in 20.11 is definitely the better approach.
>> See https://git.dpdk.org/dpdk/tree/devtools/libabigail.abignore#n53
>
> I agree we can wait 20.11.
>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
>
Applied to dpdk-next-net/master, thanks.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-10 12:39 2% ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
@ 2020-07-10 15:46 0% ` Slava Ovsiienko
2020-07-10 22:07 0% ` Ferruh Yigit
0 siblings, 1 reply; 200+ results
From: Slava Ovsiienko @ 2020-07-10 15:46 UTC (permalink / raw)
To: Slava Ovsiienko, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, arybchenko,
Thomas Monjalon, ferruh.yigit
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Friday, July 10, 2020 15:40
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com;
> arybchenko@solarflare.com; Thomas Monjalon <thomas@monjalon.net>;
> ferruh.yigit@intel.com
> Subject: [dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx
> scheduling
>
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive) the
> packets at the very precisely specified moment of time provides the
> opportunity to support the connections with Time Division Multiplexing using
> the contemporary general purpose NIC without involving an auxiliary
> hardware. For example, supporting the O-RAN Fronthaul interface is one
> of the promising use cases for precise time
> management of the egress packets.
>
> The main objective of this patchset is to specify the way how applications
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from mlx5
> PMD side [1].
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not explicitly
> defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return the current
> device timestamp. The dynamic timestamp flag tells whether the field
> contains actual timestamp value. For the packets being sent this value can be
> used by PMD to schedule packet sending.
>
> The device clock is an opaque entity, the units and frequency are vendor specific
> and might depend on hardware capabilities and configuration. It might (or might
> not) be synchronized with real time via PTP, and might (or might not) be synchronous
> with the CPU clock (for example if NIC and CPU share the same clock source
> there might be no drift between the NIC and CPU clocks), etc.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field supposed
> deprecation and obsoleting, these dynamic flag and field might be used to
> manage the timestamps on receiving datapath as well. Having the dedicated
> flags for Rx/Tx timestamps allows applications not to perform explicit flags
> reset on forwarding and not to promote received timestamps to the
> transmitting datapath by default.
> The static PKT_RX_TIMESTAMP is considered as candidate to become the
> dynamic flag and this move should be discussed.
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent it
> tries to synchronize the time of packet appearing on the wire with the
> specified packet timestamp. If the specified one is in the past it should be
> ignored, if one is in the distant future it should be capped with some
> reasonable value (in range of seconds). These specific cases ("too late" and
> "distant future") can be optionally reported via device xstats to assist
> applications to detect the time-related problems.
>
> No packet reordering according to timestamps is supposed, neither
> within a packet burst nor between packets; it is entirely the application's
> responsibility to generate packets and their timestamps in the desired order. The
> timestamps can be put only in the first packet in the burst providing the
> entire burst scheduling.
>
> PMD reports the ability to synchronize packet sending on timestamp with
> new offload flag:
>
> This is palliative and might be replaced with new eth_dev API about
> reporting/managing the supported dynamic flags and its related features.
> This API would break ABI compatibility and can't be introduced at the
> moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific delays
> between the packets within the burst and specific delay between the bursts.
> The rte_eth_read_clock is supposed to be engaged to get the current device
> clock value and provide the reference for the timestamps.
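A PMD-independent sketch of how such a timing pattern can be computed — fill_tx_times is a hypothetical helper, not a testpmd function, and "now" stands for the value rte_eth_read_clock() would report:

```c
#include <stddef.h>
#include <stdint.h>

/* Fill "ts" with scheduled transmission times (device clock units)
 * for n_bursts bursts of burst_size packets each: packets within a
 * burst are intra_gap apart, consecutive bursts start burst_gap
 * apart. "ts" must hold n_bursts * burst_size entries. (Sketch only;
 * actual testpmd behavior may differ.) */
void
fill_tx_times(uint64_t *ts, size_t n_bursts, size_t burst_size,
	      uint64_t now, uint64_t burst_gap, uint64_t intra_gap)
{
	uint64_t burst_start = now + burst_gap;
	size_t b, i;

	for (b = 0; b < n_bursts; b++) {
		for (i = 0; i < burst_size; i++)
			ts[b * burst_size + i] = burst_start + i * intra_gap;
		burst_start += burst_gap;
	}
}
```

With txburst = 4, burst_gap = 100 and intra_gap = 10 starting at clock 1000, the first burst is scheduled at 1100, 1110, 1120, 1130 and the second at 1200 onward.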
>
> [1]
> https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatche
> s.dpdk.org%2Fpatch%2F73714%2F&data=02%7C01%7Cviacheslavo%40
> mellanox.com%7C810609c61c3b466e8f5a08d824ce57f8%7Ca652971c7d2e4
> d9ba6a4d149256f461b%7C0%7C0%7C637299815958194092&sdata=H
> D7efBGOLuYxHd5KLJYJj7RSbiLRVBNm5jdq%2FJv%2FXfk%3D&reserved=
> 0
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Promote Acked-by from previous patch version to maintain patchwork
status accordingly.
Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
> ---
> v1->v4:
> - dedicated dynamic Tx timestamp flag instead of shared with Rx
> v4->v5:
> - elaborated commit message
> - more words about device clocks added,
> - note about dedicated Rx/Tx timestamp flags added
> v5->v6:
> - release notes are updated
> v6->v7:
> - commit message is updated
> - testpmd checks the supported offloads before registering
> dynamic timestamp flag/field
> ---
> doc/guides/rel_notes/release_20_08.rst | 7 +++++++
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 31
> +++++++++++++++++++++++++++++++
> 4 files changed, 43 insertions(+)
>
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v3] lib/librte_timer:fix corruption with reset
@ 2020-07-10 15:19 3% ` Stephen Hemminger
2020-07-28 19:04 3% ` Carrillo, Erik G
1 sibling, 0 replies; 200+ results
From: Stephen Hemminger @ 2020-07-10 15:19 UTC (permalink / raw)
To: Sarosh Arif; +Cc: rsanford, erik.g.carrillo, dev, stable, h.mikita89
On Fri, 10 Jul 2020 11:59:54 +0500
Sarosh Arif <sarosh.arif@emumba.com> wrote:
> If the user tries to reset/stop some other timer in its callback
> function, which is also about to expire, using
> rte_timer_reset_sync/rte_timer_stop_sync, the application goes into
> an infinite loop. This happens because
> rte_timer_reset_sync/rte_timer_stop_sync loop until the timer
> resets/stops, and there is a check inside timer_set_config_state which
> prevents a running timer from being reset/stopped by anything other
> than its own timer_cb. Therefore timer_set_config_state returns -1,
> due to which rte_timer_reset returns -1 and rte_timer_reset_sync goes
> into an infinite loop.
>
> The solution to this problem is to return -1 from
> rte_timer_reset_sync/rte_timer_stop_sync in case the user tries to
> reset/stop some other timer in its callback function.
>
> Bugzilla ID: 491
> Fixes: 20d159f20543 ("timer: fix corruption with reset")
> Cc: h.mikita89@gmail.com
> Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com>
> ---
> v2: remove line continuations
> v3: separate code and declarations
If you want to change the return value, you need to go through the steps
in the API/ABI policy. Maybe even symbol versioning.
Sorry, I know it is painful but we committed to the rules.
And changing the return value can never go to stable.
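The distinction at the heart of the bug — retrying a transient busy state versus propagating a permanent refusal — can be sketched with a small mock, independent of the librte_timer API (reset_sync_sketch and the try functions below are illustrative only):

```c
/* Mock return codes: 0 = done, 1 = transient (timer busy, retry),
 * -1 = permanent (caller is a different timer's callback). */
int
reset_sync_sketch(int (*try_reset)(void *), void *arg)
{
	int ret;

	/* Retry only transient failures; a permanent refusal must be
	 * propagated, or the caller would spin forever. */
	do {
		ret = try_reset(arg);
	} while (ret > 0);
	return ret;
}

/* Timer busy for a few attempts, then configurable. */
int counter_try(void *arg)
{
	int *c = arg;
	return (*c)-- > 0 ? 1 : 0;
}

/* Reset attempted from another timer's callback: always refused. */
int permanent_fail_try(void *arg)
{
	(void)arg;
	return -1;
}
```

The original looping wrappers treated every non-zero result as transient, which is exactly what turned the permanent -1 case into an infinite loop.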
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v3 00/10] rename blacklist/whitelist to block/allow
@ 2020-07-10 15:06 3% ` David Marchand
2020-07-14 4:43 0% ` Stephen Hemminger
1 sibling, 1 reply; 200+ results
From: David Marchand @ 2020-07-10 15:06 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev, techboard, Luca Boccassi, Mcnamara, John
On Sat, Jun 13, 2020 at 2:01 AM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> The terms blacklist and whitelist are often seen as reminders
> of the divisions in society. Instead, use more exact terms for
> handling of which devices are used in DPDK.
>
> This is a proposed change for DPDK 20.08 to replace the names
> blacklist and whitelist in API and command lines.
>
> The first three patches fix some other unnecessary use of
> blacklist/whitelist and have no user visible impact.
>
> The rest change the PCI blacklist to be blocklist and
> whitelist to be allowlist.
Thanks for working on this.
I agree, the first patches can go in right now.
But I have some concerns about the rest.
New options in EAL are not consistent with "allow"/"block" list:
+ "b:" /* pci-skip-probe */
+ "w:" /* pci-only-probe */
+#define OPT_PCI_SKIP_PROBE "pci-skip-probe"
+ OPT_PCI_SKIP_PROBE_NUM = 'b',
+#define OPT_PCI_ONLY_PROBE "pci-only-probe"
+ OPT_PCI_ONLY_PROBE_NUM = 'w',
The CI flagged the series as failing, because the unit test for EAL
flags is unaligned:
+#define pci_allowlist "--pci-allowlist"
https://travis-ci.com/github/ovsrobot/dpdk/jobs/348668299#L5657
The ABI check complains about the enum update:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/348668301#L2400
Either we deal with this, or we need a libabigail exception rule.
About deprecating existing API/EAL flags in this release, this should
go through the standard deprecation process.
I would go with introducing new options + full compatibility + a
deprecation notice in the 20.08 release.
The actual deprecation/API flagging will go in 20.11.
Removal will come later.
--
David Marchand
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH] doc: mark internal symbols in ethdev
@ 2020-07-10 14:20 0% ` Thomas Monjalon
2020-07-10 16:17 0% ` Ferruh Yigit
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-10 14:20 UTC (permalink / raw)
To: Ferruh Yigit
Cc: Neil Horman, John McNamara, Marko Kovacevic, dev, David Marchand,
Andrew Rybchenko, Kinsella, Ray
26/06/2020 10:49, Kinsella, Ray:
> On 23/06/2020 14:49, Ferruh Yigit wrote:
> > The APIs are marked in the doxygen comment but better to mark the
> > symbols too. This is planned for v20.11 release.
> >
> > Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> > ---
> > +* ethdev: Some internal APIs for driver usage are exported in the .map file.
> > + Now DPDK has ``__rte_internal`` marker so we can mark internal APIs and move
> > + them to the INTERNAL block in .map. Although these APIs are internal it will
> > + break the ABI checks, that is why change is planned for 20.11.
> > + The list of internal APIs are mainly ones listed in ``rte_ethdev_driver.h``.
> > +
>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
>
> A bunch of other folks have already annotated "internal" APIs, and added entries to
> libabigail.abignore to suppress warnings. If you are 100% certain these are never used
> by end applications, you could do likewise.
>
> > That said, a deprecation notice and completing in 20.11 is definitely the better approach.
> See https://git.dpdk.org/dpdk/tree/devtools/libabigail.abignore#n53
I agree we can wait 20.11.
Acked-by: Thomas Monjalon <thomas@monjalon.net>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v7 1/2] mbuf: introduce accurate packet Tx scheduling
` (5 preceding siblings ...)
2020-07-09 12:36 2% ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
@ 2020-07-10 12:39 2% ` Viacheslav Ovsiienko
2020-07-10 15:46 0% ` Slava Ovsiienko
6 siblings, 1 reply; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-10 12:39 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, arybchenko, thomas, ferruh.yigit
There is a requirement on some networks for precise traffic timing
management. The ability to send (and, generally speaking, receive)
packets at a very precisely specified moment of time provides
the opportunity to support connections with Time Division
Multiplexing using a contemporary general purpose NIC without involving
auxiliary hardware. For example, support of the O-RAN Fronthaul
interface is one of the promising features for potential usage of
precise time management for egress packets.
The main objective of this patchset is to specify the way applications
can provide the moment of time at which the packet transmission must be
started, and to give a preliminary description of the support of this
feature on the mlx5 PMD side [1].
The new dynamic timestamp field is proposed; it provides some timing
information. The units and time references (initial phase) are not
explicitly defined but are always maintained the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns
the current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For the packets being sent,
this value can be used by the PMD to schedule packet sending.
The device clock is an opaque entity; the units and frequency are
vendor specific and might depend on hardware capabilities and
configurations. It might (or might not) be synchronized with real time
via PTP, and might (or might not) be synchronous with the CPU clock
(for example, if the NIC and CPU share the same clock source there
might be no drift between the NIC and CPU clocks), etc.
After the supposed deprecation and obsoleting of the PKT_RX_TIMESTAMP
flag and fixed timestamp field, this dynamic flag and field might be
used to manage the timestamps on the receiving datapath as well. Having
dedicated flags for Rx/Tx timestamps allows applications not
to perform explicit flag resets on forwarding and not to promote
received timestamps to the transmitting datapath by default.
The static PKT_RX_TIMESTAMP is considered a candidate to become
the dynamic flag, and this move should be discussed.
When the PMD sees "rte_dynfield_timestamp" set on the packet being sent,
it tries to synchronize the time of the packet appearing on the wire with
the specified packet timestamp. If the specified timestamp is in the past
it should be ignored; if it is in the distant future it should be capped
with some reasonable value (in the range of seconds). These specific cases
("too late" and "distant future") can be optionally reported via
device xstats to assist applications in detecting time-related
problems.
No packet reordering according to the timestamps is supposed,
neither within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first packet
of a burst, providing scheduling for the entire burst.
The PMD reports the ability to synchronize packet sending on a timestamp
with a new offload flag.
This is palliative and might be replaced with a new eth_dev API
for reporting/managing the supported dynamic flags and their related
features. This API would break ABI compatibility and can't be introduced
at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
The new testpmd command is proposed to configure sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, the bursts of packets will be transmitted with specific
delays between the packets within a burst and a specific delay between
the bursts. rte_eth_read_clock() is supposed to be engaged to get the
current device clock value and provide the reference for the timestamps.
[1] http://patches.dpdk.org/patch/73714/
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v1->v4:
- dedicated dynamic Tx timestamp flag instead of shared with Rx
v4->v5:
- elaborated commit message
- more words about device clocks added,
- note about dedicated Rx/Tx timestamp flags added
v5->v6:
- release notes are updated
v6->v7:
- commit message is updated
- testpmd checks the supported offloads before registering
dynamic timestamp flag/field
---
doc/guides/rel_notes/release_20_08.rst | 7 +++++++
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
4 files changed, 43 insertions(+)
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 988474c..bdea389 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -81,6 +81,13 @@ New Features
Added the RegEx library which provides an API for offload of regular
expressions search operations to hardware or software accelerator devices.
+
+* **Introduced send packet scheduling on the timestamps.**
+
+ Added a new mbuf dynamic field and flag that provide a timestamp with
+ which packet transmitting can be synchronized. A device Tx offload flag
+ is added to indicate that the PMD supports send scheduling.
+
* **Updated PCAP driver.**
Updated PCAP driver with new features and improvements, including:
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7022bd7..c48ca2a 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -160,6 +160,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 631b146..97313a0 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..8407230 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information. The
+ * units and time references (initial phase) are not explicitly defined
+ * but are always maintained the same for a given port. Some devices
+ * allow querying rte_eth_read_clock(), which returns the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field
+ * contains an actual timestamp value for the packets being sent; this
+ * value can be used by the PMD to schedule packet sending.
+ *
+ * After deprecation and obsoleting of the PKT_RX_TIMESTAMP flag and
+ * fixed timestamp field, a dedicated Rx timestamp flag is supposed to
+ * be introduced and the shared dynamic timestamp field will be used
+ * to handle the timestamps on the receiving datapath as well.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+
+/**
+ * When the PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on
+ * a packet being sent, it tries to synchronize the time of the packet
+ * appearing on the wire with the specified packet timestamp. If the
+ * timestamp is in the past it should be ignored; if it is in the distant
+ * future it should be capped with a reasonable value (range of seconds).
+ *
+ * No packet reordering according to the timestamps is supposed, neither
+ * within a burst nor between bursts; it is entirely the application's
+ * responsibility to generate packets and timestamps in the desired order.
+ * The timestamp might be put only in the first packet of a burst,
+ * providing scheduling for the entire burst.
+ */
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
* Re: [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-09 23:47 0% ` Ferruh Yigit
@ 2020-07-10 12:32 0% ` Slava Ovsiienko
0 siblings, 0 replies; 200+ results
From: Slava Ovsiienko @ 2020-07-10 12:32 UTC (permalink / raw)
To: Ferruh Yigit, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, bernard.iremonger,
thomas, Andrew Rybchenko
Hi, Ferruh
Thanks a lot for the review.
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Friday, July 10, 2020 2:47
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com;
> bernard.iremonger@intel.com; thomas@monjalon.com; Andrew Rybchenko
> <arybchenko@solarflare.com>
> Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx
> scheduling
>
> On 7/9/2020 1:36 PM, Viacheslav Ovsiienko wrote:
> > There is the requirement on some networks for precise traffic timing
> > management. The ability to send (and, generally speaking, receive) the
> > packets at the very precisely specified moment of time provides the
> > opportunity to support the connections with Time Division Multiplexing
> > using the contemporary general purpose NIC without involving an
> > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > interface is one of the promising features for potentially usage of
> > the precise time management for the egress packets.
>
> Is this a HW support, or is the scheduling planned to be done in the driver?
Yes, mlx5 PMD feature v1 is sent: http://patches.dpdk.org/patch/73714/
>
> >
> > The main objective of this RFC is to specify the way how applications
>
> It is no more RFC.
Oops, miscopy. Thanks.
>
> > can provide the moment of time at what the packet transmission must be
> > started and to describe in preliminary the supporting this feature
> > from
> > mlx5 PMD side.
>
> I was about the ask this, will there be a PMD counterpart implementation of
> the feature? It would be better to have it as part of this set.
> What is the plan for the PMD implementation?
Please, see above.
>
> >
> > The new dynamic timestamp field is proposed, it provides some timing
> > information, the units and time references (initial phase) are not
> > explicitly defined but are maintained always the same for a given port.
> > Some devices allow to query rte_eth_read_clock() that will return the
> > current device timestamp. The dynamic timestamp flag tells whether the
> > field contains actual timestamp value. For the packets being sent this
> > value can be used by PMD to schedule packet sending.
> >
> > The device clock is opaque entity, the units and frequency are vendor
> > specific and might depend on hardware capabilities and configurations.
> > If might (or not) be synchronized with real time via PTP, might (or
> > not) be synchronous with CPU clock (for example if NIC and CPU share
> > the same clock source there might be no any drift between the NIC and
> > CPU clocks), etc.
> >
> > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> > obsoleting, these dynamic flag and field will be used to manage the
> > timestamps on receiving datapath as well. Having the dedicated flags
> > for Rx/Tx timestamps allows applications not to perform explicit flags
> > reset on forwarding and not to promote received timestamps to the
> > transmitting datapath by default. The static PKT_RX_TIMESTAMP is
> > considered as candidate to become the dynamic flag.
>
> Is there a deprecation notice for 'PKT_RX_TIMESTAMP'? Is this decided?
No, we are going to discuss that; the Rx timestamp is a good candidate to be
moved out of the first mbuf cacheline to a dynamic field.
There are good chances we will deprecate the fixed Rx timestamp flag/field;
that's why we'd prefer not to rely on them anymore.
>
> >
> > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > sent it tries to synchronize the time of packet appearing on the wire
> > with the specified packet timestamp. If the specified one is in the
> > past it should be ignored, if one is in the distant future it should
> > be capped with some reasonable value (in range of seconds). These
> > specific cases ("too late" and "distant future") can be optionally
> > reported via device xstats to assist applications to detect the
> > time-related problems.
> >
> > There is no any packet reordering according timestamps is supposed,
> > neither within packet burst, nor between packets, it is an entirely
> > application responsibility to generate packets and its timestamps in
> > desired order. The timestamps can be put only in the first packet in
> > the burst providing the entire burst scheduling.
> >
> > PMD reports the ability to synchronize packet sending on timestamp
> > with new offload flag:
> >
> > This is palliative and is going to be replaced with new eth_dev API
> > about reporting/managing the supported dynamic flags and its related
> > features. This API would break ABI compatibility and can't be
> > introduced at the moment, so is postponed to 20.11.
>
> Good to hear that there will be a generic API to get supported dynamic flags.
> I was concerned about adding 'DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP'
> flag, since not sure if there will be any other PMD that will want to use it.
> The trouble is it is hard to remove a public macro after it is introduced, in this
> release I think only single PMD (mlx) will support this feature, and in next
> release the plan is to remove the macro. In this case what do you think to
> not introduce the flag at all?
Currently there is no other way to report/control the port caps/config but these xx_OFFLOAD_xx flags.
If the new side-channel API to report/control very specific PMD caps is introduced,
it should be consistent with the OFFLOAD flags, i.e., if a cap is disabled via the new API
it will be reflected in the OFFLOAD flags as well. The new API is questionable; the OFFLOAD
flags are not a scarce resource, the offload field can be extended, and we are still far
from exhausting the existing one. So, I replaced "will" with "might" in the commit
message. Not sure we should remove this flag; we can keep this consistent.
>
> >
> > For testing purposes it is proposed to update testpmd "txonly"
> > forwarding mode routine. With this update testpmd application
> > generates the packets and sets the dynamic timestamps according to
> > specified time pattern if it sees the "rte_dynfield_timestamp" is registered.
> >
> > The new testpmd command is proposed to configure sending pattern:
> >
> > set tx_times <burst_gap>,<intra_gap>
> >
> > <intra_gap> - the delay between the packets within the burst
> > specified in the device clock units. The number
> > of packets in the burst is defined by txburst parameter
> >
> > <burst_gap> - the delay between the bursts in the device clock units
> >
> > As the result the bursts of packet will be transmitted with specific
> > delays between the packets within the burst and specific delay between
> > the bursts. The rte_eth_get_clock is supposed to be engaged to get the
>
> 'rte_eth_read_clock()'?
Sure, my bad.
>
> > current device clock value and provide the reference for the timestamps.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > Acked-by: Olivier Matz <olivier.matz@6wind.com>
> >
> > ---
> > v1->v4:
> > - dedicated dynamic Tx timestamp flag instead of shared with Rx
> > v4->v5:
> > - elaborated commit message
> > - more words about device clocks added,
> > - note about dedicated Rx/Tx timestamp flags added
> > v5->v6:
> > - release notes are updated
> > ---
> > doc/guides/rel_notes/release_20_08.rst | 6 ++++++
> > lib/librte_ethdev/rte_ethdev.c | 1 +
> > lib/librte_ethdev/rte_ethdev.h | 4 ++++
> > lib/librte_mbuf/rte_mbuf_dyn.h | 31
> +++++++++++++++++++++++++++++++
> > 4 files changed, 42 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/release_20_08.rst
> > b/doc/guides/rel_notes/release_20_08.rst
> > index 988474c..5527bab 100644
> > --- a/doc/guides/rel_notes/release_20_08.rst
> > +++ b/doc/guides/rel_notes/release_20_08.rst
> > @@ -200,6 +200,12 @@ New Features
> > See the :doc:`../sample_app_ug/l2_forward_real_virtual` for more
> > details of this parameter usage.
> >
> > +* **Introduced send packet scheduling on the timestamps.**
> > +
> > + Added the new mbuf dynamic field and flag to provide timestamp on
> > + what packet transmitting can be synchronized. The device Tx offload
> > + flag is added to indicate the PMD supports send scheduling.
> > +
>
> This is a core library change, can go up in the section, please check the
> section comment for the ordering details.
>
Done.
> >
> > Removed Items
> > -------------
> > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > b/lib/librte_ethdev/rte_ethdev.c index 7022bd7..c48ca2a 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -160,6 +160,7 @@ struct rte_eth_xstats_name_off {
> > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > };
> >
> > #undef RTE_TX_OFFLOAD_BIT2STR
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index 631b146..97313a0 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > /** Device supports outer UDP checksum */ #define
> > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
> >
> > +/** Device supports send on timestamp */ #define
> > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
>
> Please cc the ethdev maintainers.
>
> As mentioned above my concern is if this is generic enough or are we adding
> a flag to a specific PMD? And since commit log says this is temporary
> solution for just this release, I repeat my question if we can remove the flag
> completely?
I will remove "temporary" and replace it with "might". And now I do not think this flag
will actually be removed. As this feature's development proved, it is in the right place
and is easy to use by the PMD and application in a standardized (for offloads) way.
With best regards, Slava
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
` (2 preceding siblings ...)
2020-07-10 7:37 4% ` Kinsella, Ray
@ 2020-07-10 10:58 4% ` Neil Horman
2020-07-15 12:15 25% ` [dpdk-dev] [PATCH v2] " David Marchand
4 siblings, 0 replies; 200+ results
From: Neil Horman @ 2020-07-10 10:58 UTC (permalink / raw)
To: David Marchand; +Cc: dev, thomas, dodji, Ray Kinsella
On Wed, Jul 08, 2020 at 12:22:12PM +0200, David Marchand wrote:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> devtools/check-abi.sh | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> index e17fedbd9f..521e2cce7c 100755
> --- a/devtools/check-abi.sh
> +++ b/devtools/check-abi.sh
> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
> error=1
> continue
> fi
> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> + abiret=$?
> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
> error=1
> - fi
> + echo
> + if [ $(($abiret & 3)) != 0 ]; then
> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
> + fi
> + if [ $(($abiret & 4)) != 0 ]; then
> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
> + fi
> + if [ $(($abiret & 8)) != 0 ]; then
> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
> + fi
> + echo
> + }
> done
>
> [ -z "$error" ] || [ -n "$warnonly" ]
> --
> 2.23.0
>
>
this looks pretty reasonable to me, sure.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
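[Editor's note: the bit-mask decoding of abidiff's exit status used in the patch can be exercised standalone. abiret=12 below is a hard-coded stand-in for a real `abidiff old.dump new.dump; abiret=$?` run, i.e. ABIDIFF_ABI_CHANGE (4) combined with ABIDIFF_ABI_INCOMPATIBLE_CHANGE (8).]

```shell
# Decode an abidiff exit status the same way check-abi.sh does.
abiret=12
msg=""
if [ $(($abiret & 3)) != 0 ]; then
	msg="$msg ERROR"
fi
if [ $(($abiret & 4)) != 0 ]; then
	msg="$msg ABI_CHANGE"
fi
if [ $(($abiret & 8)) != 0 ]; then
	msg="$msg INCOMPATIBLE_CHANGE"
fi
echo "abidiff reported:$msg"
# prints: abidiff reported: ABI_CHANGE INCOMPATIBLE_CHANGE
```

The low two bits (usage/internal errors) are tested together with `& 3`, which is why the script cannot distinguish ABIDIFF_ERROR from ABIDIFF_USAGE_ERROR and reports them as one case.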
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
2020-07-08 13:09 7% ` Kinsella, Ray
2020-07-09 15:52 4% ` Dodji Seketeli
@ 2020-07-10 7:37 4% ` Kinsella, Ray
2020-07-10 10:58 4% ` Neil Horman
2020-07-15 12:15 25% ` [dpdk-dev] [PATCH v2] " David Marchand
4 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-10 7:37 UTC (permalink / raw)
To: David Marchand, dev; +Cc: thomas, dodji, Neil Horman, Aaron Conole
On 08/07/2020 11:22, David Marchand wrote:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> devtools/check-abi.sh | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> index e17fedbd9f..521e2cce7c 100755
> --- a/devtools/check-abi.sh
> +++ b/devtools/check-abi.sh
> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
> error=1
> continue
> fi
> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> + abiret=$?
> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
> error=1
> - fi
> + echo
> + if [ $(($abiret & 3)) != 0 ]; then
> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
> + fi
> + if [ $(($abiret & 4)) != 0 ]; then
> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
> + fi
> + if [ $(($abiret & 8)) != 0 ]; then
> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
> + fi
> + echo
> + }
> done
>
> [ -z "$error" ] || [ -n "$warnonly" ]
>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v3] eal: use c11 atomic built-ins for interrupt status
2020-07-09 10:30 0% ` David Marchand
@ 2020-07-10 7:18 3% ` Dodji Seketeli
0 siblings, 0 replies; 200+ results
From: Dodji Seketeli @ 2020-07-10 7:18 UTC (permalink / raw)
To: David Marchand
Cc: Phil Yang, Ray Kinsella, Harman Kalra, dev, stefan.puiu,
Aaron Conole, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd, Neil Horman
David Marchand <david.marchand@redhat.com> writes:
[...]
>> --- a/devtools/libabigail.abignore
>> +++ b/devtools/libabigail.abignore
>> @@ -48,6 +48,10 @@
>> changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
>> [suppress_variable]
>> name = rte_crypto_aead_algorithm_strings
>> +; Ignore updates of epoll event
>> +[suppress_type]
>> + type_kind = struct
>> + name = rte_epoll_event
>
> In general, ignoring all changes on a structure is risky.
> But the risk is acceptable as long as we remember this for the rest of
> the 20.08 release (and we will start from scratch for 20.11).
Right, I thought about this too when I saw that change. If that struct
is inherently *not* part of the logically exposed ABI, the risk is
really minimal as well. In that case, maybe a comment saying so in the
.abignore file could be useful for future reference.
[...]
Cheers,
--
Dodji
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v10 1/3] lib/lpm: integrate RCU QSBR
2020-07-10 2:22 2% ` [dpdk-dev] [PATCH v10 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
@ 2020-07-10 2:29 0% ` Ruifeng Wang
0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-10 2:29 UTC (permalink / raw)
To: Ruifeng Wang, Bruce Richardson, Vladimir Medvedkin,
John McNamara, Marko Kovacevic, Ray Kinsella, Neil Horman
Cc: dev, konstantin.ananyev, Honnappa Nagarahalli, nd, nd
The ci/checkpatch warning is a false positive.
> -----Original Message-----
> From: Ruifeng Wang <ruifeng.wang@arm.com>
> Sent: Friday, July 10, 2020 10:22 AM
> To: Bruce Richardson <bruce.richardson@intel.com>; Vladimir Medvedkin
> <vladimir.medvedkin@intel.com>; John McNamara
> <john.mcnamara@intel.com>; Marko Kovacevic
> <marko.kovacevic@intel.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman
> <nhorman@tuxdriver.com>
> Cc: dev@dpdk.org; konstantin.ananyev@intel.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>
> Subject: [PATCH v10 1/3] lib/lpm: integrate RCU QSBR
>
> Currently, the tbl8 group is freed even though the readers might be using the
> tbl8 group entries. The freed tbl8 group can be reallocated quickly. This
> results in incorrect lookup results.
>
> RCU QSBR process is integrated for safe tbl8 group reclaim.
> Refer to RCU documentation to understand various aspects of integrating
> RCU library into other libraries.
>
> To avoid ABI breakage, a struct __rte_lpm is created for lpm library internal
> use. This struct wraps rte_lpm that has been exposed and also includes
> members that don't need to be exposed such as RCU related config.
>
> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Acked-by: Ray Kinsella <mdr@ashroe.eu>
> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
> ---
> doc/guides/prog_guide/lpm_lib.rst | 32 ++++++
> lib/librte_lpm/Makefile | 2 +-
> lib/librte_lpm/meson.build | 1 +
> lib/librte_lpm/rte_lpm.c | 165 +++++++++++++++++++++++++----
> lib/librte_lpm/rte_lpm.h | 53 +++++++++
> lib/librte_lpm/rte_lpm_version.map | 6 ++
> 6 files changed, 237 insertions(+), 22 deletions(-)
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v10 1/3] lib/lpm: integrate RCU QSBR
2020-07-10 2:22 4% ` [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-10 2:22 2% ` Ruifeng Wang
2020-07-10 2:29 0% ` Ruifeng Wang
0 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-10 2:22 UTC (permalink / raw)
To: Bruce Richardson, Vladimir Medvedkin, John McNamara,
Marko Kovacevic, Ray Kinsella, Neil Horman
Cc: dev, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
Currently, the tbl8 group is freed even though the readers might be
using the tbl8 group entries. The freed tbl8 group can be reallocated
quickly. This results in incorrect lookup results.
RCU QSBR process is integrated for safe tbl8 group reclaim.
Refer to RCU documentation to understand various aspects of
integrating RCU library into other libraries.
To avoid ABI breakage, a struct __rte_lpm is created for lpm library
internal use. This struct wraps the exposed rte_lpm and also includes
members that don't need to be exposed, such as the RCU related
config.
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
---
doc/guides/prog_guide/lpm_lib.rst | 32 ++++++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 165 +++++++++++++++++++++++++----
lib/librte_lpm/rte_lpm.h | 53 +++++++++
lib/librte_lpm/rte_lpm_version.map | 6 ++
6 files changed, 237 insertions(+), 22 deletions(-)
diff --git a/doc/guides/prog_guide/lpm_lib.rst b/doc/guides/prog_guide/lpm_lib.rst
index 1609a57d0..03945904b 100644
--- a/doc/guides/prog_guide/lpm_lib.rst
+++ b/doc/guides/prog_guide/lpm_lib.rst
@@ -145,6 +145,38 @@ depending on whether we need to move to the next table or not.
Prefix expansion is one of the keys of this algorithm,
since it improves the speed dramatically by adding redundancy.
+Deletion
+~~~~~~~~
+
+When deleting a rule, a replacement rule is searched for. A replacement rule is an existing rule
+that has the longest prefix match with the rule to be deleted, but a shorter prefix.
+
+If a replacement rule is found, target tbl24 and tbl8 entries are updated to have the same depth and next hop
+value as the replacement rule.
+
+If no replacement rule can be found, target tbl24 and tbl8 entries will be cleared.
+
+Prefix expansion is performed if the rule's depth is not exactly 24 bits or 32 bits.
+
+After deleting a rule, a group of tbl8s that belongs to the same tbl24 entry is freed in the following cases:
+
+* All tbl8s in the group are empty.
+
+* All tbl8s in the group have the same values and a depth no greater than 24.
+
+Freeing tbl8s has different behaviors:
+
+* If RCU is not used, tbl8s are cleared and reclaimed immediately.
+
+* If RCU is used, tbl8s are reclaimed when readers are in quiescent state.
+
+When the LPM is not using RCU, a tbl8 group can be freed immediately even though readers might be using
+the tbl8 group entries. This might result in incorrect lookup results.
+
+The RCU QSBR process is integrated for safe tbl8 group reclamation. The application has certain
+responsibilities while using this feature. Please refer to the resource reclamation framework of the
+:ref:`RCU library <RCU_Library>` for more details.
+
Lookup
~~~~~~
diff --git a/lib/librte_lpm/Makefile b/lib/librte_lpm/Makefile
index d682785b6..6f06c5c03 100644
--- a/lib/librte_lpm/Makefile
+++ b/lib/librte_lpm/Makefile
@@ -8,7 +8,7 @@ LIB = librte_lpm.a
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_hash
+LDLIBS += -lrte_eal -lrte_hash -lrte_rcu
EXPORT_MAP := rte_lpm_version.map
diff --git a/lib/librte_lpm/meson.build b/lib/librte_lpm/meson.build
index 021ac6d8d..6cfc083c5 100644
--- a/lib/librte_lpm/meson.build
+++ b/lib/librte_lpm/meson.build
@@ -7,3 +7,4 @@ headers = files('rte_lpm.h', 'rte_lpm6.h')
# without worrying about which architecture we actually need
headers += files('rte_lpm_altivec.h', 'rte_lpm_neon.h', 'rte_lpm_sse.h')
deps += ['hash']
+deps += ['rcu']
diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 38ab512a4..2d687c372 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#include <string.h>
@@ -39,6 +40,17 @@ enum valid_flag {
VALID
};
+/** @internal LPM structure. */
+struct __rte_lpm {
+ /* LPM metadata. */
+ struct rte_lpm lpm;
+
+ /* RCU config. */
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
+ struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
+};
+
/* Macro to enable/disable run-time checks. */
#if defined(RTE_LIBRTE_LPM_DEBUG)
#include <rte_debug.h>
@@ -122,6 +134,7 @@ rte_lpm_create(const char *name, int socket_id,
const struct rte_lpm_config *config)
{
char mem_name[RTE_LPM_NAMESIZE];
+ struct __rte_lpm *internal_lpm;
struct rte_lpm *lpm = NULL;
struct rte_tailq_entry *te;
uint32_t mem_size, rules_size, tbl8s_size;
@@ -140,12 +153,6 @@ rte_lpm_create(const char *name, int socket_id,
snprintf(mem_name, sizeof(mem_name), "LPM_%s", name);
- /* Determine the amount of memory to allocate. */
- mem_size = sizeof(*lpm);
- rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
- tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
- RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
-
rte_mcfg_tailq_write_lock();
/* guarantee there's no existing */
@@ -161,6 +168,12 @@ rte_lpm_create(const char *name, int socket_id,
goto exit;
}
+ /* Determine the amount of memory to allocate. */
+ mem_size = sizeof(*internal_lpm);
+ rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
+ tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
+
/* allocate tailq entry */
te = rte_zmalloc("LPM_TAILQ_ENTRY", sizeof(*te), 0);
if (te == NULL) {
@@ -170,21 +183,23 @@ rte_lpm_create(const char *name, int socket_id,
}
/* Allocate memory to store the LPM data structures. */
- lpm = rte_zmalloc_socket(mem_name, mem_size,
+ internal_lpm = rte_zmalloc_socket(mem_name, mem_size,
RTE_CACHE_LINE_SIZE, socket_id);
- if (lpm == NULL) {
+ if (internal_lpm == NULL) {
RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
rte_free(te);
rte_errno = ENOMEM;
goto exit;
}
+ lpm = &internal_lpm->lpm;
lpm->rules_tbl = rte_zmalloc_socket(NULL,
(size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
if (lpm->rules_tbl == NULL) {
RTE_LOG(ERR, LPM, "LPM rules_tbl memory allocation failed\n");
- rte_free(lpm);
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
@@ -197,7 +212,8 @@ rte_lpm_create(const char *name, int socket_id,
if (lpm->tbl8 == NULL) {
RTE_LOG(ERR, LPM, "LPM tbl8 memory allocation failed\n");
rte_free(lpm->rules_tbl);
- rte_free(lpm);
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
@@ -225,6 +241,7 @@ rte_lpm_create(const char *name, int socket_id,
void
rte_lpm_free(struct rte_lpm *lpm)
{
+ struct __rte_lpm *internal_lpm;
struct rte_lpm_list *lpm_list;
struct rte_tailq_entry *te;
@@ -246,12 +263,84 @@ rte_lpm_free(struct rte_lpm *lpm)
rte_mcfg_tailq_write_unlock();
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->dq)
+ rte_rcu_qsbr_dq_delete(internal_lpm->dq);
rte_free(lpm->tbl8);
rte_free(lpm->rules_tbl);
rte_free(lpm);
rte_free(te);
}
+static void
+__lpm_rcu_qsbr_free_resource(void *p, void *data, unsigned int n)
+{
+ struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
+ uint32_t tbl8_group_index = *(uint32_t *)data;
+ struct rte_lpm_tbl_entry *tbl8 = ((struct rte_lpm *)p)->tbl8;
+
+ RTE_SET_USED(n);
+ /* Set tbl8 group invalid */
+ __atomic_store(&tbl8[tbl8_group_index], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+}
+
+/* Associate QSBR variable with an LPM object.
+ */
+int
+rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq)
+{
+ struct __rte_lpm *internal_lpm;
+ char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
+ struct rte_rcu_qsbr_dq_parameters params = {0};
+
+ if (lpm == NULL || cfg == NULL) {
+ rte_errno = EINVAL;
+ return 1;
+ }
+
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->v != NULL) {
+ rte_errno = EEXIST;
+ return 1;
+ }
+
+ if (cfg->mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* No other things to do. */
+ } else if (cfg->mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Init QSBR defer queue. */
+ snprintf(rcu_dq_name, sizeof(rcu_dq_name),
+ "LPM_RCU_%s", lpm->name);
+ params.name = rcu_dq_name;
+ params.size = cfg->dq_size;
+ if (params.size == 0)
+ params.size = lpm->number_tbl8s;
+ params.trigger_reclaim_limit = cfg->reclaim_thd;
+ params.max_reclaim_size = cfg->reclaim_max;
+ if (params.max_reclaim_size == 0)
+ params.max_reclaim_size = RTE_LPM_RCU_DQ_RECLAIM_MAX;
+ params.esize = sizeof(uint32_t); /* tbl8 group index */
+ params.free_fn = __lpm_rcu_qsbr_free_resource;
+ params.p = lpm;
+ params.v = cfg->v;
+ internal_lpm->dq = rte_rcu_qsbr_dq_create(&params);
+ if (internal_lpm->dq == NULL) {
+ RTE_LOG(ERR, LPM, "LPM defer queue creation failed\n");
+ return 1;
+ }
+ if (dq)
+ *dq = internal_lpm->dq;
+ } else {
+ rte_errno = EINVAL;
+ return 1;
+ }
+ internal_lpm->rcu_mode = cfg->mode;
+ internal_lpm->v = cfg->v;
+
+ return 0;
+}
+
/*
* Adds a rule to the rule table.
*
@@ -394,14 +483,15 @@ rule_find(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth)
* Find, clean and allocate a tbl8.
*/
static int32_t
-tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
+_tbl8_alloc(struct rte_lpm *lpm)
{
uint32_t group_idx; /* tbl8 group index. */
struct rte_lpm_tbl_entry *tbl8_entry;
/* Scan through tbl8 to find a free (i.e. INVALID) tbl8 group. */
- for (group_idx = 0; group_idx < number_tbl8s; group_idx++) {
- tbl8_entry = &tbl8[group_idx * RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+ for (group_idx = 0; group_idx < lpm->number_tbl8s; group_idx++) {
+ tbl8_entry = &lpm->tbl8[group_idx *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
/* If a free tbl8 group is found clean it and set as VALID. */
if (!tbl8_entry->valid_group) {
struct rte_lpm_tbl_entry new_tbl8_entry = {
@@ -427,14 +517,47 @@ tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
return -ENOSPC;
}
+static int32_t
+tbl8_alloc(struct rte_lpm *lpm)
+{
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
+ int32_t group_idx; /* tbl8 group index. */
+
+ group_idx = _tbl8_alloc(lpm);
+ if (group_idx == -ENOSPC && internal_lpm->dq != NULL) {
+ /* If there are no tbl8 groups try to reclaim one. */
+ if (rte_rcu_qsbr_dq_reclaim(internal_lpm->dq, 1,
+ NULL, NULL, NULL) == 0)
+ group_idx = _tbl8_alloc(lpm);
+ }
+
+ return group_idx;
+}
+
static void
-tbl8_free(struct rte_lpm_tbl_entry *tbl8, uint32_t tbl8_group_start)
+tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
{
- /* Set tbl8 group invalid*/
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
- __atomic_store(&tbl8[tbl8_group_start], &zero_tbl8_entry,
- __ATOMIC_RELAXED);
+ if (internal_lpm->v == NULL) {
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* Wait for quiescent state change. */
+ rte_rcu_qsbr_synchronize(internal_lpm->v,
+ RTE_QSBR_THRID_INVALID);
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Push into QSBR defer queue. */
+ rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
+ (void *)&tbl8_group_start);
+ }
}
static __rte_noinline int32_t
@@ -523,7 +646,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
if (!lpm->tbl24[tbl24_index].valid) {
/* Search for a free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
/* Check tbl8 allocation was successful. */
if (tbl8_group_index < 0) {
@@ -569,7 +692,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
} /* If valid entry but not extended calculate the index into Table8. */
else if (lpm->tbl24[tbl24_index].valid_group == 0) {
/* Search for free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
if (tbl8_group_index < 0) {
return tbl8_group_index;
@@ -977,7 +1100,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
*/
lpm->tbl24[tbl24_index].valid = 0;
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
} else if (tbl8_recycle_index > -1) {
/* Update tbl24 entry. */
struct rte_lpm_tbl_entry new_tbl24_entry = {
@@ -993,7 +1116,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
__atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
__ATOMIC_RELAXED);
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
}
#undef group_idx
return 0;
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index b9d49ac87..a9568fcdd 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#ifndef _RTE_LPM_H_
@@ -20,6 +21,7 @@
#include <rte_memory.h>
#include <rte_common.h>
#include <rte_vect.h>
+#include <rte_rcu_qsbr.h>
#ifdef __cplusplus
extern "C" {
@@ -62,6 +64,17 @@ extern "C" {
/** Bitmask used to indicate successful lookup */
#define RTE_LPM_LOOKUP_SUCCESS 0x01000000
+/** @internal Default RCU defer queue entries to reclaim in one go. */
+#define RTE_LPM_RCU_DQ_RECLAIM_MAX 16
+
+/** RCU reclamation modes */
+enum rte_lpm_qsbr_mode {
+ /** Create defer queue for reclaim. */
+ RTE_LPM_QSBR_MODE_DQ = 0,
+ /** Use blocking mode reclaim. No defer queue created. */
+ RTE_LPM_QSBR_MODE_SYNC
+};
+
#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
/** @internal Tbl24 entry structure. */
__extension__
@@ -132,6 +145,22 @@ struct rte_lpm {
struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
};
+/** LPM RCU QSBR configuration structure. */
+struct rte_lpm_rcu_config {
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ /* Mode of RCU QSBR. RTE_LPM_QSBR_MODE_xxx
+ * '0' for default: create defer queue for reclaim.
+ */
+ enum rte_lpm_qsbr_mode mode;
+ uint32_t dq_size; /* RCU defer queue size.
+ * default: lpm->number_tbl8s.
+ */
+ uint32_t reclaim_thd; /* Threshold to trigger auto reclaim. */
+ uint32_t reclaim_max; /* Max entries to reclaim in one go.
+ * default: RTE_LPM_RCU_DQ_RECLAIM_MAX.
+ */
+};
+
/**
* Create an LPM object.
*
@@ -179,6 +208,30 @@ rte_lpm_find_existing(const char *name);
void
rte_lpm_free(struct rte_lpm *lpm);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Associate RCU QSBR variable with an LPM object.
+ *
+ * @param lpm
+ * the lpm object to add RCU QSBR
+ * @param cfg
+ * RCU QSBR configuration
+ * @param dq
+ * handler of created RCU QSBR defer queue
+ * @return
+ * On success - 0
+ * On error - 1 with error code set in rte_errno.
+ * Possible rte_errno codes are:
+ * - EINVAL - invalid pointer
+ * - EEXIST - already added QSBR
+ * - ENOMEM - memory allocation failure
+ */
+__rte_experimental
+int rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq);
+
/**
* Add a rule to the LPM table.
*
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 500f58b80..bfccd7eac 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -21,3 +21,9 @@ DPDK_20.0 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ rte_lpm_rcu_qsbr_add;
+};
--
2.17.1
^ permalink raw reply [relevance 2%]
* [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library
` (4 preceding siblings ...)
2020-07-09 15:42 4% ` [dpdk-dev] [PATCH v9 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-10 2:22 4% ` Ruifeng Wang
2020-07-10 2:22 2% ` [dpdk-dev] [PATCH v10 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
5 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-10 2:22 UTC (permalink / raw)
Cc: dev, mdr, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
This patchset integrates RCU QSBR support with LPM library.
Resource reclamation implementation was split from the original
series, and has already become part of the RCU library. The series was
reworked to base LPM integration on the RCU reclamation APIs.
A new API, rte_lpm_rcu_qsbr_add, is introduced for the application to
register an RCU variable that the LPM library will use. This provides
the user a handle to enable the RCU support integrated in the LPM library.
Functional tests and performance tests are added to cover the
integration with RCU.
---
v10:
Added missing Acked-by tags.
v9:
Cleared lpm when allocation failed. (David)
v8:
Fixed ABI issue by adding internal LPM control structure. (David)
Changed to use RFC5737 address in unit test. (Vladimir)
v7:
Fixed typos in document.
v6:
Remove ALLOW_EXPERIMENTAL_API from rte_lpm.c.
v5:
No default value for reclaim_thd. This allows reclamation triggering with every call.
Pass LPM pointer instead of tbl8 as argument of reclaim callback free function.
Updated group_idx check at tbl8 allocation.
Use enums instead of defines for different reclamation modes.
RCU QSBR integrated path is inside ALLOW_EXPERIMENTAL_API to avoid ABI change.
v4:
Allow user to configure defer queue: size, reclaim threshold, max entries.
Return defer queue handle so user can manually trigger reclamation.
Add blocking mode support. Defer queue will not be created.
Honnappa Nagarahalli (1):
test/lpm: add RCU integration performance tests
Ruifeng Wang (2):
lib/lpm: integrate RCU QSBR
test/lpm: add LPM RCU integration functional tests
app/test/test_lpm.c | 291 ++++++++++++++++-
app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++-
doc/guides/prog_guide/lpm_lib.rst | 32 ++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 165 ++++++++--
lib/librte_lpm/rte_lpm.h | 53 ++++
lib/librte_lpm/rte_lpm_version.map | 6 +
8 files changed, 1016 insertions(+), 26 deletions(-)
--
2.17.1
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-09 12:36 2% ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
@ 2020-07-09 23:47 0% ` Ferruh Yigit
2020-07-10 12:32 0% ` Slava Ovsiienko
0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2020-07-09 23:47 UTC (permalink / raw)
To: Viacheslav Ovsiienko, dev
Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas,
Andrew Rybchenko
On 7/9/2020 1:36 PM, Viacheslav Ovsiienko wrote:
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> the packets at the very precisely specified moment of time provides
> the opportunity to support the connections with Time Division
> Multiplexing using a contemporary general-purpose NIC without involving
> auxiliary hardware. For example, support for the O-RAN Fronthaul
> interface is one of the promising features for potential usage of
> precise time management for egress packets.
Is this HW support, or is the scheduling planned to be done in the driver?
>
> The main objective of this RFC is to specify the way how applications
It is no more RFC.
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
I was about to ask this: will there be a PMD counterpart implementation of the
feature? It would be better to have it as part of this set.
What is the plan for the PMD implementation?
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
>
> The device clock is an opaque entity; the units and frequency are
> vendor specific and might depend on hardware capabilities and
> configurations. It might (or might not) be synchronized with real time
> via PTP, and might (or might not) be synchronous with the CPU clock (for
> example, if the NIC and CPU share the same clock source there might be
> no drift between the NIC and CPU clocks), etc.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well. Having the dedicated
> flags for Rx/Tx timestamps allows applications not to perform explicit
> flags reset on forwarding and not to promote received timestamps
> to the transmitting datapath by default. The static PKT_RX_TIMESTAMP
> is considered as candidate to become the dynamic flag.
Is there a deprecation notice for 'PKT_RX_TIMESTAMP'? Is this decided?
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. If the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.
>
> No packet reordering according to timestamps is assumed, neither
> within a packet burst nor between packets; it is entirely the
> application's responsibility to generate packets and their timestamps
> in the desired order. The timestamps can be put only in the first packet
> in the burst, providing scheduling for the entire burst.
>
> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be introduced
> at the moment, so is postponed to 20.11.
Good to hear that there will be a generic API to get supported dynamic flags. I
was concerned about adding the 'DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP' flag, since I
am not sure whether any other PMD will want to use it.
The trouble is it is hard to remove a public macro after it is introduced, in
this release I think only single PMD (mlx) will support this feature, and in
next release the plan is to remove the macro. In this case what do you think to
not introduce the flag at all?
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
'rte_eth_read_clock()'?
> current device clock value and provide the reference for the timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
> ---
> v1->v4:
> - dedicated dynamic Tx timestamp flag instead of shared with Rx
> v4->v5:
> - elaborated commit message
> - more words about device clocks added,
> - note about dedicated Rx/Tx timestamp flags added
> v5->v6:
> - release notes are updated
> ---
> doc/guides/rel_notes/release_20_08.rst | 6 ++++++
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
> 4 files changed, 42 insertions(+)
>
> diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
> index 988474c..5527bab 100644
> --- a/doc/guides/rel_notes/release_20_08.rst
> +++ b/doc/guides/rel_notes/release_20_08.rst
> @@ -200,6 +200,12 @@ New Features
> See the :doc:`../sample_app_ug/l2_forward_real_virtual` for more
> details of this parameter usage.
>
> +* **Introduced send packet scheduling on the timestamps.**
> +
> + Added the new mbuf dynamic field and flag to provide timestamp on what packet
> + transmitting can be synchronized. The device Tx offload flag is added to
> + indicate the PMD supports send scheduling.
> +
This is a core library change, can go up in the section, please check the
section comment for the ordering details.
>
> Removed Items
> -------------
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 7022bd7..c48ca2a 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -160,6 +160,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index 631b146..97313a0 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */
> #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
Please cc the ethdev maintainers.
As mentioned above, my concern is whether this is generic enough or whether we
are adding a flag for a specific PMD. And since the commit log says this is a
temporary solution for just this release, I repeat my question: can we remove
the flag completely?
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v4 1/2] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 10:10 4% ` [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins " Phil Yang
2020-07-09 11:03 3% ` Olivier Matz
@ 2020-07-09 15:58 4% ` Phil Yang
2020-07-15 12:29 0% ` David Marchand
2020-07-17 4:36 4% ` [dpdk-dev] [PATCH v5 1/2] mbuf: use C11 atomic builtins " Phil Yang
1 sibling, 2 replies; 200+ results
From: Phil Yang @ 2020-07-09 15:58 UTC (permalink / raw)
To: olivier.matz, dev
Cc: stephen, david.marchand, drc, Honnappa.Nagarahalli, Ruifeng.Wang, nd
Use C11 atomic built-ins with explicit ordering instead of rte_atomic
ops which enforce unnecessary barriers on aarch64.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v4:
1. Add union for refcnt_atomic and refcnt in rte_mbuf_ext_shared_info
to avoid ABI breakage. (Olivier)
2. Add notice of refcnt_atomic deprecation. (Honnappa)
v3:
1.Fix ABI breakage.
2.Simplify data type cast.
v2:
Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
to refcnt_atomic.
lib/librte_mbuf/rte_mbuf.c | 1 -
lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
lib/librte_mbuf/rte_mbuf_core.h | 6 +++++-
3 files changed, 15 insertions(+), 11 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index ae91ae2..8a456e5 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f8e492e..7259575 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -37,7 +37,6 @@
#include <rte_config.h>
#include <rte_mempool.h>
#include <rte_memory.h>
-#include <rte_atomic.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
#include <rte_byteorder.h>
@@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
static inline uint16_t
rte_mbuf_refcnt_read(const struct rte_mbuf *m)
{
- return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
+ return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
static inline void
rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
{
- rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
}
/* internal */
static inline uint16_t
__rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
{
- return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
+ return __atomic_add_fetch(&m->refcnt, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/**
@@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
static inline uint16_t
rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
{
- return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
+ return __atomic_load_n(&shinfo->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -481,7 +481,7 @@ static inline void
rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
uint16_t new_value)
{
- rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&shinfo->refcnt, new_value, __ATOMIC_RELAXED);
}
/**
@@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
return (uint16_t)value;
}
- return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
+ return __atomic_add_fetch(&shinfo->refcnt, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/** Mbuf prefetch */
@@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
* Direct usage of add primitive to avoid
* duplication of comparing with one.
*/
- if (likely(rte_atomic16_add_return
- (&shinfo->refcnt_atomic, -1)))
+ if (likely(__atomic_add_fetch(&shinfo->refcnt, (uint16_t)-1,
+ __ATOMIC_ACQ_REL)))
return 1;
/* Reinitialize counter before mbuf freeing. */
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 16600f1..8cd7137 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -679,7 +679,11 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
struct rte_mbuf_ext_shared_info {
rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
void *fcb_opaque; /**< Free callback argument */
- rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ RTE_STD_C11
+ union {
+ rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ uint16_t refcnt;
+ };
};
/**< Maximum number of nb_segs allowed. */
--
2.7.4
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
2020-07-08 13:09 7% ` Kinsella, Ray
@ 2020-07-09 15:52 4% ` Dodji Seketeli
2020-07-10 7:37 4% ` Kinsella, Ray
` (2 subsequent siblings)
4 siblings, 0 replies; 200+ results
From: Dodji Seketeli @ 2020-07-09 15:52 UTC (permalink / raw)
To: David Marchand; +Cc: dev, thomas, Ray Kinsella, Neil Horman
Hello,
David Marchand <david.marchand@redhat.com> writes:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
For what it's worth, the change looks good to me, at least from an
abidiff perspective.
Thanks.
Cheers.
--
Dodji
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH v9 1/3] lib/lpm: integrate RCU QSBR
2020-07-09 15:42 4% ` [dpdk-dev] [PATCH v9 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-09 15:42 2% ` Ruifeng Wang
0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-09 15:42 UTC (permalink / raw)
To: Bruce Richardson, Vladimir Medvedkin, John McNamara,
Marko Kovacevic, Ray Kinsella, Neil Horman
Cc: dev, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
Currently, the tbl8 group is freed even though the readers might be
using the tbl8 group entries. The freed tbl8 group can be reallocated
quickly. This results in incorrect lookup results.
RCU QSBR process is integrated for safe tbl8 group reclaim.
Refer to RCU documentation to understand various aspects of
integrating RCU library into other libraries.
To avoid ABI breakage, a struct __rte_lpm is created for lpm library
internal use. This struct wraps rte_lpm that has been exposed and
also includes members that don't need to be exposed such as RCU related
config.
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
doc/guides/prog_guide/lpm_lib.rst | 32 ++++++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 165 +++++++++++++++++++++++++----
lib/librte_lpm/rte_lpm.h | 53 +++++++++
lib/librte_lpm/rte_lpm_version.map | 6 ++
6 files changed, 237 insertions(+), 22 deletions(-)
diff --git a/doc/guides/prog_guide/lpm_lib.rst b/doc/guides/prog_guide/lpm_lib.rst
index 1609a57d0..03945904b 100644
--- a/doc/guides/prog_guide/lpm_lib.rst
+++ b/doc/guides/prog_guide/lpm_lib.rst
@@ -145,6 +145,38 @@ depending on whether we need to move to the next table or not.
Prefix expansion is one of the keys of this algorithm,
since it improves the speed dramatically by adding redundancy.
+Deletion
+~~~~~~~~
+
+When deleting a rule, a replacement rule is searched for. Replacement rule is an existing rule that has
+the longest prefix match with the rule to be deleted, but has shorter prefix.
+
+If a replacement rule is found, target tbl24 and tbl8 entries are updated to have the same depth and next hop
+value with the replacement rule.
+
+If no replacement rule can be found, target tbl24 and tbl8 entries will be cleared.
+
+Prefix expansion is performed if the rule's depth is not exactly 24 bits or 32 bits.
+
+After deleting a rule, a group of tbl8s that belongs to the same tbl24 entry are freed in following cases:
+
+* All tbl8s in the group are empty.
+
+* All tbl8s in the group have the same values and with depth no greater than 24.
+
+Freeing of tbl8s has different behaviors:
+
+* If RCU is not used, tbl8s are cleared and reclaimed immediately.
+
+* If RCU is used, tbl8s are reclaimed when readers are in quiescent state.
+
+When the LPM is not using RCU, tbl8 group can be freed immediately even though the readers might be using
+the tbl8 group entries. This might result in incorrect lookup results.
+
+RCU QSBR process is integrated for safe tbl8 group reclamation. Application has certain responsibilities
+while using this feature. Please refer to resource reclamation framework of :ref:`RCU library <RCU_Library>`
+for more details.
+
Lookup
~~~~~~
diff --git a/lib/librte_lpm/Makefile b/lib/librte_lpm/Makefile
index d682785b6..6f06c5c03 100644
--- a/lib/librte_lpm/Makefile
+++ b/lib/librte_lpm/Makefile
@@ -8,7 +8,7 @@ LIB = librte_lpm.a
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_hash
+LDLIBS += -lrte_eal -lrte_hash -lrte_rcu
EXPORT_MAP := rte_lpm_version.map
diff --git a/lib/librte_lpm/meson.build b/lib/librte_lpm/meson.build
index 021ac6d8d..6cfc083c5 100644
--- a/lib/librte_lpm/meson.build
+++ b/lib/librte_lpm/meson.build
@@ -7,3 +7,4 @@ headers = files('rte_lpm.h', 'rte_lpm6.h')
# without worrying about which architecture we actually need
headers += files('rte_lpm_altivec.h', 'rte_lpm_neon.h', 'rte_lpm_sse.h')
deps += ['hash']
+deps += ['rcu']
diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 38ab512a4..2d687c372 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#include <string.h>
@@ -39,6 +40,17 @@ enum valid_flag {
VALID
};
+/** @internal LPM structure. */
+struct __rte_lpm {
+ /* LPM metadata. */
+ struct rte_lpm lpm;
+
+ /* RCU config. */
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
+ struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
+};
+
/* Macro to enable/disable run-time checks. */
#if defined(RTE_LIBRTE_LPM_DEBUG)
#include <rte_debug.h>
@@ -122,6 +134,7 @@ rte_lpm_create(const char *name, int socket_id,
const struct rte_lpm_config *config)
{
char mem_name[RTE_LPM_NAMESIZE];
+ struct __rte_lpm *internal_lpm;
struct rte_lpm *lpm = NULL;
struct rte_tailq_entry *te;
uint32_t mem_size, rules_size, tbl8s_size;
@@ -140,12 +153,6 @@ rte_lpm_create(const char *name, int socket_id,
snprintf(mem_name, sizeof(mem_name), "LPM_%s", name);
- /* Determine the amount of memory to allocate. */
- mem_size = sizeof(*lpm);
- rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
- tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
- RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
-
rte_mcfg_tailq_write_lock();
/* guarantee there's no existing */
@@ -161,6 +168,12 @@ rte_lpm_create(const char *name, int socket_id,
goto exit;
}
+ /* Determine the amount of memory to allocate. */
+ mem_size = sizeof(*internal_lpm);
+ rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
+ tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
+
/* allocate tailq entry */
te = rte_zmalloc("LPM_TAILQ_ENTRY", sizeof(*te), 0);
if (te == NULL) {
@@ -170,21 +183,23 @@ rte_lpm_create(const char *name, int socket_id,
}
/* Allocate memory to store the LPM data structures. */
- lpm = rte_zmalloc_socket(mem_name, mem_size,
+ internal_lpm = rte_zmalloc_socket(mem_name, mem_size,
RTE_CACHE_LINE_SIZE, socket_id);
- if (lpm == NULL) {
+ if (internal_lpm == NULL) {
RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
rte_free(te);
rte_errno = ENOMEM;
goto exit;
}
+ lpm = &internal_lpm->lpm;
lpm->rules_tbl = rte_zmalloc_socket(NULL,
(size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
if (lpm->rules_tbl == NULL) {
RTE_LOG(ERR, LPM, "LPM rules_tbl memory allocation failed\n");
- rte_free(lpm);
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
@@ -197,7 +212,8 @@ rte_lpm_create(const char *name, int socket_id,
if (lpm->tbl8 == NULL) {
RTE_LOG(ERR, LPM, "LPM tbl8 memory allocation failed\n");
rte_free(lpm->rules_tbl);
- rte_free(lpm);
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
@@ -225,6 +241,7 @@ rte_lpm_create(const char *name, int socket_id,
void
rte_lpm_free(struct rte_lpm *lpm)
{
+ struct __rte_lpm *internal_lpm;
struct rte_lpm_list *lpm_list;
struct rte_tailq_entry *te;
@@ -246,12 +263,84 @@ rte_lpm_free(struct rte_lpm *lpm)
rte_mcfg_tailq_write_unlock();
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->dq)
+ rte_rcu_qsbr_dq_delete(internal_lpm->dq);
rte_free(lpm->tbl8);
rte_free(lpm->rules_tbl);
rte_free(lpm);
rte_free(te);
}
+static void
+__lpm_rcu_qsbr_free_resource(void *p, void *data, unsigned int n)
+{
+ struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
+ uint32_t tbl8_group_index = *(uint32_t *)data;
+ struct rte_lpm_tbl_entry *tbl8 = ((struct rte_lpm *)p)->tbl8;
+
+ RTE_SET_USED(n);
+ /* Set tbl8 group invalid */
+ __atomic_store(&tbl8[tbl8_group_index], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+}
+
+/* Associate QSBR variable with an LPM object.
+ */
+int
+rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq)
+{
+ struct __rte_lpm *internal_lpm;
+ char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
+ struct rte_rcu_qsbr_dq_parameters params = {0};
+
+ if (lpm == NULL || cfg == NULL) {
+ rte_errno = EINVAL;
+ return 1;
+ }
+
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->v != NULL) {
+ rte_errno = EEXIST;
+ return 1;
+ }
+
+ if (cfg->mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* No other things to do. */
+ } else if (cfg->mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Init QSBR defer queue. */
+ snprintf(rcu_dq_name, sizeof(rcu_dq_name),
+ "LPM_RCU_%s", lpm->name);
+ params.name = rcu_dq_name;
+ params.size = cfg->dq_size;
+ if (params.size == 0)
+ params.size = lpm->number_tbl8s;
+ params.trigger_reclaim_limit = cfg->reclaim_thd;
+ params.max_reclaim_size = cfg->reclaim_max;
+ if (params.max_reclaim_size == 0)
+ params.max_reclaim_size = RTE_LPM_RCU_DQ_RECLAIM_MAX;
+ params.esize = sizeof(uint32_t); /* tbl8 group index */
+ params.free_fn = __lpm_rcu_qsbr_free_resource;
+ params.p = lpm;
+ params.v = cfg->v;
+ internal_lpm->dq = rte_rcu_qsbr_dq_create(&params);
+ if (internal_lpm->dq == NULL) {
+ RTE_LOG(ERR, LPM, "LPM defer queue creation failed\n");
+ return 1;
+ }
+ if (dq)
+ *dq = internal_lpm->dq;
+ } else {
+ rte_errno = EINVAL;
+ return 1;
+ }
+ internal_lpm->rcu_mode = cfg->mode;
+ internal_lpm->v = cfg->v;
+
+ return 0;
+}
+
/*
* Adds a rule to the rule table.
*
@@ -394,14 +483,15 @@ rule_find(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth)
* Find, clean and allocate a tbl8.
*/
static int32_t
-tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
+_tbl8_alloc(struct rte_lpm *lpm)
{
uint32_t group_idx; /* tbl8 group index. */
struct rte_lpm_tbl_entry *tbl8_entry;
/* Scan through tbl8 to find a free (i.e. INVALID) tbl8 group. */
- for (group_idx = 0; group_idx < number_tbl8s; group_idx++) {
- tbl8_entry = &tbl8[group_idx * RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+ for (group_idx = 0; group_idx < lpm->number_tbl8s; group_idx++) {
+ tbl8_entry = &lpm->tbl8[group_idx *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
/* If a free tbl8 group is found clean it and set as VALID. */
if (!tbl8_entry->valid_group) {
struct rte_lpm_tbl_entry new_tbl8_entry = {
@@ -427,14 +517,47 @@ tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
return -ENOSPC;
}
+static int32_t
+tbl8_alloc(struct rte_lpm *lpm)
+{
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
+ int32_t group_idx; /* tbl8 group index. */
+
+ group_idx = _tbl8_alloc(lpm);
+ if (group_idx == -ENOSPC && internal_lpm->dq != NULL) {
+ /* If there are no tbl8 groups try to reclaim one. */
+ if (rte_rcu_qsbr_dq_reclaim(internal_lpm->dq, 1,
+ NULL, NULL, NULL) == 0)
+ group_idx = _tbl8_alloc(lpm);
+ }
+
+ return group_idx;
+}
+
static void
-tbl8_free(struct rte_lpm_tbl_entry *tbl8, uint32_t tbl8_group_start)
+tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
{
- /* Set tbl8 group invalid*/
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
- __atomic_store(&tbl8[tbl8_group_start], &zero_tbl8_entry,
- __ATOMIC_RELAXED);
+ if (internal_lpm->v == NULL) {
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* Wait for quiescent state change. */
+ rte_rcu_qsbr_synchronize(internal_lpm->v,
+ RTE_QSBR_THRID_INVALID);
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Push into QSBR defer queue. */
+ rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
+ (void *)&tbl8_group_start);
+ }
}
static __rte_noinline int32_t
@@ -523,7 +646,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
if (!lpm->tbl24[tbl24_index].valid) {
/* Search for a free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
/* Check tbl8 allocation was successful. */
if (tbl8_group_index < 0) {
@@ -569,7 +692,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
} /* If valid entry but not extended calculate the index into Table8. */
else if (lpm->tbl24[tbl24_index].valid_group == 0) {
/* Search for free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
if (tbl8_group_index < 0) {
return tbl8_group_index;
@@ -977,7 +1100,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
*/
lpm->tbl24[tbl24_index].valid = 0;
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
} else if (tbl8_recycle_index > -1) {
/* Update tbl24 entry. */
struct rte_lpm_tbl_entry new_tbl24_entry = {
@@ -993,7 +1116,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
__atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
__ATOMIC_RELAXED);
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
}
#undef group_idx
return 0;
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index b9d49ac87..a9568fcdd 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#ifndef _RTE_LPM_H_
@@ -20,6 +21,7 @@
#include <rte_memory.h>
#include <rte_common.h>
#include <rte_vect.h>
+#include <rte_rcu_qsbr.h>
#ifdef __cplusplus
extern "C" {
@@ -62,6 +64,17 @@ extern "C" {
/** Bitmask used to indicate successful lookup */
#define RTE_LPM_LOOKUP_SUCCESS 0x01000000
+/** @internal Default RCU defer queue entries to reclaim in one go. */
+#define RTE_LPM_RCU_DQ_RECLAIM_MAX 16
+
+/** RCU reclamation modes */
+enum rte_lpm_qsbr_mode {
+ /** Create defer queue for reclaim. */
+ RTE_LPM_QSBR_MODE_DQ = 0,
+ /** Use blocking mode reclaim. No defer queue created. */
+ RTE_LPM_QSBR_MODE_SYNC
+};
+
#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
/** @internal Tbl24 entry structure. */
__extension__
@@ -132,6 +145,22 @@ struct rte_lpm {
struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
};
+/** LPM RCU QSBR configuration structure. */
+struct rte_lpm_rcu_config {
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ /* Mode of RCU QSBR. RTE_LPM_QSBR_MODE_xxx
+ * '0' for default: create defer queue for reclaim.
+ */
+ enum rte_lpm_qsbr_mode mode;
+ uint32_t dq_size; /* RCU defer queue size.
+ * default: lpm->number_tbl8s.
+ */
+ uint32_t reclaim_thd; /* Threshold to trigger auto reclaim. */
+ uint32_t reclaim_max; /* Max entries to reclaim in one go.
+ * default: RTE_LPM_RCU_DQ_RECLAIM_MAX.
+ */
+};
+
/**
* Create an LPM object.
*
@@ -179,6 +208,30 @@ rte_lpm_find_existing(const char *name);
void
rte_lpm_free(struct rte_lpm *lpm);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Associate RCU QSBR variable with an LPM object.
+ *
+ * @param lpm
+ * the lpm object to add RCU QSBR
+ * @param cfg
+ * RCU QSBR configuration
+ * @param dq
+ * handler of created RCU QSBR defer queue
+ * @return
+ * On success - 0
+ * On error - 1 with error code set in rte_errno.
+ * Possible rte_errno codes are:
+ * - EINVAL - invalid pointer
+ * - EEXIST - already added QSBR
+ * - ENOMEM - memory allocation failure
+ */
+__rte_experimental
+int rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq);
+
/**
* Add a rule to the LPM table.
*
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 500f58b80..bfccd7eac 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -21,3 +21,9 @@ DPDK_20.0 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ rte_lpm_rcu_qsbr_add;
+};
--
2.17.1
^ permalink raw reply [relevance 2%]
* [dpdk-dev] [PATCH v9 0/3] RCU integration with LPM library
` (3 preceding siblings ...)
2020-07-09 8:02 4% ` [dpdk-dev] [PATCH v8 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-09 15:42 4% ` Ruifeng Wang
2020-07-09 15:42 2% ` [dpdk-dev] [PATCH v9 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-10 2:22 4% ` [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library Ruifeng Wang
5 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-09 15:42 UTC (permalink / raw)
Cc: dev, mdr, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
This patchset integrates RCU QSBR support with LPM library.
The resource reclamation implementation was split out of the original
series and is already part of the RCU library. This series is reworked
to base the LPM integration on the RCU reclamation APIs.
A new API, rte_lpm_rcu_qsbr_add, is introduced for an application to
register an RCU variable that the LPM library will use. This gives the
user a handle to enable the RCU mechanism integrated in the LPM library.
Functional tests and performance tests are added to cover the
integration with RCU.
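For reference, a minimal usage sketch of the new API (not part of the patch): it assumes an initialized EAL, an existing `lpm` object and a single reader thread; the error handling is illustrative only.

```c
/* Sketch only: "lpm" is an existing rte_lpm object; one reader thread. */
struct rte_lpm_rcu_config cfg = {0};
struct rte_rcu_qsbr_dq *dq = NULL;
struct rte_rcu_qsbr *v;

/* Allocate and initialize a QSBR variable sized for one reader thread. */
v = rte_zmalloc(NULL, rte_rcu_qsbr_get_memsize(1), RTE_CACHE_LINE_SIZE);
rte_rcu_qsbr_init(v, 1);

cfg.v = v;
cfg.mode = RTE_LPM_QSBR_MODE_DQ;  /* default: reclaim via defer queue */

/* Attach RCU to the LPM object; "dq" allows manual reclamation later. */
if (rte_lpm_rcu_qsbr_add(lpm, &cfg, &dq) != 0)
	/* rte_errno is one of EINVAL, EEXIST, ENOMEM */
	rte_panic("cannot attach RCU to LPM\n");
```

With `RTE_LPM_QSBR_MODE_SYNC` instead, no defer queue is created and tbl8 frees block on `rte_rcu_qsbr_synchronize()`.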
---
v9:
Cleared lpm when allocation failed. (David)
v8:
Fixed ABI issue by adding internal LPM control structure. (David)
Changed to use RFC5737 address in unit test. (Vladimir)
v7:
Fixed typos in document.
v6:
Remove ALLOW_EXPERIMENTAL_API from rte_lpm.c.
v5:
No default value for reclaim_thd; this allows reclamation to be triggered on every call.
Pass LPM pointer instead of tbl8 as argument of reclaim callback free function.
Updated group_idx check at tbl8 allocation.
Use enums instead of defines for different reclamation modes.
RCU QSBR integrated path is inside ALLOW_EXPERIMENTAL_API to avoid ABI change.
v4:
Allow user to configure defer queue: size, reclaim threshold, max entries.
Return defer queue handle so user can manually trigger reclamation.
Add blocking mode support. Defer queue will not be created.
Honnappa Nagarahalli (1):
test/lpm: add RCU integration performance tests
Ruifeng Wang (2):
lib/lpm: integrate RCU QSBR
test/lpm: add LPM RCU integration functional tests
app/test/test_lpm.c | 291 ++++++++++++++++-
app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++-
doc/guides/prog_guide/lpm_lib.rst | 32 ++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 165 ++++++++--
lib/librte_lpm/rte_lpm.h | 53 ++++
lib/librte_lpm/rte_lpm_version.map | 6 +
8 files changed, 1016 insertions(+), 26 deletions(-)
--
2.17.1
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH 20.11 4/5] rawdev: add private data length parameter to queue fns
2020-07-09 15:20 4% [dpdk-dev] [PATCH 20.11 0/5] Enhance rawdev APIs Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn Bruce Richardson
@ 2020-07-09 15:20 3% ` Bruce Richardson
2 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2020-07-09 15:20 UTC (permalink / raw)
To: Nipun Gupta, Hemant Agrawal
Cc: dev, Rosen Xu, Tianfei zhang, Xiaoyun Li, Jingjing Wu, Satha Rao,
Mahipal Challa, Jerin Jacob, Bruce Richardson
The queue setup and queue defaults query functions take a void * parameter
as configuration data, preventing any compile-time checking of the
parameters and limiting runtime checks. Adding in the length of the
expected structure provides a measure of typechecking, and can also be used
for ABI compatibility in future, since ABI changes involving structs almost
always involve a change in size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
drivers/raw/ntb/ntb.c | 25 ++++++++++++++++-----
drivers/raw/skeleton/skeleton_rawdev.c | 10 +++++----
drivers/raw/skeleton/skeleton_rawdev_test.c | 8 +++----
examples/ntb/ntb_fwd.c | 3 ++-
lib/librte_rawdev/rte_rawdev.c | 10 +++++----
lib/librte_rawdev/rte_rawdev.h | 10 +++++++--
lib/librte_rawdev/rte_rawdev_pmd.h | 6 +++--
7 files changed, 49 insertions(+), 23 deletions(-)
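As an aside, the size-check pattern this patch introduces can be sketched in plain, self-contained C; the struct and function names below are illustrative, not DPDK APIs.

```c
#include <stddef.h>

/* Illustrative queue-config struct; not a DPDK type. */
struct my_queue_conf {
	unsigned int depth;
	unsigned int flags;
};

/* Driver-side check sketched from the patch: the config blob is used
 * only when its length matches the struct this build was compiled
 * against, giving a measure of type checking across the void * API.
 */
static int
my_queue_setup(const void *queue_conf, size_t conf_size)
{
	const struct my_queue_conf *conf = queue_conf;

	if (conf == NULL || conf_size != sizeof(*conf))
		return -22; /* value of -EINVAL on Linux */
	return 0; /* configure the queue from *conf here */
}
```

A caller passes `sizeof(conf)` together with the pointer, matching the updated `rte_rawdev_queue_setup()` calls in this patch.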
diff --git a/drivers/raw/ntb/ntb.c b/drivers/raw/ntb/ntb.c
index c181094d5..7c15e204c 100644
--- a/drivers/raw/ntb/ntb.c
+++ b/drivers/raw/ntb/ntb.c
@@ -249,11 +249,15 @@ ntb_dev_intr_handler(void *param)
static void
ntb_queue_conf_get(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
struct ntb_queue_conf *q_conf = queue_conf;
struct ntb_hw *hw = dev->dev_private;
+ if (conf_size != sizeof(*q_conf))
+ return;
+
q_conf->tx_free_thresh = hw->tx_queues[queue_id]->tx_free_thresh;
q_conf->nb_desc = hw->rx_queues[queue_id]->nb_rx_desc;
q_conf->rx_mp = hw->rx_queues[queue_id]->mpool;
@@ -294,12 +298,16 @@ ntb_rxq_release(struct ntb_rx_queue *rxq)
static int
ntb_rxq_setup(struct rte_rawdev *dev,
uint16_t qp_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
struct ntb_queue_conf *rxq_conf = queue_conf;
struct ntb_hw *hw = dev->dev_private;
struct ntb_rx_queue *rxq;
+ if (conf_size != sizeof(*rxq_conf))
+ return -EINVAL;
+
/* Allocate the rx queue data structure */
rxq = rte_zmalloc_socket("ntb rx queue",
sizeof(struct ntb_rx_queue),
@@ -375,13 +383,17 @@ ntb_txq_release(struct ntb_tx_queue *txq)
static int
ntb_txq_setup(struct rte_rawdev *dev,
uint16_t qp_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
struct ntb_queue_conf *txq_conf = queue_conf;
struct ntb_hw *hw = dev->dev_private;
struct ntb_tx_queue *txq;
uint16_t i, prev;
+ if (conf_size != sizeof(*txq_conf))
+ return -EINVAL;
+
/* Allocate the TX queue data structure. */
txq = rte_zmalloc_socket("ntb tx queue",
sizeof(struct ntb_tx_queue),
@@ -439,7 +451,8 @@ ntb_txq_setup(struct rte_rawdev *dev,
static int
ntb_queue_setup(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
struct ntb_hw *hw = dev->dev_private;
int ret;
@@ -447,11 +460,11 @@ ntb_queue_setup(struct rte_rawdev *dev,
if (queue_id >= hw->queue_pairs)
return -EINVAL;
- ret = ntb_txq_setup(dev, queue_id, queue_conf);
+ ret = ntb_txq_setup(dev, queue_id, queue_conf, conf_size);
if (ret < 0)
return ret;
- ret = ntb_rxq_setup(dev, queue_id, queue_conf);
+ ret = ntb_rxq_setup(dev, queue_id, queue_conf, conf_size);
return ret;
}
diff --git a/drivers/raw/skeleton/skeleton_rawdev.c b/drivers/raw/skeleton/skeleton_rawdev.c
index 531d0450c..f109e4d2c 100644
--- a/drivers/raw/skeleton/skeleton_rawdev.c
+++ b/drivers/raw/skeleton/skeleton_rawdev.c
@@ -222,14 +222,15 @@ static int skeleton_rawdev_reset(struct rte_rawdev *dev)
static void skeleton_rawdev_queue_def_conf(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
struct skeleton_rawdev *skeldev;
struct skeleton_rawdev_queue *skelq;
SKELETON_PMD_FUNC_TRACE();
- if (!dev || !queue_conf)
+ if (!dev || !queue_conf || conf_size != sizeof(struct skeleton_rawdev_queue))
return;
skeldev = skeleton_rawdev_get_priv(dev);
@@ -252,7 +253,8 @@ clear_queue_bufs(int queue_id)
static int skeleton_rawdev_queue_setup(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t conf_size)
{
int ret = 0;
struct skeleton_rawdev *skeldev;
@@ -260,7 +262,7 @@ static int skeleton_rawdev_queue_setup(struct rte_rawdev *dev,
SKELETON_PMD_FUNC_TRACE();
- if (!dev || !queue_conf)
+ if (!dev || !queue_conf || conf_size != sizeof(struct skeleton_rawdev_queue))
return -EINVAL;
skeldev = skeleton_rawdev_get_priv(dev);
diff --git a/drivers/raw/skeleton/skeleton_rawdev_test.c b/drivers/raw/skeleton/skeleton_rawdev_test.c
index 7dc7c7684..bb4b6efe4 100644
--- a/drivers/raw/skeleton/skeleton_rawdev_test.c
+++ b/drivers/raw/skeleton/skeleton_rawdev_test.c
@@ -185,7 +185,7 @@ test_rawdev_queue_default_conf_get(void)
* depth = DEF_DEPTH
*/
for (i = 0; i < rdev_conf_get.num_queues; i++) {
- rte_rawdev_queue_conf_get(test_dev_id, i, &q);
+ rte_rawdev_queue_conf_get(test_dev_id, i, &q, sizeof(q));
RTE_TEST_ASSERT_EQUAL(q.depth, SKELETON_QUEUE_DEF_DEPTH,
"Invalid default depth of queue (%d)",
q.depth);
@@ -235,11 +235,11 @@ test_rawdev_queue_setup(void)
/* Modify the queue depth for Queue 0 and attach it */
qset.depth = 15;
qset.state = SKELETON_QUEUE_ATTACH;
- ret = rte_rawdev_queue_setup(test_dev_id, 0, &qset);
+ ret = rte_rawdev_queue_setup(test_dev_id, 0, &qset, sizeof(qset));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to setup queue (%d)", ret);
/* Now, fetching the queue 0 should show depth as 15 */
- ret = rte_rawdev_queue_conf_get(test_dev_id, 0, &qget);
+ ret = rte_rawdev_queue_conf_get(test_dev_id, 0, &qget, sizeof(qget));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to get queue config (%d)", ret);
RTE_TEST_ASSERT_EQUAL(qset.depth, qget.depth,
@@ -263,7 +263,7 @@ test_rawdev_queue_release(void)
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to release queue 0; (%d)", ret);
/* Now, fetching the queue 0 should show depth as default */
- ret = rte_rawdev_queue_conf_get(test_dev_id, 0, &qget);
+ ret = rte_rawdev_queue_conf_get(test_dev_id, 0, &qget, sizeof(qget));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to get queue config (%d)", ret);
RTE_TEST_ASSERT_EQUAL(qget.depth, SKELETON_QUEUE_DEF_DEPTH,
diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index 656f73659..5a8439b8d 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -1411,7 +1411,8 @@ main(int argc, char **argv)
ntb_q_conf.rx_mp = mbuf_pool;
for (i = 0; i < num_queues; i++) {
/* Setup rawdev queue */
- ret = rte_rawdev_queue_setup(dev_id, i, &ntb_q_conf);
+ ret = rte_rawdev_queue_setup(dev_id, i, &ntb_q_conf,
+ sizeof(ntb_q_conf));
if (ret < 0)
rte_exit(EXIT_FAILURE,
"Failed to setup ntb queue %u.\n", i);
diff --git a/lib/librte_rawdev/rte_rawdev.c b/lib/librte_rawdev/rte_rawdev.c
index 6c4d783cc..8965f2ce3 100644
--- a/lib/librte_rawdev/rte_rawdev.c
+++ b/lib/librte_rawdev/rte_rawdev.c
@@ -137,7 +137,8 @@ rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
int
rte_rawdev_queue_conf_get(uint16_t dev_id,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size)
{
struct rte_rawdev *dev;
@@ -145,14 +146,15 @@ rte_rawdev_queue_conf_get(uint16_t dev_id,
dev = &rte_rawdevs[dev_id];
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->queue_def_conf, -ENOTSUP);
- (*dev->dev_ops->queue_def_conf)(dev, queue_id, queue_conf);
+ (*dev->dev_ops->queue_def_conf)(dev, queue_id, queue_conf, queue_conf_size);
return 0;
}
int
rte_rawdev_queue_setup(uint16_t dev_id,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf)
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size)
{
struct rte_rawdev *dev;
@@ -160,7 +162,7 @@ rte_rawdev_queue_setup(uint16_t dev_id,
dev = &rte_rawdevs[dev_id];
RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->queue_setup, -ENOTSUP);
- return (*dev->dev_ops->queue_setup)(dev, queue_id, queue_conf);
+ return (*dev->dev_ops->queue_setup)(dev, queue_id, queue_conf, queue_conf_size);
}
int
diff --git a/lib/librte_rawdev/rte_rawdev.h b/lib/librte_rawdev/rte_rawdev.h
index 73e3bd5ae..bbd63913a 100644
--- a/lib/librte_rawdev/rte_rawdev.h
+++ b/lib/librte_rawdev/rte_rawdev.h
@@ -146,6 +146,8 @@ rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
* previously supplied to rte_rawdev_configure().
* @param[out] queue_conf
* The pointer to the default raw queue configuration data.
+ * @param queue_conf_size
+ * The size of the structure pointed to by queue_conf
* @return
* - 0: Success, driver updates the default raw queue configuration data.
* - <0: Error code returned by the driver info get function.
@@ -156,7 +158,8 @@ rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
int
rte_rawdev_queue_conf_get(uint16_t dev_id,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf);
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size);
/**
* Allocate and set up a raw queue for a raw device.
@@ -169,6 +172,8 @@ rte_rawdev_queue_conf_get(uint16_t dev_id,
* @param queue_conf
* The pointer to the configuration data to be used for the raw queue.
* NULL value is allowed, in which case default configuration used.
+ * @param queue_conf_size
+ * The size of the structure pointed to by queue_conf
*
* @see rte_rawdev_queue_conf_get()
*
@@ -179,7 +184,8 @@ rte_rawdev_queue_conf_get(uint16_t dev_id,
int
rte_rawdev_queue_setup(uint16_t dev_id,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf);
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size);
/**
* Release and deallocate a raw queue from a raw device.
diff --git a/lib/librte_rawdev/rte_rawdev_pmd.h b/lib/librte_rawdev/rte_rawdev_pmd.h
index 050f8b029..34eb667f6 100644
--- a/lib/librte_rawdev/rte_rawdev_pmd.h
+++ b/lib/librte_rawdev/rte_rawdev_pmd.h
@@ -218,7 +218,8 @@ typedef int (*rawdev_reset_t)(struct rte_rawdev *dev);
*/
typedef void (*rawdev_queue_conf_get_t)(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf);
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size);
/**
* Setup an raw queue.
@@ -235,7 +236,8 @@ typedef void (*rawdev_queue_conf_get_t)(struct rte_rawdev *dev,
*/
typedef int (*rawdev_queue_setup_t)(struct rte_rawdev *dev,
uint16_t queue_id,
- rte_rawdev_obj_t queue_conf);
+ rte_rawdev_obj_t queue_conf,
+ size_t queue_conf_size);
/**
* Release resources allocated by given raw queue.
--
2.25.1
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn
2020-07-09 15:20 4% [dpdk-dev] [PATCH 20.11 0/5] Enhance rawdev APIs Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn Bruce Richardson
@ 2020-07-09 15:20 3% ` Bruce Richardson
2020-07-12 14:13 0% ` Xu, Rosen
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 4/5] rawdev: add private data length parameter to queue fns Bruce Richardson
2 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-09 15:20 UTC (permalink / raw)
To: Nipun Gupta, Hemant Agrawal
Cc: dev, Rosen Xu, Tianfei zhang, Xiaoyun Li, Jingjing Wu, Satha Rao,
Mahipal Challa, Jerin Jacob, Bruce Richardson
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer of the structure passed to the configure
API is of the correct type - it is only checked to be non-NULL. Adding
in the length of the expected structure provides a measure of typechecking,
and can also be used for ABI compatibility in future, since ABI changes
involving structs almost always involve a change in size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
drivers/raw/ifpga/ifpga_rawdev.c | 3 ++-
drivers/raw/ioat/ioat_rawdev.c | 5 +++--
drivers/raw/ioat/ioat_rawdev_test.c | 2 +-
drivers/raw/ntb/ntb.c | 6 +++++-
drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c | 7 ++++---
drivers/raw/octeontx2_dma/otx2_dpi_test.c | 3 ++-
drivers/raw/octeontx2_ep/otx2_ep_rawdev.c | 7 ++++---
drivers/raw/octeontx2_ep/otx2_ep_test.c | 2 +-
drivers/raw/skeleton/skeleton_rawdev.c | 5 +++--
drivers/raw/skeleton/skeleton_rawdev_test.c | 5 +++--
examples/ioat/ioatfwd.c | 2 +-
examples/ntb/ntb_fwd.c | 2 +-
lib/librte_rawdev/rte_rawdev.c | 6 ++++--
lib/librte_rawdev/rte_rawdev.h | 8 +++++++-
lib/librte_rawdev/rte_rawdev_pmd.h | 3 ++-
15 files changed, 43 insertions(+), 23 deletions(-)
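The ABI-compatibility use mentioned in the commit message can be sketched likewise, with hypothetical v1/v2 config layouts (not DPDK types): a driver that receives the size explicitly can keep accepting the older, smaller layout after the struct grows, defaulting the new fields.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical old and new versions of a driver's private config. */
struct cfg_v1 { unsigned int ring_size; };
struct cfg_v2 { unsigned int ring_size; unsigned int hdls_disable; };

/* Dispatch on the caller-supplied size: an old-ABI caller passes
 * sizeof(struct cfg_v1) and the added fields stay zeroed; anything
 * else that does not match a known layout is rejected.
 */
static int
dev_configure(const void *config, size_t config_size, struct cfg_v2 *out)
{
	memset(out, 0, sizeof(*out));
	if (config == NULL)
		return -22; /* value of -EINVAL on Linux */
	if (config_size == sizeof(struct cfg_v1)) {
		memcpy(out, config, sizeof(struct cfg_v1));
		return 0; /* old caller: new fields keep defaults */
	}
	if (config_size == sizeof(struct cfg_v2)) {
		memcpy(out, config, sizeof(struct cfg_v2));
		return 0;
	}
	return -22; /* unknown layout */
}
```

This works because struct-changing ABI breaks almost always change the struct size, as the commit message notes.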
diff --git a/drivers/raw/ifpga/ifpga_rawdev.c b/drivers/raw/ifpga/ifpga_rawdev.c
index 32a2b96c9..a50173264 100644
--- a/drivers/raw/ifpga/ifpga_rawdev.c
+++ b/drivers/raw/ifpga/ifpga_rawdev.c
@@ -684,7 +684,8 @@ ifpga_rawdev_info_get(struct rte_rawdev *dev,
static int
ifpga_rawdev_configure(const struct rte_rawdev *dev,
- rte_rawdev_obj_t config)
+ rte_rawdev_obj_t config,
+ size_t config_size __rte_unused)
{
IFPGA_RAWDEV_PMD_FUNC_TRACE();
diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c
index 6a336795d..b29ff983f 100644
--- a/drivers/raw/ioat/ioat_rawdev.c
+++ b/drivers/raw/ioat/ioat_rawdev.c
@@ -41,7 +41,8 @@ RTE_LOG_REGISTER(ioat_pmd_logtype, rawdev.ioat, INFO);
#define COMPLETION_SZ sizeof(__m128i)
static int
-ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
+ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config,
+ size_t config_size)
{
struct rte_ioat_rawdev_config *params = config;
struct rte_ioat_rawdev *ioat = dev->dev_private;
@@ -51,7 +52,7 @@ ioat_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
if (dev->started)
return -EBUSY;
- if (params == NULL)
+ if (params == NULL || config_size != sizeof(*params))
return -EINVAL;
if (params->ring_size > 4096 || params->ring_size < 64 ||
diff --git a/drivers/raw/ioat/ioat_rawdev_test.c b/drivers/raw/ioat/ioat_rawdev_test.c
index 90f5974cd..e5b50ae9f 100644
--- a/drivers/raw/ioat/ioat_rawdev_test.c
+++ b/drivers/raw/ioat/ioat_rawdev_test.c
@@ -165,7 +165,7 @@ ioat_rawdev_test(uint16_t dev_id)
}
p.ring_size = IOAT_TEST_RINGSIZE;
- if (rte_rawdev_configure(dev_id, &info) != 0) {
+ if (rte_rawdev_configure(dev_id, &info, sizeof(p)) != 0) {
printf("Error with rte_rawdev_configure()\n");
return -1;
}
diff --git a/drivers/raw/ntb/ntb.c b/drivers/raw/ntb/ntb.c
index eaeb67b74..c181094d5 100644
--- a/drivers/raw/ntb/ntb.c
+++ b/drivers/raw/ntb/ntb.c
@@ -837,13 +837,17 @@ ntb_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
}
static int
-ntb_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
+ntb_dev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config,
+ size_t config_size)
{
struct ntb_dev_config *conf = config;
struct ntb_hw *hw = dev->dev_private;
uint32_t xstats_num;
int ret;
+ if (conf == NULL || config_size != sizeof(*conf))
+ return -EINVAL;
+
hw->queue_pairs = conf->num_queues;
hw->queue_size = conf->queue_size;
hw->used_mw_num = conf->mz_num;
diff --git a/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c b/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
index e398abb75..5b496446c 100644
--- a/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
+++ b/drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c
@@ -294,7 +294,8 @@ otx2_dpi_rawdev_reset(struct rte_rawdev *dev)
}
static int
-otx2_dpi_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
+otx2_dpi_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config,
+ size_t config_size)
{
struct dpi_rawdev_conf_s *conf = config;
struct dpi_vf_s *dpivf = NULL;
@@ -302,8 +303,8 @@ otx2_dpi_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
uintptr_t pool;
uint32_t gaura;
- if (conf == NULL) {
- otx2_dpi_dbg("NULL configuration");
+ if (conf == NULL || config_size != sizeof(*conf)) {
+ otx2_dpi_dbg("NULL or invalid configuration");
return -EINVAL;
}
dpivf = (struct dpi_vf_s *)dev->dev_private;
diff --git a/drivers/raw/octeontx2_dma/otx2_dpi_test.c b/drivers/raw/octeontx2_dma/otx2_dpi_test.c
index 276658af0..cec6ca91b 100644
--- a/drivers/raw/octeontx2_dma/otx2_dpi_test.c
+++ b/drivers/raw/octeontx2_dma/otx2_dpi_test.c
@@ -182,7 +182,8 @@ test_otx2_dma_rawdev(uint16_t val)
/* Configure rawdev ports */
conf.chunk_pool = dpi_create_mempool();
rdev_info.dev_private = &conf;
- ret = rte_rawdev_configure(i, (rte_rawdev_obj_t)&rdev_info);
+ ret = rte_rawdev_configure(i, (rte_rawdev_obj_t)&rdev_info,
+ sizeof(conf));
if (ret) {
otx2_dpi_dbg("Unable to configure DPIVF %d", i);
return -ENODEV;
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
index 0778603d5..2b78a7941 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_rawdev.c
@@ -224,13 +224,14 @@ sdp_rawdev_close(struct rte_rawdev *dev)
}
static int
-sdp_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config)
+sdp_rawdev_configure(const struct rte_rawdev *dev, rte_rawdev_obj_t config,
+ size_t config_size)
{
struct sdp_rawdev_info *app_info = (struct sdp_rawdev_info *)config;
struct sdp_device *sdpvf;
- if (app_info == NULL) {
- otx2_err("Application config info [NULL]");
+ if (app_info == NULL || config_size != sizeof(*app_info)) {
+ otx2_err("Application config info [NULL] or incorrect size");
return -EINVAL;
}
diff --git a/drivers/raw/octeontx2_ep/otx2_ep_test.c b/drivers/raw/octeontx2_ep/otx2_ep_test.c
index 091f1827c..b876275f7 100644
--- a/drivers/raw/octeontx2_ep/otx2_ep_test.c
+++ b/drivers/raw/octeontx2_ep/otx2_ep_test.c
@@ -108,7 +108,7 @@ sdp_rawdev_selftest(uint16_t dev_id)
dev_info.dev_private = &app_info;
- ret = rte_rawdev_configure(dev_id, &dev_info);
+ ret = rte_rawdev_configure(dev_id, &dev_info, sizeof(app_info));
if (ret) {
otx2_err("Unable to configure SDP_VF %d", dev_id);
rte_mempool_free(ioq_mpool);
diff --git a/drivers/raw/skeleton/skeleton_rawdev.c b/drivers/raw/skeleton/skeleton_rawdev.c
index dce300c35..531d0450c 100644
--- a/drivers/raw/skeleton/skeleton_rawdev.c
+++ b/drivers/raw/skeleton/skeleton_rawdev.c
@@ -68,7 +68,8 @@ static int skeleton_rawdev_info_get(struct rte_rawdev *dev,
}
static int skeleton_rawdev_configure(const struct rte_rawdev *dev,
- rte_rawdev_obj_t config)
+ rte_rawdev_obj_t config,
+ size_t config_size)
{
struct skeleton_rawdev *skeldev;
struct skeleton_rawdev_conf *skeldev_conf;
@@ -77,7 +78,7 @@ static int skeleton_rawdev_configure(const struct rte_rawdev *dev,
RTE_FUNC_PTR_OR_ERR_RET(dev, -EINVAL);
- if (!config) {
+ if (config == NULL || config_size != sizeof(*skeldev_conf)) {
SKELETON_PMD_ERR("Invalid configuration");
return -EINVAL;
}
diff --git a/drivers/raw/skeleton/skeleton_rawdev_test.c b/drivers/raw/skeleton/skeleton_rawdev_test.c
index 9b8390dfb..7dc7c7684 100644
--- a/drivers/raw/skeleton/skeleton_rawdev_test.c
+++ b/drivers/raw/skeleton/skeleton_rawdev_test.c
@@ -126,7 +126,7 @@ test_rawdev_configure(void)
struct skeleton_rawdev_conf rdev_conf_get = {0};
/* Check invalid configuration */
- ret = rte_rawdev_configure(test_dev_id, NULL);
+ ret = rte_rawdev_configure(test_dev_id, NULL, 0);
RTE_TEST_ASSERT(ret == -EINVAL,
"Null configure; Expected -EINVAL, got %d", ret);
@@ -137,7 +137,8 @@ test_rawdev_configure(void)
rdev_info.dev_private = &rdev_conf_set;
ret = rte_rawdev_configure(test_dev_id,
- (rte_rawdev_obj_t)&rdev_info);
+ (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_set));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to configure rawdev (%d)", ret);
rdev_info.dev_private = &rdev_conf_get;
diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
index 5c631da1b..8e9513e44 100644
--- a/examples/ioat/ioatfwd.c
+++ b/examples/ioat/ioatfwd.c
@@ -734,7 +734,7 @@ configure_rawdev_queue(uint32_t dev_id)
struct rte_ioat_rawdev_config dev_config = { .ring_size = ring_size };
struct rte_rawdev_info info = { .dev_private = &dev_config };
- if (rte_rawdev_configure(dev_id, &info) != 0) {
+ if (rte_rawdev_configure(dev_id, &info, sizeof(dev_config)) != 0) {
rte_exit(EXIT_FAILURE,
"Error with rte_rawdev_configure()\n");
}
diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index 11e224451..656f73659 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -1401,7 +1401,7 @@ main(int argc, char **argv)
ntb_conf.num_queues = num_queues;
ntb_conf.queue_size = nb_desc;
ntb_rawdev_conf.dev_private = (rte_rawdev_obj_t)(&ntb_conf);
- ret = rte_rawdev_configure(dev_id, &ntb_rawdev_conf);
+ ret = rte_rawdev_configure(dev_id, &ntb_rawdev_conf, sizeof(ntb_conf));
if (ret)
rte_exit(EXIT_FAILURE, "Can't config ntb dev: err=%d, "
"port=%u\n", ret, dev_id);
diff --git a/lib/librte_rawdev/rte_rawdev.c b/lib/librte_rawdev/rte_rawdev.c
index bde33763e..6c4d783cc 100644
--- a/lib/librte_rawdev/rte_rawdev.c
+++ b/lib/librte_rawdev/rte_rawdev.c
@@ -104,7 +104,8 @@ rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
}
int
-rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf)
+rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
+ size_t dev_private_size)
{
struct rte_rawdev *dev;
int diag;
@@ -123,7 +124,8 @@ rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf)
}
/* Configure the device */
- diag = (*dev->dev_ops->dev_configure)(dev, dev_conf->dev_private);
+ diag = (*dev->dev_ops->dev_configure)(dev, dev_conf->dev_private,
+ dev_private_size);
if (diag != 0)
RTE_RDEV_ERR("dev%d dev_configure = %d", dev_id, diag);
else
diff --git a/lib/librte_rawdev/rte_rawdev.h b/lib/librte_rawdev/rte_rawdev.h
index cf6acfd26..73e3bd5ae 100644
--- a/lib/librte_rawdev/rte_rawdev.h
+++ b/lib/librte_rawdev/rte_rawdev.h
@@ -116,13 +116,19 @@ rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
* driver/implementation can use to configure the device. It is also assumed
* that once the configuration is done, a `queue_id` type field can be used
* to refer to some arbitrary internal representation of a queue.
+ * @dev_private_size
+ * The length of the memory space pointed to by dev_private in dev_info.
+ * This should be set to the size of the expected private structure to be
+ * used by the driver, and may be checked by drivers to ensure the expected
+ * struct type is provided.
*
* @return
* - 0: Success, device configured.
* - <0: Error code returned by the driver configuration function.
*/
int
-rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf);
+rte_rawdev_configure(uint16_t dev_id, struct rte_rawdev_info *dev_conf,
+ size_t dev_private_size);
/**
diff --git a/lib/librte_rawdev/rte_rawdev_pmd.h b/lib/librte_rawdev/rte_rawdev_pmd.h
index 89e46412a..050f8b029 100644
--- a/lib/librte_rawdev/rte_rawdev_pmd.h
+++ b/lib/librte_rawdev/rte_rawdev_pmd.h
@@ -160,7 +160,8 @@ typedef int (*rawdev_info_get_t)(struct rte_rawdev *dev,
* Returns 0 on success
*/
typedef int (*rawdev_configure_t)(const struct rte_rawdev *dev,
- rte_rawdev_obj_t config);
+ rte_rawdev_obj_t config,
+ size_t config_size);
/**
* Start a configured device.
--
2.25.1
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn
2020-07-09 15:20 4% [dpdk-dev] [PATCH 20.11 0/5] Enhance rawdev APIs Bruce Richardson
@ 2020-07-09 15:20 3% ` Bruce Richardson
2020-07-12 14:13 0% ` Xu, Rosen
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 4/5] rawdev: add private data length parameter to queue fns Bruce Richardson
2 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-09 15:20 UTC (permalink / raw)
To: Nipun Gupta, Hemant Agrawal
Cc: dev, Rosen Xu, Tianfei zhang, Xiaoyun Li, Jingjing Wu, Satha Rao,
Mahipal Challa, Jerin Jacob, Bruce Richardson
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer in the dev_info structure is of the
correct type - it is only checked for being non-NULL. Adding the length
of the expected structure provides a measure of typechecking, and can also
be used for ABI compatibility in future, since ABI changes involving
structs almost always involve a change in size.
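The size-check pattern this patch adds can be sketched in isolation as follows (a minimal illustration; `my_drv_conf` and `drv_configure` are made-up names, not the actual rawdev symbols):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct my_drv_conf {
	unsigned int ring_size;
};

/* Driver-side callback: reject a private struct whose size does not
 * match what this driver build expects, which catches both a wrong
 * struct type and an ABI-incompatible (resized) struct. */
static int
drv_configure(void *config, size_t config_size)
{
	struct my_drv_conf *conf = config;

	if (conf == NULL || config_size != sizeof(*conf))
		return -EINVAL;
	return (int)conf->ring_size; /* pretend to apply the config */
}
```

A caller would pass `sizeof()` its config struct, so a mismatch is reported instead of silently misinterpreting memory.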
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
drivers/bus/ifpga/ifpga_bus.c | 2 +-
drivers/raw/ifpga/ifpga_rawdev.c | 5 +++--
drivers/raw/ioat/ioat_rawdev.c | 5 +++--
drivers/raw/ioat/ioat_rawdev_test.c | 4 ++--
drivers/raw/ntb/ntb.c | 8 +++++++-
drivers/raw/skeleton/skeleton_rawdev.c | 5 +++--
drivers/raw/skeleton/skeleton_rawdev_test.c | 19 ++++++++++++-------
examples/ioat/ioatfwd.c | 2 +-
examples/ntb/ntb_fwd.c | 2 +-
lib/librte_rawdev/rte_rawdev.c | 6 ++++--
lib/librte_rawdev/rte_rawdev.h | 9 ++++++++-
lib/librte_rawdev/rte_rawdev_pmd.h | 5 ++++-
12 files changed, 49 insertions(+), 23 deletions(-)
diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
index 6b16a20bb..bb8b3dcfb 100644
--- a/drivers/bus/ifpga/ifpga_bus.c
+++ b/drivers/bus/ifpga/ifpga_bus.c
@@ -162,7 +162,7 @@ ifpga_scan_one(struct rte_rawdev *rawdev,
afu_dev->id.port = afu_pr_conf.afu_id.port;
if (rawdev->dev_ops && rawdev->dev_ops->dev_info_get)
- rawdev->dev_ops->dev_info_get(rawdev, afu_dev);
+ rawdev->dev_ops->dev_info_get(rawdev, afu_dev, sizeof(*afu_dev));
if (rawdev->dev_ops &&
rawdev->dev_ops->dev_start &&
diff --git a/drivers/raw/ifpga/ifpga_rawdev.c b/drivers/raw/ifpga/ifpga_rawdev.c
index cc25c662b..47cfa3877 100644
--- a/drivers/raw/ifpga/ifpga_rawdev.c
+++ b/drivers/raw/ifpga/ifpga_rawdev.c
@@ -605,7 +605,8 @@ ifpga_fill_afu_dev(struct opae_accelerator *acc,
static void
ifpga_rawdev_info_get(struct rte_rawdev *dev,
- rte_rawdev_obj_t dev_info)
+ rte_rawdev_obj_t dev_info,
+ size_t dev_info_size)
{
struct opae_adapter *adapter;
struct opae_accelerator *acc;
@@ -617,7 +618,7 @@ ifpga_rawdev_info_get(struct rte_rawdev *dev,
IFPGA_RAWDEV_PMD_FUNC_TRACE();
- if (!dev_info) {
+ if (!dev_info || dev_info_size != sizeof(*afu_dev)) {
IFPGA_RAWDEV_PMD_ERR("Invalid request");
return;
}
diff --git a/drivers/raw/ioat/ioat_rawdev.c b/drivers/raw/ioat/ioat_rawdev.c
index f876ffc3f..8dd856c55 100644
--- a/drivers/raw/ioat/ioat_rawdev.c
+++ b/drivers/raw/ioat/ioat_rawdev.c
@@ -113,12 +113,13 @@ ioat_dev_stop(struct rte_rawdev *dev)
}
static void
-ioat_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info)
+ioat_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
+ size_t dev_info_size)
{
struct rte_ioat_rawdev_config *cfg = dev_info;
struct rte_ioat_rawdev *ioat = dev->dev_private;
- if (cfg != NULL)
+ if (cfg != NULL && dev_info_size == sizeof(*cfg))
cfg->ring_size = ioat->ring_size;
}
diff --git a/drivers/raw/ioat/ioat_rawdev_test.c b/drivers/raw/ioat/ioat_rawdev_test.c
index d99f1bd6b..90f5974cd 100644
--- a/drivers/raw/ioat/ioat_rawdev_test.c
+++ b/drivers/raw/ioat/ioat_rawdev_test.c
@@ -157,7 +157,7 @@ ioat_rawdev_test(uint16_t dev_id)
return TEST_SKIPPED;
}
- rte_rawdev_info_get(dev_id, &info);
+ rte_rawdev_info_get(dev_id, &info, sizeof(p));
if (p.ring_size != expected_ring_size[dev_id]) {
printf("Error, initial ring size is not as expected (Actual: %d, Expected: %d)\n",
(int)p.ring_size, expected_ring_size[dev_id]);
@@ -169,7 +169,7 @@ ioat_rawdev_test(uint16_t dev_id)
printf("Error with rte_rawdev_configure()\n");
return -1;
}
- rte_rawdev_info_get(dev_id, &info);
+ rte_rawdev_info_get(dev_id, &info, sizeof(p));
if (p.ring_size != IOAT_TEST_RINGSIZE) {
printf("Error, ring size is not %d (%d)\n",
IOAT_TEST_RINGSIZE, (int)p.ring_size);
diff --git a/drivers/raw/ntb/ntb.c b/drivers/raw/ntb/ntb.c
index e40412bb7..4676c6f8f 100644
--- a/drivers/raw/ntb/ntb.c
+++ b/drivers/raw/ntb/ntb.c
@@ -801,11 +801,17 @@ ntb_dequeue_bufs(struct rte_rawdev *dev,
}
static void
-ntb_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info)
+ntb_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t dev_info,
+ size_t dev_info_size)
{
struct ntb_hw *hw = dev->dev_private;
struct ntb_dev_info *info = dev_info;
+ if (dev_info_size != sizeof(*info)){
+ NTB_LOG(ERR, "Invalid size parameter to %s", __func__);
+ return;
+ }
+
info->mw_cnt = hw->mw_cnt;
info->mw_size = hw->mw_size;
diff --git a/drivers/raw/skeleton/skeleton_rawdev.c b/drivers/raw/skeleton/skeleton_rawdev.c
index 72ece887a..dc05f3ecf 100644
--- a/drivers/raw/skeleton/skeleton_rawdev.c
+++ b/drivers/raw/skeleton/skeleton_rawdev.c
@@ -42,14 +42,15 @@ static struct queue_buffers queue_buf[SKELETON_MAX_QUEUES] = {};
static void clear_queue_bufs(int queue_id);
static void skeleton_rawdev_info_get(struct rte_rawdev *dev,
- rte_rawdev_obj_t dev_info)
+ rte_rawdev_obj_t dev_info,
+ size_t dev_info_size)
{
struct skeleton_rawdev *skeldev;
struct skeleton_rawdev_conf *skeldev_conf;
SKELETON_PMD_FUNC_TRACE();
- if (!dev_info) {
+ if (!dev_info || dev_info_size != sizeof(*skeldev_conf)) {
SKELETON_PMD_ERR("Invalid request");
return;
}
diff --git a/drivers/raw/skeleton/skeleton_rawdev_test.c b/drivers/raw/skeleton/skeleton_rawdev_test.c
index 9ecfdee81..9b8390dfb 100644
--- a/drivers/raw/skeleton/skeleton_rawdev_test.c
+++ b/drivers/raw/skeleton/skeleton_rawdev_test.c
@@ -106,12 +106,12 @@ test_rawdev_info_get(void)
struct rte_rawdev_info rdev_info = {0};
struct skeleton_rawdev_conf skel_conf = {0};
- ret = rte_rawdev_info_get(test_dev_id, NULL);
+ ret = rte_rawdev_info_get(test_dev_id, NULL, 0);
RTE_TEST_ASSERT(ret == -EINVAL, "Expected -EINVAL, %d", ret);
rdev_info.dev_private = &skel_conf;
- ret = rte_rawdev_info_get(test_dev_id, &rdev_info);
+ ret = rte_rawdev_info_get(test_dev_id, &rdev_info, sizeof(skel_conf));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to get raw dev info");
return TEST_SUCCESS;
@@ -142,7 +142,8 @@ test_rawdev_configure(void)
rdev_info.dev_private = &rdev_conf_get;
ret = rte_rawdev_info_get(test_dev_id,
- (rte_rawdev_obj_t)&rdev_info);
+ (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_get));
RTE_TEST_ASSERT_SUCCESS(ret,
"Failed to obtain rawdev configuration (%d)",
ret);
@@ -170,7 +171,8 @@ test_rawdev_queue_default_conf_get(void)
/* Get the current configuration */
rdev_info.dev_private = &rdev_conf_get;
ret = rte_rawdev_info_get(test_dev_id,
- (rte_rawdev_obj_t)&rdev_info);
+ (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_get));
RTE_TEST_ASSERT_SUCCESS(ret, "Failed to obtain rawdev configuration (%d)",
ret);
@@ -218,7 +220,8 @@ test_rawdev_queue_setup(void)
/* Get the current configuration */
rdev_info.dev_private = &rdev_conf_get;
ret = rte_rawdev_info_get(test_dev_id,
- (rte_rawdev_obj_t)&rdev_info);
+ (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_get));
RTE_TEST_ASSERT_SUCCESS(ret,
"Failed to obtain rawdev configuration (%d)",
ret);
@@ -327,7 +330,8 @@ test_rawdev_start_stop(void)
dummy_firmware = NULL;
rte_rawdev_start(test_dev_id);
- ret = rte_rawdev_info_get(test_dev_id, (rte_rawdev_obj_t)&rdev_info);
+ ret = rte_rawdev_info_get(test_dev_id, (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_get));
RTE_TEST_ASSERT_SUCCESS(ret,
"Failed to obtain rawdev configuration (%d)",
ret);
@@ -336,7 +340,8 @@ test_rawdev_start_stop(void)
rdev_conf_get.device_state);
rte_rawdev_stop(test_dev_id);
- ret = rte_rawdev_info_get(test_dev_id, (rte_rawdev_obj_t)&rdev_info);
+ ret = rte_rawdev_info_get(test_dev_id, (rte_rawdev_obj_t)&rdev_info,
+ sizeof(rdev_conf_get));
RTE_TEST_ASSERT_SUCCESS(ret,
"Failed to obtain rawdev configuration (%d)",
ret);
diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c
index b66ee73bc..5c631da1b 100644
--- a/examples/ioat/ioatfwd.c
+++ b/examples/ioat/ioatfwd.c
@@ -757,7 +757,7 @@ assign_rawdevs(void)
do {
if (rdev_id == rte_rawdev_count())
goto end;
- rte_rawdev_info_get(rdev_id++, &rdev_info);
+ rte_rawdev_info_get(rdev_id++, &rdev_info, 0);
} while (rdev_info.driver_name == NULL ||
strcmp(rdev_info.driver_name,
IOAT_PMD_RAWDEV_NAME_STR) != 0);
diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index eba8ebf9f..11e224451 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -1389,7 +1389,7 @@ main(int argc, char **argv)
rte_rawdev_set_attr(dev_id, NTB_QUEUE_NUM_NAME, num_queues);
printf("Set queue number as %u.\n", num_queues);
ntb_rawdev_info.dev_private = (rte_rawdev_obj_t)(&ntb_info);
- rte_rawdev_info_get(dev_id, &ntb_rawdev_info);
+ rte_rawdev_info_get(dev_id, &ntb_rawdev_info, sizeof(ntb_info));
nb_mbuf = nb_desc * num_queues * 2 * 2 + rte_lcore_count() *
MEMPOOL_CACHE_SIZE;
diff --git a/lib/librte_rawdev/rte_rawdev.c b/lib/librte_rawdev/rte_rawdev.c
index 8f84d0b22..a57689035 100644
--- a/lib/librte_rawdev/rte_rawdev.c
+++ b/lib/librte_rawdev/rte_rawdev.c
@@ -78,7 +78,8 @@ rte_rawdev_socket_id(uint16_t dev_id)
}
int
-rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info)
+rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
+ size_t dev_private_size)
{
struct rte_rawdev *rawdev;
@@ -89,7 +90,8 @@ rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info)
if (dev_info->dev_private != NULL) {
RTE_FUNC_PTR_OR_ERR_RET(*rawdev->dev_ops->dev_info_get, -ENOTSUP);
- (*rawdev->dev_ops->dev_info_get)(rawdev, dev_info->dev_private);
+ (*rawdev->dev_ops->dev_info_get)(rawdev, dev_info->dev_private,
+ dev_private_size);
}
dev_info->driver_name = rawdev->driver_name;
diff --git a/lib/librte_rawdev/rte_rawdev.h b/lib/librte_rawdev/rte_rawdev.h
index 32f6b8bb0..cf6acfd26 100644
--- a/lib/librte_rawdev/rte_rawdev.h
+++ b/lib/librte_rawdev/rte_rawdev.h
@@ -82,13 +82,20 @@ struct rte_rawdev_info;
* will be returned. This can be used to safely query the type of a rawdev
* instance without needing to know the size of the private data to return.
*
+ * @param dev_private_size
+ * The length of the memory space pointed to by dev_private in dev_info.
+ * This should be set to the size of the expected private structure to be
+ * returned, and may be checked by drivers to ensure the expected struct
+ * type is provided.
+ *
* @return
* - 0: Success, driver updates the contextual information of the raw device
* - <0: Error code returned by the driver info get function.
*
*/
int
-rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info);
+rte_rawdev_info_get(uint16_t dev_id, struct rte_rawdev_info *dev_info,
+ size_t dev_private_size);
/**
* Configure a raw device.
diff --git a/lib/librte_rawdev/rte_rawdev_pmd.h b/lib/librte_rawdev/rte_rawdev_pmd.h
index 4395a2182..0e72a9205 100644
--- a/lib/librte_rawdev/rte_rawdev_pmd.h
+++ b/lib/librte_rawdev/rte_rawdev_pmd.h
@@ -138,12 +138,15 @@ rte_rawdev_pmd_is_valid_dev(uint8_t dev_id)
* Raw device pointer
* @param dev_info
* Raw device information structure
+ * @param dev_private_size
+ * The size of the structure pointed to by dev_info->dev_private
*
* @return
* Returns 0 on success
*/
typedef void (*rawdev_info_get_t)(struct rte_rawdev *dev,
- rte_rawdev_obj_t dev_info);
+ rte_rawdev_obj_t dev_info,
+ size_t dev_private_size);
/**
* Configure a device.
--
2.25.1
* [dpdk-dev] [PATCH 20.11 0/5] Enhance rawdev APIs
@ 2020-07-09 15:20 4% Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn Bruce Richardson
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: Bruce Richardson @ 2020-07-09 15:20 UTC (permalink / raw)
To: Nipun Gupta, Hemant Agrawal
Cc: dev, Rosen Xu, Tianfei zhang, Xiaoyun Li, Jingjing Wu, Satha Rao,
Mahipal Challa, Jerin Jacob, Bruce Richardson
This patchset proposes some internal and externally-visible changes to the
rawdev API. If consensus is in favour, I will submit a deprecation notice
for the changes for the 20.08 release, so that these ABI/API-breaking
changes can be merged in 20.11
The changes are in two areas:
* For any APIs which take a void * parameter for driver-specific structs,
add an additional parameter to provide the struct length. This allows
some runtime type-checking, as well as possible ABI-compatibility support
in the future, as structure changes generally involve a change in the size
of the structure.
* Ensure all APIs which can return error values have int type, rather than
void. Since functions like info_get and queue_default_get can now do some
typechecking, they need to be modified to allow them to return error
codes on failure.
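The second change area — converting void callbacks to int so a type/size mismatch can propagate to the caller — might look like this in isolation (illustrative names only, not the actual rawdev callback signatures):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dev_priv_info {
	int queues;
};

/* The old-style callback returned void, so a size mismatch could only
 * be logged inside the driver. Returning int lets the error reach the
 * application via the public API's return value. */
static int
dev_info_get(void *dev_info, size_t dev_info_size)
{
	struct dev_priv_info *info = dev_info;

	if (info != NULL && dev_info_size != sizeof(*info))
		return -EINVAL;
	if (info != NULL)
		info->queues = 8;
	return 0; /* NULL dev_info is allowed: query generic info only */
}
```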
Bruce Richardson (5):
rawdev: add private data length parameter to info fn
rawdev: allow drivers to return error from info function
rawdev: add private data length parameter to config fn
rawdev: add private data length parameter to queue fns
rawdev: allow queue default config query to return error
drivers/bus/ifpga/ifpga_bus.c | 2 +-
drivers/raw/ifpga/ifpga_rawdev.c | 23 +++++-----
drivers/raw/ioat/ioat_rawdev.c | 17 ++++---
drivers/raw/ioat/ioat_rawdev_test.c | 6 +--
drivers/raw/ntb/ntb.c | 49 ++++++++++++++++-----
drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c | 7 +--
drivers/raw/octeontx2_dma/otx2_dpi_test.c | 3 +-
drivers/raw/octeontx2_ep/otx2_ep_rawdev.c | 7 +--
drivers/raw/octeontx2_ep/otx2_ep_test.c | 2 +-
drivers/raw/skeleton/skeleton_rawdev.c | 34 ++++++++------
drivers/raw/skeleton/skeleton_rawdev_test.c | 32 ++++++++------
examples/ioat/ioatfwd.c | 4 +-
examples/ntb/ntb_fwd.c | 7 +--
lib/librte_rawdev/rte_rawdev.c | 27 +++++++-----
lib/librte_rawdev/rte_rawdev.h | 27 ++++++++++--
lib/librte_rawdev/rte_rawdev_pmd.h | 22 ++++++---
16 files changed, 178 insertions(+), 91 deletions(-)
--
2.25.1
* Re: [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 13:31 0% ` Honnappa Nagarahalli
@ 2020-07-09 14:10 0% ` Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-09 14:10 UTC (permalink / raw)
To: Honnappa Nagarahalli, Olivier Matz
Cc: dev, stephen, david.marchand, drc, Ruifeng Wang, nd, nd
<snip>
>
> > >
> > > Hi Phil,
> > >
> > > On Thu, Jul 09, 2020 at 06:10:42PM +0800, Phil Yang wrote:
> > > > Use C11 atomic built-ins with explicit ordering instead of
> > > > rte_atomic ops which enforce unnecessary barriers on aarch64.
> > > >
> > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > > ---
> > > > v3:
> > > > 1.Fix ABI breakage.
> > > > 2.Simplify data type cast.
> > > >
> > > > v2:
> > > > Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt
> > > > field to refcnt_atomic.
> > > >
> > > > lib/librte_mbuf/rte_mbuf.c | 1 -
> > > > lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> > > > lib/librte_mbuf/rte_mbuf_core.h | 2 +-
> > > > 3 files changed, 11 insertions(+), 11 deletions(-)
> > > >
> > <snip>
> > > >
> > > > /* Reinitialize counter before mbuf freeing. */ diff --git
> > > > a/lib/librte_mbuf/rte_mbuf_core.h
> > > b/lib/librte_mbuf/rte_mbuf_core.h
> > > > index 16600f1..d65d1c8 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > @@ -679,7 +679,7 @@ typedef void
> > > (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> > > > struct rte_mbuf_ext_shared_info {
> > > > rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback
> > > function */
> > > > void *fcb_opaque; /**< Free callback argument */
> > > > -rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > > > +uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > > > };
> > >
> > > To avoid an API breakage (i.e. currently, an application that accesses
> > > to refcnt_atomic expects that its type is rte_atomic16_t), I suggest
> > > to do the same than in the mbuf struct:
> > >
> > > union {
> > > rte_atomic16_t refcnt_atomic;
> > > uint16_t refcnt;
> > > };
> > >
> > > I hope the ABI checker won't complain.
> > >
> > > It will also be better for 20.11 when the deprecated fields will be
> > > renamed: the remaining one will be called 'refcnt' in both mbuf and
> > > mbuf_ext_shared_info.
> Does this need a deprecation notice in 20.08?
Yes. We'd better do that.
I will add a notice for it in this patch. Thanks.
>
> >
> > Got it. I agree with you.
> > It should work. In my local test machine, the ABI checker happy with this
> > approach.
> > Once the test is done, I will upstream the new patch.
> >
> > Appreciate your comments.
> >
> > Thanks,
> > Phil
* Re: [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 13:00 3% ` Phil Yang
@ 2020-07-09 13:31 0% ` Honnappa Nagarahalli
2020-07-09 14:10 0% ` Phil Yang
0 siblings, 1 reply; 200+ results
From: Honnappa Nagarahalli @ 2020-07-09 13:31 UTC (permalink / raw)
To: Phil Yang, Olivier Matz
Cc: dev, stephen, david.marchand, drc, Ruifeng Wang, nd,
Honnappa Nagarahalli, nd
<snip>
> >
> > Hi Phil,
> >
> > On Thu, Jul 09, 2020 at 06:10:42PM +0800, Phil Yang wrote:
> > > Use C11 atomic built-ins with explicit ordering instead of
> > > rte_atomic ops which enforce unnecessary barriers on aarch64.
> > >
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > > ---
> > > v3:
> > > 1.Fix ABI breakage.
> > > 2.Simplify data type cast.
> > >
> > > v2:
> > > Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt
> > > field to refcnt_atomic.
> > >
> > > lib/librte_mbuf/rte_mbuf.c | 1 -
> > > lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> > > lib/librte_mbuf/rte_mbuf_core.h | 2 +-
> > > 3 files changed, 11 insertions(+), 11 deletions(-)
> > >
> <snip>
> > >
> > > /* Reinitialize counter before mbuf freeing. */ diff --git
> > > a/lib/librte_mbuf/rte_mbuf_core.h
> > b/lib/librte_mbuf/rte_mbuf_core.h
> > > index 16600f1..d65d1c8 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > @@ -679,7 +679,7 @@ typedef void
> > (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> > > struct rte_mbuf_ext_shared_info {
> > > rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback
> > function */
> > > void *fcb_opaque; /**< Free callback argument */
> > > -rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > > +uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > > };
> >
> > To avoid an API breakage (i.e. currently, an application that accesses
> > to refcnt_atomic expects that its type is rte_atomic16_t), I suggest
> > to do the same than in the mbuf struct:
> >
> > union {
> > rte_atomic16_t refcnt_atomic;
> > uint16_t refcnt;
> > };
> >
> > I hope the ABI checker won't complain.
> >
> > It will also be better for 20.11 when the deprecated fields will be
> > renamed: the remaining one will be called 'refcnt' in both mbuf and
> > mbuf_ext_shared_info.
Does this need a deprecation notice in 20.08?
>
> Got it. I agree with you.
> It should work. In my local test machine, the ABI checker happy with this
> approach.
> Once the test is done, I will upstream the new patch.
>
> Appreciate your comments.
>
> Thanks,
> Phil
* Re: [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 11:03 3% ` Olivier Matz
@ 2020-07-09 13:00 3% ` Phil Yang
2020-07-09 13:31 0% ` Honnappa Nagarahalli
0 siblings, 1 reply; 200+ results
From: Phil Yang @ 2020-07-09 13:00 UTC (permalink / raw)
To: Olivier Matz
Cc: dev, stephen, david.marchand, drc, Honnappa Nagarahalli,
Ruifeng Wang, nd
Hi Oliver,
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Thursday, July 9, 2020 7:04 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: dev@dpdk.org; stephen@networkplumber.org;
> david.marchand@redhat.com; drc@linux.vnet.ibm.com; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
>
> Hi Phil,
>
> On Thu, Jul 09, 2020 at 06:10:42PM +0800, Phil Yang wrote:
> > Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> > ops which enforce unnecessary barriers on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > v3:
> > 1.Fix ABI breakage.
> > 2.Simplify data type cast.
> >
> > v2:
> > Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
> > to refcnt_atomic.
> >
> > lib/librte_mbuf/rte_mbuf.c | 1 -
> > lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> > lib/librte_mbuf/rte_mbuf_core.h | 2 +-
> > 3 files changed, 11 insertions(+), 11 deletions(-)
> >
<snip>
> >
> > /* Reinitialize counter before mbuf freeing. */
> > diff --git a/lib/librte_mbuf/rte_mbuf_core.h
> b/lib/librte_mbuf/rte_mbuf_core.h
> > index 16600f1..d65d1c8 100644
> > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > @@ -679,7 +679,7 @@ typedef void
> (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> > struct rte_mbuf_ext_shared_info {
> > rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback
> function */
> > void *fcb_opaque; /**< Free callback argument */
> > - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > + uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > };
>
> To avoid an API breakage (i.e. currently, an application that accesses
> to refcnt_atomic expects that its type is rte_atomic16_t), I suggest to
> do the same than in the mbuf struct:
>
> union {
> rte_atomic16_t refcnt_atomic;
> uint16_t refcnt;
> };
>
> I hope the ABI checker won't complain.
>
> It will also be better for 20.11 when the deprecated fields will be
> renamed: the remaining one will be called 'refcnt' in both mbuf and
> mbuf_ext_shared_info.
Got it. I agree with you.
It should work. On my local test machine, the ABI checker is happy with this approach.
Once the test is done, I will upstream the new patch.
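For context, the access pattern under discussion — plain integer storage touched only through C11-style atomic built-ins with explicit memory ordering — can be sketched as follows (a minimal sketch using the GCC/Clang `__atomic` built-ins; the helper names are illustrative, not the actual rte_mbuf API):

```c
#include <assert.h>
#include <stdint.h>

/* Read the refcount without any barrier stronger than needed. */
static inline uint16_t
refcnt_read(const uint16_t *refcnt)
{
	return __atomic_load_n(refcnt, __ATOMIC_RELAXED);
}

/* Atomically add 'value' (which may be negative) and return the new
 * count; acquire/release ordering pairs the last decrement with the
 * subsequent free, without the full barriers rte_atomic ops imply
 * on aarch64. */
static inline uint16_t
refcnt_update(uint16_t *refcnt, int16_t value)
{
	return __atomic_add_fetch(refcnt, (uint16_t)value,
				  __ATOMIC_ACQ_REL);
}
```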
Appreciate your comments.
Thanks,
Phil
* [dpdk-dev] [PATCH v6 1/2] mbuf: introduce accurate packet Tx scheduling
` (4 preceding siblings ...)
2020-07-08 15:47 2% ` [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling Viacheslav Ovsiienko
@ 2020-07-09 12:36 2% ` Viacheslav Ovsiienko
2020-07-09 23:47 0% ` Ferruh Yigit
2020-07-10 12:39 2% ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
6 siblings, 1 reply; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-09 12:36 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas
Some networks require precise traffic timing management. The ability to
send (and, generally speaking, receive) packets at a precisely
specified moment in time makes it possible to support Time Division
Multiplexing connections on a contemporary general-purpose NIC without
involving auxiliary hardware. For example, support for the O-RAN
Fronthaul interface is one promising use case for precise time
management of egress packets.
The main objective of this RFC is to specify how applications can
provide the moment in time at which packet transmission must start,
and to give a preliminary description of how this feature is supported
on the mlx5 PMD side.
A new dynamic timestamp field is proposed. It provides timing
information whose units and time reference (initial phase) are not
explicitly defined, but are always kept the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns
the current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For packets being sent,
this value can be used by the PMD to schedule packet sending.
The device clock is an opaque entity; its units and frequency are
vendor specific and might depend on hardware capabilities and
configuration. It may (or may not) be synchronized with real time
via PTP, and may (or may not) be synchronous with the CPU clock
(for example, if the NIC and CPU share the same clock source there
might be no drift between the NIC and CPU clocks), etc.
Once the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
deprecated and obsoleted, this dynamic flag and field will be used to
manage timestamps on the receiving datapath as well. Having dedicated
flags for Rx/Tx timestamps allows applications not to perform explicit
flag resets on forwarding and not to promote received timestamps
to the transmitting datapath by default. The static PKT_RX_TIMESTAMP
is considered a candidate to become the dynamic flag.
When the PMD sees "rte_dynfield_timestamp" set on a packet being sent,
it tries to synchronize the time the packet appears on the wire with
the specified packet timestamp. If the specified timestamp is in the
past it should be ignored; if it is in the distant future it should be
capped to some reasonable value (in the range of seconds). These
specific cases ("too late" and "distant future") can optionally be
reported via device xstats to help applications detect time-related
problems.
No packet reordering according to timestamps is assumed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. A timestamp can be put only in the first packet
of a burst, providing scheduling for the entire burst.
The PMD reports the ability to synchronize packet sending on a
timestamp with the new offload flag DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP.
This is a palliative and is going to be replaced with a new eth_dev API
for reporting/managing the supported dynamic flags and their related
features. That API would break ABI compatibility and can't be introduced
at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp"
is registered.
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, bursts of packets will be transmitted with a specific
delay between the packets within a burst and a specific delay between
the bursts. rte_eth_read_clock() is supposed to be used to get the
current device clock value and provide the reference for the timestamps.
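The per-packet timestamp computation implied by this command can be sketched as follows (a simplified helper under the stated assumptions — device clock units, one burst gap then evenly spaced packets; this is not the actual testpmd implementation):

```c
#include <assert.h>
#include <stdint.h>

/* Starting from the current device clock reading, compute the
 * timestamp at which packet 'pkt_idx' of the next burst should
 * appear on the wire: the whole burst is delayed by 'burst_gap',
 * and packets within the burst are 'intra_gap' apart. */
static uint64_t
burst_pkt_timestamp(uint64_t clock_now, uint64_t burst_gap,
		    uint64_t intra_gap, unsigned int pkt_idx)
{
	return clock_now + burst_gap + (uint64_t)pkt_idx * intra_gap;
}
```

Only the first packet of the burst strictly needs a timestamp, since the PMD can schedule the entire burst from it.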
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
---
v1->v4:
- dedicated dynamic Tx timestamp flag instead of shared with Rx
v4->v5:
- elaborated commit message
- more words about device clocks added,
- note about dedicated Rx/Tx timestamp flags added
v5->v6:
- release notes are updated
---
doc/guides/rel_notes/release_20_08.rst | 6 ++++++
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
4 files changed, 42 insertions(+)
diff --git a/doc/guides/rel_notes/release_20_08.rst b/doc/guides/rel_notes/release_20_08.rst
index 988474c..5527bab 100644
--- a/doc/guides/rel_notes/release_20_08.rst
+++ b/doc/guides/rel_notes/release_20_08.rst
@@ -200,6 +200,12 @@ New Features
See the :doc:`../sample_app_ug/l2_forward_real_virtual` for more
details of this parameter usage.
+* **Introduced send packet scheduling based on timestamps.**
+
+ Added a new mbuf dynamic field and flag to provide a timestamp with which
+ packet transmission can be synchronized. A device Tx offload flag is added to
+ indicate that the PMD supports send scheduling.
+
Removed Items
-------------
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7022bd7..c48ca2a 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -160,6 +160,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 631b146..97313a0 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..8407230 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information; the
+ * units and time reference (initial phase) are not explicitly defined
+ * but are always kept the same for a given port. Some devices allow
+ * querying rte_eth_read_clock(), which returns the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field contains
+ * an actual timestamp value for the packets being sent; this value can be
+ * used by the PMD to schedule packet sending.
+ *
+ * After the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
+ * deprecated and removed, a dedicated Rx timestamp flag is supposed
+ * to be introduced and the shared dynamic timestamp field will be used
+ * to handle timestamps on the receive datapath as well.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+
+/**
+ * When the PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on a
+ * packet being sent, it tries to synchronize the time the packet appears
+ * on the wire with the specified packet timestamp. If the specified
+ * timestamp is in the past it should be ignored; if it is in the distant
+ * future it should be capped to some reasonable value (in the range of
+ * seconds).
+ *
+ * No packet reordering according to timestamps is assumed, neither
+ * for packets within a burst nor for whole bursts; it is entirely
+ * the application's responsibility to generate packets and their
+ * timestamps in the desired order. The timestamps might be put only in
+ * the first packet of a burst, providing scheduling for the entire burst.
+ */
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
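For illustration, the "set tx_times <burst_gap>,<intra_gap>" pattern described in the patch above can be sketched as a small standalone helper. Note that `schedule_burst` is an assumption made for this sketch, not a DPDK or testpmd function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a DPDK API): compute the scheduled device-clock
 * timestamp for each packet of one burst. The first packet is delayed by
 * burst_gap from the current clock; subsequent packets in the burst are
 * spaced by intra_gap. Per the patch, only the first packet's timestamp is
 * strictly required; the rest illustrate the intra-burst spacing. */
static void
schedule_burst(uint64_t clock_now, uint64_t burst_gap, uint64_t intra_gap,
	       uint64_t *ts, size_t nb_pkts)
{
	uint64_t t = clock_now + burst_gap;	/* delay before the burst */

	for (size_t i = 0; i < nb_pkts; i++) {
		ts[i] = t;
		t += intra_gap;			/* delay within the burst */
	}
}
```

In the real testpmd routine the computed values would be written into the registered "rte_dynfield_timestamp" dynamic field of each mbuf before the Tx burst call.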
* Re: [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-09 10:10 4% ` [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins " Phil Yang
@ 2020-07-09 11:03 3% ` Olivier Matz
2020-07-09 13:00 3% ` Phil Yang
2020-07-09 15:58 4% ` [dpdk-dev] [PATCH v4 1/2] " Phil Yang
1 sibling, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-09 11:03 UTC (permalink / raw)
To: Phil Yang
Cc: dev, stephen, david.marchand, drc, Honnappa.Nagarahalli,
Ruifeng.Wang, nd
Hi Phil,
On Thu, Jul 09, 2020 at 06:10:42PM +0800, Phil Yang wrote:
> Use C11 atomic built-ins with explicit ordering instead of rte_atomic
> ops which enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> v3:
> 1.Fix ABI breakage.
> 2.Simplify data type cast.
>
> v2:
> Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
> to refcnt_atomic.
>
> lib/librte_mbuf/rte_mbuf.c | 1 -
> lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> lib/librte_mbuf/rte_mbuf_core.h | 2 +-
> 3 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index ae91ae2..8a456e5 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -22,7 +22,6 @@
> #include <rte_eal.h>
> #include <rte_per_lcore.h>
> #include <rte_lcore.h>
> -#include <rte_atomic.h>
> #include <rte_branch_prediction.h>
> #include <rte_mempool.h>
> #include <rte_mbuf.h>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index f8e492e..c1c0956 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -37,7 +37,6 @@
> #include <rte_config.h>
> #include <rte_mempool.h>
> #include <rte_memory.h>
> -#include <rte_atomic.h>
> #include <rte_prefetch.h>
> #include <rte_branch_prediction.h>
> #include <rte_byteorder.h>
> @@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
> static inline uint16_t
> rte_mbuf_refcnt_read(const struct rte_mbuf *m)
> {
> - return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
> + return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
> static inline void
> rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
> {
> - rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
> + __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
> }
>
> /* internal */
> static inline uint16_t
> __rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
> {
> - return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
> + return __atomic_add_fetch(&m->refcnt, (uint16_t)value,
> + __ATOMIC_ACQ_REL);
> }
>
> /**
> @@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
> static inline uint16_t
> rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
> {
> - return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
> + return __atomic_load_n(&shinfo->refcnt_atomic, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -481,7 +481,7 @@ static inline void
> rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
> uint16_t new_value)
> {
> - rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
> + __atomic_store_n(&shinfo->refcnt_atomic, new_value, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
> return (uint16_t)value;
> }
>
> - return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
> + return __atomic_add_fetch(&shinfo->refcnt_atomic, (uint16_t)value,
> + __ATOMIC_ACQ_REL);
> }
>
> /** Mbuf prefetch */
> @@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
> * Direct usage of add primitive to avoid
> * duplication of comparing with one.
> */
> - if (likely(rte_atomic16_add_return
> - (&shinfo->refcnt_atomic, -1)))
> + if (likely(__atomic_add_fetch(&shinfo->refcnt_atomic, (uint16_t)-1,
> + __ATOMIC_ACQ_REL)))
> return 1;
>
> /* Reinitialize counter before mbuf freeing. */
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index 16600f1..d65d1c8 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -679,7 +679,7 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> struct rte_mbuf_ext_shared_info {
> rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
> void *fcb_opaque; /**< Free callback argument */
> - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> + uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> };
To avoid an API breakage (i.e. currently, an application that accesses
to refcnt_atomic expects that its type is rte_atomic16_t), I suggest to
do the same than in the mbuf struct:
union {
rte_atomic16_t refcnt_atomic;
uint16_t refcnt;
};
I hope the ABI checker won't complain.
It will also be better for 20.11 when the deprecated fields will be
renamed: the remaining one will be called 'refcnt' in both mbuf and
mbuf_ext_shared_info.
Olivier
^ permalink raw reply [relevance 3%]
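Olivier's anonymous-union suggestion can be sketched in plain C. Here `atomic16_t` is a stand-in for DPDK's rte_atomic16_t and `shinfo_compat` for rte_mbuf_ext_shared_info, so the exact layouts are assumptions; the point is that both union members alias the same 16-bit storage:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_atomic16_t: a struct wrapping a volatile int16_t. */
typedef struct { volatile int16_t cnt; } atomic16_t;

/* Sketch of the suggested shared-info layout: the legacy refcnt_atomic
 * name stays for API compatibility, while refcnt is a plain integer
 * usable with the C11 __atomic built-ins. Both occupy the same bytes. */
struct shinfo_compat {
	void *free_cb;
	void *fcb_opaque;
	union {
		atomic16_t refcnt_atomic;	/* legacy accessor */
		uint16_t refcnt;		/* C11 built-in accessor */
	};
};
```

Because the two members share one offset and one size, neither the struct size nor any field offset changes, which is why the ABI checker should not complain.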
* Re: [dpdk-dev] [PATCH v3] eal: use c11 atomic built-ins for interrupt status
2020-07-09 8:34 3% ` [dpdk-dev] [PATCH v3] " Phil Yang
@ 2020-07-09 10:30 0% ` David Marchand
2020-07-10 7:18 3% ` Dodji Seketeli
0 siblings, 1 reply; 200+ results
From: David Marchand @ 2020-07-09 10:30 UTC (permalink / raw)
To: Phil Yang, Ray Kinsella, Harman Kalra
Cc: dev, stefan.puiu, Aaron Conole, David Christensen,
Honnappa Nagarahalli, Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli, Neil Horman
On Thu, Jul 9, 2020 at 10:35 AM Phil Yang <phil.yang@arm.com> wrote:
>
> The event status is defined as a volatile variable and shared between
> threads. Use c11 atomic built-ins with explicit ordering instead of
> rte_atomic ops which enforce unnecessary barriers on aarch64.
>
> The event status has been cleaned up by the compare-and-swap operation
> when we free the event data, so there is no need to set it to invalid
> after that.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Harman Kalra <hkalra@marvell.com>
> ---
> v3:
> Fixed typo.
>
> v2:
> 1. Fixed typo.
> 2. Updated libabigail.abignore to pass ABI check.
> 3. Merged v1 two patches into one patch.
>
> devtools/libabigail.abignore | 4 +++
> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
> lib/librte_eal/linux/eal_interrupts.c | 48 ++++++++++++++++++++---------
> 3 files changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 0133f75..daa4631 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -48,6 +48,10 @@
> changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
> [suppress_variable]
> name = rte_crypto_aead_algorithm_strings
> +; Ignore updates of epoll event
> +[suppress_type]
> + type_kind = struct
> + name = rte_epoll_event
In general, ignoring all changes on a structure is risky.
But the risk is acceptable as long as we remember this for the rest of
the 20.08 release (and we will start from scratch for 20.11).
Without any comment from others, I'll merge this by the end of (my) day.
Thanks.
--
David Marchand
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins for refcnt operations
2020-07-07 10:10 3% ` [dpdk-dev] [PATCH v2] mbuf: use C11 " Phil Yang
2020-07-08 5:11 3% ` Phil Yang
2020-07-08 11:44 0% ` Olivier Matz
@ 2020-07-09 10:10 4% ` Phil Yang
2020-07-09 11:03 3% ` Olivier Matz
2020-07-09 15:58 4% ` [dpdk-dev] [PATCH v4 1/2] " Phil Yang
2 siblings, 2 replies; 200+ results
From: Phil Yang @ 2020-07-09 10:10 UTC (permalink / raw)
To: olivier.matz, dev
Cc: stephen, david.marchand, drc, Honnappa.Nagarahalli, Ruifeng.Wang, nd
Use C11 atomic built-ins with explicit ordering instead of rte_atomic
ops which enforce unnecessary barriers on aarch64.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v3:
1.Fix ABI breakage.
2.Simplify data type cast.
v2:
Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
to refcnt_atomic.
lib/librte_mbuf/rte_mbuf.c | 1 -
lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
lib/librte_mbuf/rte_mbuf_core.h | 2 +-
3 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index ae91ae2..8a456e5 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f8e492e..c1c0956 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -37,7 +37,6 @@
#include <rte_config.h>
#include <rte_mempool.h>
#include <rte_memory.h>
-#include <rte_atomic.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
#include <rte_byteorder.h>
@@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
static inline uint16_t
rte_mbuf_refcnt_read(const struct rte_mbuf *m)
{
- return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
+ return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
static inline void
rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
{
- rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
}
/* internal */
static inline uint16_t
__rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
{
- return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
+ return __atomic_add_fetch(&m->refcnt, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/**
@@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
static inline uint16_t
rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
{
- return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
+ return __atomic_load_n(&shinfo->refcnt_atomic, __ATOMIC_RELAXED);
}
/**
@@ -481,7 +481,7 @@ static inline void
rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
uint16_t new_value)
{
- rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&shinfo->refcnt_atomic, new_value, __ATOMIC_RELAXED);
}
/**
@@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
return (uint16_t)value;
}
- return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
+ return __atomic_add_fetch(&shinfo->refcnt_atomic, (uint16_t)value,
+ __ATOMIC_ACQ_REL);
}
/** Mbuf prefetch */
@@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
* Direct usage of add primitive to avoid
* duplication of comparing with one.
*/
- if (likely(rte_atomic16_add_return
- (&shinfo->refcnt_atomic, -1)))
+ if (likely(__atomic_add_fetch(&shinfo->refcnt_atomic, (uint16_t)-1,
+ __ATOMIC_ACQ_REL)))
return 1;
/* Reinitialize counter before mbuf freeing. */
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 16600f1..d65d1c8 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -679,7 +679,7 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
struct rte_mbuf_ext_shared_info {
rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
void *fcb_opaque; /**< Free callback argument */
- rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
};
/**< Maximum number of nb_segs allowed. */
--
2.7.4
^ permalink raw reply [relevance 4%]
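The refcnt helpers in the patch above reduce to this standalone sketch of the C11 built-in pattern (relaxed read/set, acquire-release update, decrement via a well-defined unsigned wrap-around); function names mirror the patch but the sketch is independent of the DPDK headers:

```c
#include <assert.h>
#include <stdint.h>

/* Relaxed load: reading the count needs no ordering on its own. */
static uint16_t refcnt_read(const uint16_t *r)
{
	return __atomic_load_n(r, __ATOMIC_RELAXED);
}

/* Relaxed store: used only while the object is not yet shared. */
static void refcnt_set(uint16_t *r, uint16_t v)
{
	__atomic_store_n(r, v, __ATOMIC_RELAXED);
}

/* Acquire-release RMW: the decrement that reaches zero must synchronize
 * with other threads' releases before the object is freed. A negative
 * 'value' is cast to uint16_t so the addition wraps modulo 2^16,
 * which is exactly an unsigned decrement. */
static uint16_t refcnt_update(uint16_t *r, int16_t value)
{
	return __atomic_add_fetch(r, (uint16_t)value, __ATOMIC_ACQ_REL);
}
```

On aarch64 this avoids the full barriers implied by the legacy rte_atomic16 ops while keeping the release/acquire pairing needed for safe freeing.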
* Re: [dpdk-dev] [PATCH v2] mbuf: use C11 atomics for refcnt operations
2020-07-08 11:44 0% ` Olivier Matz
@ 2020-07-09 10:00 3% ` Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-09 10:00 UTC (permalink / raw)
To: Olivier Matz
Cc: david.marchand, dev, drc, Honnappa Nagarahalli, Ruifeng Wang, nd
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Wednesday, July 8, 2020 7:44 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: david.marchand@redhat.com; dev@dpdk.org; drc@linux.vnet.ibm.com;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: Re: [PATCH v2] mbuf: use C11 atomics for refcnt operations
>
> Hi,
>
> On Tue, Jul 07, 2020 at 06:10:33PM +0800, Phil Yang wrote:
> > Use C11 atomics with explicit ordering instead of rte_atomic ops which
> > enforce unnecessary barriers on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > v2:
> > Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
> > to refcnt_atomic.
> >
> > lib/librte_mbuf/rte_mbuf.c | 1 -
> > lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> > lib/librte_mbuf/rte_mbuf_core.h | 11 +++--------
> > 3 files changed, 13 insertions(+), 18 deletions(-)
> >
<snip>
>
> It seems this patch does 2 things:
> - remove refcnt_atomic
> - use C11 atomics
>
> The first change is an API break. I think it should be announced in a
> deprecation
> notice. The one about atomic does not talk about it.
>
> So I suggest to keep refcnt_atomic until next version.
Agreed.
I did a local test, this approach doesn't have any ABI breakage issue.
I will update in the next version.
Thanks,
Phil
>
>
> > uint16_t nb_segs; /**< Number of segments. */
> >
> > /** Input port (16 bits to support more than 256 virtual ports).
> > @@ -679,7 +674,7 @@ typedef void
> (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> > struct rte_mbuf_ext_shared_info {
> > rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback
> function */
> > void *fcb_opaque; /**< Free callback argument */
> > - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > + uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> > };
> >
> > /**< Maximum number of nb_segs allowed. */
> > --
> > 2.7.4
> >
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v3] eal: use c11 atomic built-ins for interrupt status
2020-07-09 6:46 3% ` [dpdk-dev] [PATCH v2] eal: use c11 atomic built-ins " Phil Yang
2020-07-09 8:02 0% ` Stefan Puiu
@ 2020-07-09 8:34 3% ` Phil Yang
2020-07-09 10:30 0% ` David Marchand
1 sibling, 1 reply; 200+ results
From: Phil Yang @ 2020-07-09 8:34 UTC (permalink / raw)
To: david.marchand, dev
Cc: stefan.puiu, mdr, aconole, drc, Honnappa.Nagarahalli,
Ruifeng.Wang, nd, dodji, nhorman, hkalra
The event status is defined as a volatile variable and shared between
threads. Use c11 atomic built-ins with explicit ordering instead of
rte_atomic ops which enforce unnecessary barriers on aarch64.
The event status has been cleaned up by the compare-and-swap operation
when we free the event data, so there is no need to set it to invalid
after that.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
---
v3:
Fixed typo.
v2:
1. Fixed typo.
2. Updated libabigail.abignore to pass ABI check.
3. Merged v1 two patches into one patch.
devtools/libabigail.abignore | 4 +++
lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
lib/librte_eal/linux/eal_interrupts.c | 48 ++++++++++++++++++++---------
3 files changed, 38 insertions(+), 16 deletions(-)
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 0133f75..daa4631 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -48,6 +48,10 @@
changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
[suppress_variable]
name = rte_crypto_aead_algorithm_strings
+; Ignore updates of epoll event
+[suppress_type]
+ type_kind = struct
+ name = rte_epoll_event
;;;;;;;;;;;;;;;;;;;;;;
; Temporary exceptions till DPDK 20.11
diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
index 773a34a..b1e8a29 100644
--- a/lib/librte_eal/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/include/rte_eal_interrupts.h
@@ -59,7 +59,7 @@ enum {
/** interrupt epoll event obj, taken by epoll_event.ptr */
struct rte_epoll_event {
- volatile uint32_t status; /**< OUT: event status */
+ uint32_t status; /**< OUT: event status */
int fd; /**< OUT: event fd */
int epfd; /**< OUT: epoll instance the ev associated with */
struct rte_epoll_data epdata;
diff --git a/lib/librte_eal/linux/eal_interrupts.c b/lib/librte_eal/linux/eal_interrupts.c
index 84eeaa1..ad09049 100644
--- a/lib/librte_eal/linux/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal_interrupts.c
@@ -26,7 +26,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_debug.h>
#include <rte_log.h>
@@ -1221,11 +1220,18 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
{
unsigned int i, count = 0;
struct rte_epoll_event *rev;
+ uint32_t valid_status;
for (i = 0; i < n; i++) {
rev = evs[i].data.ptr;
- if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
- RTE_EPOLL_EXEC))
+ valid_status = RTE_EPOLL_VALID;
+ /* ACQUIRE memory ordering here pairs with RELEASE
+ * ordering below acting as a lock to synchronize
+ * the event data updating.
+ */
+ if (!rev || !__atomic_compare_exchange_n(&rev->status,
+ &valid_status, RTE_EPOLL_EXEC, 0,
+ __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
continue;
events[count].status = RTE_EPOLL_VALID;
@@ -1237,8 +1243,11 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
rev->epdata.cb_fun(rev->fd,
rev->epdata.cb_arg);
- rte_compiler_barrier();
- rev->status = RTE_EPOLL_VALID;
+ /* the status update should be observed after
+ * the other fields change.
+ */
+ __atomic_store_n(&rev->status, RTE_EPOLL_VALID,
+ __ATOMIC_RELEASE);
count++;
}
return count;
@@ -1308,10 +1317,14 @@ rte_epoll_wait(int epfd, struct rte_epoll_event *events,
static inline void
eal_epoll_data_safe_free(struct rte_epoll_event *ev)
{
- while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
- RTE_EPOLL_INVALID))
- while (ev->status != RTE_EPOLL_VALID)
+ uint32_t valid_status = RTE_EPOLL_VALID;
+ while (!__atomic_compare_exchange_n(&ev->status, &valid_status,
+ RTE_EPOLL_INVALID, 0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
+ while (__atomic_load_n(&ev->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_VALID)
rte_pause();
+ valid_status = RTE_EPOLL_VALID;
+ }
memset(&ev->epdata, 0, sizeof(ev->epdata));
ev->fd = -1;
ev->epfd = -1;
@@ -1333,7 +1346,8 @@ rte_epoll_ctl(int epfd, int op, int fd,
epfd = rte_intr_tls_epfd();
if (op == EPOLL_CTL_ADD) {
- event->status = RTE_EPOLL_VALID;
+ __atomic_store_n(&event->status, RTE_EPOLL_VALID,
+ __ATOMIC_RELAXED);
event->fd = fd; /* ignore fd in event */
event->epfd = epfd;
ev.data.ptr = (void *)event;
@@ -1345,11 +1359,13 @@ rte_epoll_ctl(int epfd, int op, int fd,
op, fd, strerror(errno));
if (op == EPOLL_CTL_ADD)
/* rollback status when CTL_ADD fail */
- event->status = RTE_EPOLL_INVALID;
+ __atomic_store_n(&event->status, RTE_EPOLL_INVALID,
+ __ATOMIC_RELAXED);
return -1;
}
- if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+ if (op == EPOLL_CTL_DEL && __atomic_load_n(&event->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_INVALID)
eal_epoll_data_safe_free(event);
return 0;
@@ -1378,7 +1394,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
case RTE_INTR_EVENT_ADD:
epfd_op = EPOLL_CTL_ADD;
rev = &intr_handle->elist[efd_idx];
- if (rev->status != RTE_EPOLL_INVALID) {
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_INVALID) {
RTE_LOG(INFO, EAL, "Event already been added.\n");
return -EEXIST;
}
@@ -1401,7 +1418,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
case RTE_INTR_EVENT_DEL:
epfd_op = EPOLL_CTL_DEL;
rev = &intr_handle->elist[efd_idx];
- if (rev->status == RTE_EPOLL_INVALID) {
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) == RTE_EPOLL_INVALID) {
RTE_LOG(INFO, EAL, "Event does not exist.\n");
return -EPERM;
}
@@ -1426,12 +1444,12 @@ rte_intr_free_epoll_fd(struct rte_intr_handle *intr_handle)
for (i = 0; i < intr_handle->nb_efd; i++) {
rev = &intr_handle->elist[i];
- if (rev->status == RTE_EPOLL_INVALID)
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) == RTE_EPOLL_INVALID)
continue;
if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
/* force free if the entry valid */
eal_epoll_data_safe_free(rev);
- rev->status = RTE_EPOLL_INVALID;
}
}
}
--
2.7.4
^ permalink raw reply [relevance 3%]
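The status protocol in the interrupt patch above acts as a small lock: an acquire CAS takes an event (VALID to EXEC) and a release store hands it back. A standalone single-threaded sketch of the state transitions, with illustrative names:

```c
#include <assert.h>
#include <stdint.h>

enum { EP_INVALID = 0, EP_VALID = 1, EP_EXEC = 2 };

/* "Take" an event: acquire CAS VALID -> EXEC. The ACQUIRE ordering pairs
 * with the RELEASE store in event_release(), so event data written before
 * the release is visible after a successful take. */
static int event_try_take(uint32_t *status)
{
	uint32_t expected = EP_VALID;

	return __atomic_compare_exchange_n(status, &expected, EP_EXEC,
			0 /* strong CAS */, __ATOMIC_ACQUIRE,
			__ATOMIC_RELAXED);
}

/* "Release" an event: all prior field updates must be observed before
 * the status becomes VALID again. */
static void event_release(uint32_t *status)
{
	__atomic_store_n(status, EP_VALID, __ATOMIC_RELEASE);
}
```

A concurrent taker that loses the CAS simply skips the event, which is why the patch can drop the old volatile qualifier and compiler barrier.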
* [dpdk-dev] [PATCH v8 1/3] lib/lpm: integrate RCU QSBR
2020-07-09 8:02 4% ` [dpdk-dev] [PATCH v8 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-09 8:02 2% ` Ruifeng Wang
0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-09 8:02 UTC (permalink / raw)
To: Bruce Richardson, Vladimir Medvedkin, John McNamara,
Marko Kovacevic, Ray Kinsella, Neil Horman
Cc: dev, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
Currently, the tbl8 group is freed even though the readers might be
using the tbl8 group entries. The freed tbl8 group can be reallocated
quickly. This results in incorrect lookup results.
RCU QSBR process is integrated for safe tbl8 group reclaim.
Refer to RCU documentation to understand various aspects of
integrating RCU library into other libraries.
To avoid ABI breakage, a struct __rte_lpm is created for lpm library
internal use. This struct wraps rte_lpm, which has been exposed, and
also includes members that don't need to be exposed, such as the RCU
related config.
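The wrapping described above follows the common container_of pattern: the public struct is embedded as the first member of an internal wrapper, and the wrapper is recovered from a public pointer by subtracting the member offset. A minimal standalone sketch (names are illustrative, not the real DPDK definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Public struct stays ABI-stable; internal state lives in a wrapper
 * that embeds it as the first member. */
struct lpm_pub { int number_tbl8s; };

struct lpm_internal {
	struct lpm_pub lpm;	/* exposed part, must stay first */
	void *rcu_v;		/* internal-only RCU state */
};

/* Recover the wrapper from a pointer to the embedded public struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
```

This is the same recovery the patch performs with `container_of(lpm, struct __rte_lpm, lpm)` in rte_lpm_free() and rte_lpm_rcu_qsbr_add().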
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
doc/guides/prog_guide/lpm_lib.rst | 32 ++++++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 167 +++++++++++++++++++++++++----
lib/librte_lpm/rte_lpm.h | 53 +++++++++
lib/librte_lpm/rte_lpm_version.map | 6 ++
6 files changed, 237 insertions(+), 24 deletions(-)
diff --git a/doc/guides/prog_guide/lpm_lib.rst b/doc/guides/prog_guide/lpm_lib.rst
index 1609a57d0..03945904b 100644
--- a/doc/guides/prog_guide/lpm_lib.rst
+++ b/doc/guides/prog_guide/lpm_lib.rst
@@ -145,6 +145,38 @@ depending on whether we need to move to the next table or not.
Prefix expansion is one of the keys of this algorithm,
since it improves the speed dramatically by adding redundancy.
+Deletion
+~~~~~~~~
+
+When deleting a rule, a replacement rule is searched for. The replacement rule is an existing rule that
+has the longest prefix match with the rule to be deleted, but a shorter prefix.
+
+If a replacement rule is found, the target tbl24 and tbl8 entries are updated to have the same depth and
+next hop value as the replacement rule.
+
+If no replacement rule can be found, the target tbl24 and tbl8 entries will be cleared.
+
+Prefix expansion is performed if the rule's depth is not exactly 24 bits or 32 bits.
+
+After deleting a rule, a group of tbl8s that belongs to the same tbl24 entry are freed in following cases:
+
+* All tbl8s in the group are empty.
+
+* All tbl8s in the group have the same values, with depth no greater than 24.
+
+Freeing tbl8s has different behaviors:
+
+* If RCU is not used, tbl8s are cleared and reclaimed immediately.
+
+* If RCU is used, tbl8s are reclaimed when readers are in quiescent state.
+
+When the LPM is not using RCU, a tbl8 group can be freed immediately even though readers might be using
+the tbl8 group entries. This can result in incorrect lookup results.
+
+RCU QSBR process is integrated for safe tbl8 group reclamation. The application has certain
+responsibilities while using this feature. Please refer to the resource reclamation framework of
+:ref:`RCU library <RCU_Library>` for more details.
+
Lookup
~~~~~~
diff --git a/lib/librte_lpm/Makefile b/lib/librte_lpm/Makefile
index d682785b6..6f06c5c03 100644
--- a/lib/librte_lpm/Makefile
+++ b/lib/librte_lpm/Makefile
@@ -8,7 +8,7 @@ LIB = librte_lpm.a
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR)
-LDLIBS += -lrte_eal -lrte_hash
+LDLIBS += -lrte_eal -lrte_hash -lrte_rcu
EXPORT_MAP := rte_lpm_version.map
diff --git a/lib/librte_lpm/meson.build b/lib/librte_lpm/meson.build
index 021ac6d8d..6cfc083c5 100644
--- a/lib/librte_lpm/meson.build
+++ b/lib/librte_lpm/meson.build
@@ -7,3 +7,4 @@ headers = files('rte_lpm.h', 'rte_lpm6.h')
# without worrying about which architecture we actually need
headers += files('rte_lpm_altivec.h', 'rte_lpm_neon.h', 'rte_lpm_sse.h')
deps += ['hash']
+deps += ['rcu']
diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index 38ab512a4..4fbf5b6df 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#include <string.h>
@@ -39,6 +40,17 @@ enum valid_flag {
VALID
};
+/** @internal LPM structure. */
+struct __rte_lpm {
+ /* LPM metadata. */
+ struct rte_lpm lpm;
+
+ /* RCU config. */
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
+ struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
+};
+
/* Macro to enable/disable run-time checks. */
#if defined(RTE_LIBRTE_LPM_DEBUG)
#include <rte_debug.h>
@@ -122,6 +134,7 @@ rte_lpm_create(const char *name, int socket_id,
const struct rte_lpm_config *config)
{
char mem_name[RTE_LPM_NAMESIZE];
+ struct __rte_lpm *internal_lpm = NULL;
struct rte_lpm *lpm = NULL;
struct rte_tailq_entry *te;
uint32_t mem_size, rules_size, tbl8s_size;
@@ -140,12 +153,6 @@ rte_lpm_create(const char *name, int socket_id,
snprintf(mem_name, sizeof(mem_name), "LPM_%s", name);
- /* Determine the amount of memory to allocate. */
- mem_size = sizeof(*lpm);
- rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
- tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
- RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
-
rte_mcfg_tailq_write_lock();
/* guarantee there's no existing */
@@ -161,6 +168,12 @@ rte_lpm_create(const char *name, int socket_id,
goto exit;
}
+ /* Determine the amount of memory to allocate. */
+ mem_size = sizeof(*internal_lpm);
+ rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
+ tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
+
/* allocate tailq entry */
te = rte_zmalloc("LPM_TAILQ_ENTRY", sizeof(*te), 0);
if (te == NULL) {
@@ -170,22 +183,23 @@ rte_lpm_create(const char *name, int socket_id,
}
/* Allocate memory to store the LPM data structures. */
- lpm = rte_zmalloc_socket(mem_name, mem_size,
+ internal_lpm = rte_zmalloc_socket(mem_name, mem_size,
RTE_CACHE_LINE_SIZE, socket_id);
- if (lpm == NULL) {
+ if (internal_lpm == NULL) {
RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
rte_free(te);
rte_errno = ENOMEM;
goto exit;
}
+ lpm = &internal_lpm->lpm;
lpm->rules_tbl = rte_zmalloc_socket(NULL,
(size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
if (lpm->rules_tbl == NULL) {
RTE_LOG(ERR, LPM, "LPM rules_tbl memory allocation failed\n");
- rte_free(lpm);
- lpm = NULL;
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
goto exit;
@@ -197,8 +211,8 @@ rte_lpm_create(const char *name, int socket_id,
if (lpm->tbl8 == NULL) {
RTE_LOG(ERR, LPM, "LPM tbl8 memory allocation failed\n");
rte_free(lpm->rules_tbl);
- rte_free(lpm);
- lpm = NULL;
+ rte_free(internal_lpm);
+ internal_lpm = NULL;
rte_free(te);
rte_errno = ENOMEM;
goto exit;
@@ -225,6 +239,7 @@ rte_lpm_create(const char *name, int socket_id,
void
rte_lpm_free(struct rte_lpm *lpm)
{
+ struct __rte_lpm *internal_lpm;
struct rte_lpm_list *lpm_list;
struct rte_tailq_entry *te;
@@ -246,12 +261,84 @@ rte_lpm_free(struct rte_lpm *lpm)
rte_mcfg_tailq_write_unlock();
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->dq)
+ rte_rcu_qsbr_dq_delete(internal_lpm->dq);
rte_free(lpm->tbl8);
rte_free(lpm->rules_tbl);
rte_free(lpm);
rte_free(te);
}
+static void
+__lpm_rcu_qsbr_free_resource(void *p, void *data, unsigned int n)
+{
+ struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
+ uint32_t tbl8_group_index = *(uint32_t *)data;
+ struct rte_lpm_tbl_entry *tbl8 = ((struct rte_lpm *)p)->tbl8;
+
+ RTE_SET_USED(n);
+ /* Set tbl8 group invalid */
+ __atomic_store(&tbl8[tbl8_group_index], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+}
+
+/* Associate QSBR variable with an LPM object.
+ */
+int
+rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq)
+{
+ struct __rte_lpm *internal_lpm;
+ char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
+ struct rte_rcu_qsbr_dq_parameters params = {0};
+
+ if (lpm == NULL || cfg == NULL) {
+ rte_errno = EINVAL;
+ return 1;
+ }
+
+ internal_lpm = container_of(lpm, struct __rte_lpm, lpm);
+ if (internal_lpm->v != NULL) {
+ rte_errno = EEXIST;
+ return 1;
+ }
+
+ if (cfg->mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* No other things to do. */
+ } else if (cfg->mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Init QSBR defer queue. */
+ snprintf(rcu_dq_name, sizeof(rcu_dq_name),
+ "LPM_RCU_%s", lpm->name);
+ params.name = rcu_dq_name;
+ params.size = cfg->dq_size;
+ if (params.size == 0)
+ params.size = lpm->number_tbl8s;
+ params.trigger_reclaim_limit = cfg->reclaim_thd;
+ params.max_reclaim_size = cfg->reclaim_max;
+ if (params.max_reclaim_size == 0)
+ params.max_reclaim_size = RTE_LPM_RCU_DQ_RECLAIM_MAX;
+ params.esize = sizeof(uint32_t); /* tbl8 group index */
+ params.free_fn = __lpm_rcu_qsbr_free_resource;
+ params.p = lpm;
+ params.v = cfg->v;
+ internal_lpm->dq = rte_rcu_qsbr_dq_create(¶ms);
+ if (internal_lpm->dq == NULL) {
+ RTE_LOG(ERR, LPM, "LPM defer queue creation failed\n");
+ return 1;
+ }
+ if (dq)
+ *dq = internal_lpm->dq;
+ } else {
+ rte_errno = EINVAL;
+ return 1;
+ }
+ internal_lpm->rcu_mode = cfg->mode;
+ internal_lpm->v = cfg->v;
+
+ return 0;
+}
+
/*
* Adds a rule to the rule table.
*
@@ -394,14 +481,15 @@ rule_find(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth)
* Find, clean and allocate a tbl8.
*/
static int32_t
-tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
+_tbl8_alloc(struct rte_lpm *lpm)
{
uint32_t group_idx; /* tbl8 group index. */
struct rte_lpm_tbl_entry *tbl8_entry;
/* Scan through tbl8 to find a free (i.e. INVALID) tbl8 group. */
- for (group_idx = 0; group_idx < number_tbl8s; group_idx++) {
- tbl8_entry = &tbl8[group_idx * RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
+ for (group_idx = 0; group_idx < lpm->number_tbl8s; group_idx++) {
+ tbl8_entry = &lpm->tbl8[group_idx *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES];
/* If a free tbl8 group is found clean it and set as VALID. */
if (!tbl8_entry->valid_group) {
struct rte_lpm_tbl_entry new_tbl8_entry = {
@@ -427,14 +515,47 @@ tbl8_alloc(struct rte_lpm_tbl_entry *tbl8, uint32_t number_tbl8s)
return -ENOSPC;
}
+static int32_t
+tbl8_alloc(struct rte_lpm *lpm)
+{
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
+ int32_t group_idx; /* tbl8 group index. */
+
+ group_idx = _tbl8_alloc(lpm);
+ if (group_idx == -ENOSPC && internal_lpm->dq != NULL) {
+ /* If there are no tbl8 groups try to reclaim one. */
+ if (rte_rcu_qsbr_dq_reclaim(internal_lpm->dq, 1,
+ NULL, NULL, NULL) == 0)
+ group_idx = _tbl8_alloc(lpm);
+ }
+
+ return group_idx;
+}
+
static void
-tbl8_free(struct rte_lpm_tbl_entry *tbl8, uint32_t tbl8_group_start)
+tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
{
- /* Set tbl8 group invalid*/
+ struct __rte_lpm *internal_lpm = container_of(lpm,
+ struct __rte_lpm, lpm);
struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
- __atomic_store(&tbl8[tbl8_group_start], &zero_tbl8_entry,
- __ATOMIC_RELAXED);
+ if (internal_lpm->v == NULL) {
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
+ /* Wait for quiescent state change. */
+ rte_rcu_qsbr_synchronize(internal_lpm->v,
+ RTE_QSBR_THRID_INVALID);
+ /* Set tbl8 group invalid*/
+ __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
+ __ATOMIC_RELAXED);
+ } else if (internal_lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
+ /* Push into QSBR defer queue. */
+ rte_rcu_qsbr_dq_enqueue(internal_lpm->dq,
+ (void *)&tbl8_group_start);
+ }
}
static __rte_noinline int32_t
@@ -523,7 +644,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
if (!lpm->tbl24[tbl24_index].valid) {
/* Search for a free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
/* Check tbl8 allocation was successful. */
if (tbl8_group_index < 0) {
@@ -569,7 +690,7 @@ add_depth_big(struct rte_lpm *lpm, uint32_t ip_masked, uint8_t depth,
} /* If valid entry but not extended calculate the index into Table8. */
else if (lpm->tbl24[tbl24_index].valid_group == 0) {
/* Search for free tbl8 group. */
- tbl8_group_index = tbl8_alloc(lpm->tbl8, lpm->number_tbl8s);
+ tbl8_group_index = tbl8_alloc(lpm);
if (tbl8_group_index < 0) {
return tbl8_group_index;
@@ -977,7 +1098,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
*/
lpm->tbl24[tbl24_index].valid = 0;
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
} else if (tbl8_recycle_index > -1) {
/* Update tbl24 entry. */
struct rte_lpm_tbl_entry new_tbl24_entry = {
@@ -993,7 +1114,7 @@ delete_depth_big(struct rte_lpm *lpm, uint32_t ip_masked,
__atomic_store(&lpm->tbl24[tbl24_index], &new_tbl24_entry,
__ATOMIC_RELAXED);
__atomic_thread_fence(__ATOMIC_RELEASE);
- tbl8_free(lpm->tbl8, tbl8_group_start);
+ tbl8_free(lpm, tbl8_group_start);
}
#undef group_idx
return 0;
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index b9d49ac87..a9568fcdd 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2020 Arm Limited
*/
#ifndef _RTE_LPM_H_
@@ -20,6 +21,7 @@
#include <rte_memory.h>
#include <rte_common.h>
#include <rte_vect.h>
+#include <rte_rcu_qsbr.h>
#ifdef __cplusplus
extern "C" {
@@ -62,6 +64,17 @@ extern "C" {
/** Bitmask used to indicate successful lookup */
#define RTE_LPM_LOOKUP_SUCCESS 0x01000000
+/** @internal Default RCU defer queue entries to reclaim in one go. */
+#define RTE_LPM_RCU_DQ_RECLAIM_MAX 16
+
+/** RCU reclamation modes */
+enum rte_lpm_qsbr_mode {
+ /** Create defer queue for reclaim. */
+ RTE_LPM_QSBR_MODE_DQ = 0,
+ /** Use blocking mode reclaim. No defer queue created. */
+ RTE_LPM_QSBR_MODE_SYNC
+};
+
#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
/** @internal Tbl24 entry structure. */
__extension__
@@ -132,6 +145,22 @@ struct rte_lpm {
struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
};
+/** LPM RCU QSBR configuration structure. */
+struct rte_lpm_rcu_config {
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ /* Mode of RCU QSBR. RTE_LPM_QSBR_MODE_xxx
+ * '0' for default: create defer queue for reclaim.
+ */
+ enum rte_lpm_qsbr_mode mode;
+ uint32_t dq_size; /* RCU defer queue size.
+ * default: lpm->number_tbl8s.
+ */
+ uint32_t reclaim_thd; /* Threshold to trigger auto reclaim. */
+ uint32_t reclaim_max; /* Max entries to reclaim in one go.
+ * default: RTE_LPM_RCU_DQ_RECLAIM_MAX.
+ */
+};
+
/**
* Create an LPM object.
*
@@ -179,6 +208,30 @@ rte_lpm_find_existing(const char *name);
void
rte_lpm_free(struct rte_lpm *lpm);
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Associate RCU QSBR variable with an LPM object.
+ *
+ * @param lpm
+ * the lpm object to add RCU QSBR
+ * @param cfg
+ * RCU QSBR configuration
+ * @param dq
+ * handler of created RCU QSBR defer queue
+ * @return
+ * On success - 0
+ * On error - 1 with error code set in rte_errno.
+ * Possible rte_errno codes are:
+ * - EINVAL - invalid pointer
+ * - EEXIST - already added QSBR
+ * - ENOMEM - memory allocation failure
+ */
+__rte_experimental
+int rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
+ struct rte_rcu_qsbr_dq **dq);
+
/**
* Add a rule to the LPM table.
*
diff --git a/lib/librte_lpm/rte_lpm_version.map b/lib/librte_lpm/rte_lpm_version.map
index 500f58b80..bfccd7eac 100644
--- a/lib/librte_lpm/rte_lpm_version.map
+++ b/lib/librte_lpm/rte_lpm_version.map
@@ -21,3 +21,9 @@ DPDK_20.0 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ rte_lpm_rcu_qsbr_add;
+};
--
2.17.1
^ permalink raw reply [relevance 2%]
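The patch above repeatedly uses container_of() to recover the private
struct __rte_lpm wrapper from the public struct rte_lpm pointer — that is
how the new RCU state (dq, v, rcu_mode) is added without breaking the ABI.
A minimal, self-contained sketch of that pattern (plain C, with hypothetical
struct names standing in for the DPDK ones):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the patch's layout: the public struct is
 * embedded in a larger internal struct, so existing users of the
 * public pointer are unaffected. */
struct pub_lpm {
	int number_tbl8s;
};

struct internal_lpm {
	void *dq;            /* private RCU defer-queue handle */
	struct pub_lpm lpm;  /* public part, handed out to callers */
};

/* container_of: recover the enclosing struct from a member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct internal_lpm *
to_internal(struct pub_lpm *lpm)
{
	return container_of(lpm, struct internal_lpm, lpm);
}
```

With this layout, functions such as rte_lpm_free() and tbl8_free() can reach
the private defer queue through the public pointer while the exported symbols
and public struct layout stay unchanged.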
* [dpdk-dev] [PATCH v8 0/3] RCU integration with LPM library
` (2 preceding siblings ...)
2020-07-07 15:15 3% ` [dpdk-dev] [PATCH v7 " Ruifeng Wang
@ 2020-07-09 8:02 4% ` Ruifeng Wang
2020-07-09 8:02 2% ` [dpdk-dev] [PATCH v8 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-09 15:42 4% ` [dpdk-dev] [PATCH v9 0/3] RCU integration with LPM library Ruifeng Wang
2020-07-10 2:22 4% ` [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library Ruifeng Wang
5 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-09 8:02 UTC (permalink / raw)
Cc: dev, mdr, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
This patchset integrates RCU QSBR support with the LPM library.
The resource reclamation implementation was split from the original
series and is already part of the RCU library. The series is reworked
to base the LPM integration on the RCU reclamation APIs.
A new API, rte_lpm_rcu_qsbr_add, is introduced for the application to
register an RCU variable that the LPM library will use. This provides
the user a handle to enable the RCU mechanism integrated in the LPM library.
Functional tests and performance tests are added to cover the
integration with RCU.
---
v8:
Fixed ABI issue by adding internal LPM control structure. (David)
Changed to use RFC5737 address in unit test. (Vladimir)
v7:
Fixed typos in document.
v6:
Remove ALLOW_EXPERIMENTAL_API from rte_lpm.c.
v5:
No default value for reclaim_thd. This allows reclamation triggering with every call.
Pass LPM pointer instead of tbl8 as argument of reclaim callback free function.
Updated group_idx check at tbl8 allocation.
Use enums instead of defines for different reclamation modes.
RCU QSBR integrated path is inside ALLOW_EXPERIMENTAL_API to avoid ABI change.
v4:
Allow user to configure defer queue: size, reclaim threshold, max entries.
Return defer queue handle so the user can manually trigger reclamation.
Add blocking mode support. Defer queue will not be created.
Honnappa Nagarahalli (1):
test/lpm: add RCU integration performance tests
Ruifeng Wang (2):
lib/lpm: integrate RCU QSBR
test/lpm: add LPM RCU integration functional tests
app/test/test_lpm.c | 291 ++++++++++++++++-
app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++-
doc/guides/prog_guide/lpm_lib.rst | 32 ++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 167 ++++++++--
lib/librte_lpm/rte_lpm.h | 53 ++++
lib/librte_lpm/rte_lpm_version.map | 6 +
8 files changed, 1016 insertions(+), 28 deletions(-)
--
2.17.1
^ permalink raw reply [relevance 4%]
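Based on the rte_lpm_rcu_config fields and error codes shown in the v8 patch,
an application would wire up the RCU variable roughly as follows. This is an
uncompiled sketch assuming DPDK's <rte_lpm.h> and <rte_rcu_qsbr.h>; the
placeholder initializations and the reclaim_thd value are illustrative only:

```c
/* Sketch — not compiled here; error handling abbreviated. */
struct rte_rcu_qsbr *v = /* allocated and initialized via rte_rcu_qsbr_init() */;
struct rte_lpm *lpm = /* from rte_lpm_create() */;
struct rte_rcu_qsbr_dq *dq = NULL;

struct rte_lpm_rcu_config cfg = {
	.v = v,
	.mode = RTE_LPM_QSBR_MODE_DQ, /* default: reclaim via defer queue */
	.dq_size = 0,                 /* 0 -> defaults to lpm->number_tbl8s */
	.reclaim_thd = 32,            /* illustrative trigger threshold */
	.reclaim_max = 0,             /* 0 -> RTE_LPM_RCU_DQ_RECLAIM_MAX */
};

if (rte_lpm_rcu_qsbr_add(lpm, &cfg, &dq) != 0) {
	/* rte_errno is EINVAL, EEXIST or ENOMEM per the API doc */
}
```

With RTE_LPM_QSBR_MODE_SYNC instead, no defer queue is created and tbl8_free()
blocks in rte_rcu_qsbr_synchronize() until readers pass a quiescent state.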
* Re: [dpdk-dev] [PATCH v2] eal: use c11 atomic built-ins for interrupt status
2020-07-09 6:46 3% ` [dpdk-dev] [PATCH v2] eal: use c11 atomic built-ins " Phil Yang
@ 2020-07-09 8:02 0% ` Stefan Puiu
2020-07-09 8:34 3% ` [dpdk-dev] [PATCH v3] " Phil Yang
1 sibling, 0 replies; 200+ results
From: Stefan Puiu @ 2020-07-09 8:02 UTC (permalink / raw)
To: Phil Yang
Cc: david.marchand, dev, mdr, aconole, drc, Honnappa.Nagarahalli,
Ruifeng.Wang, nd, dodji, Neil Horman, hkalra
Hi,
Noticed 2 typos:
On Thu, Jul 9, 2020 at 9:46 AM Phil Yang <phil.yang@arm.com> wrote:
>
> The event status is defined as a volatile variable and shared between
> threads. Use c11 atomic built-ins with explicit ordering instead of
> rte_atomic ops which enforce unnecessary barriers on aarch64.
>
> The event status has been cleaned up by the compare-and-swap operation
> when we free the event data, so there is no need to set it to invalid
> after that.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Harman Kalra <hkalra@marvell.com>
> ---
> v2:
> 1. Fixed typo.
> 2. Updated libabigail.abignore to pass ABI check.
> 3. Merged v1 two patches into one patch.
>
> devtools/libabigail.abignore | 4 +++
> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
> lib/librte_eal/linux/eal_interrupts.c | 48 ++++++++++++++++++++---------
> 3 files changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
> index 0133f75..daa4631 100644
> --- a/devtools/libabigail.abignore
> +++ b/devtools/libabigail.abignore
> @@ -48,6 +48,10 @@
> changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
> [suppress_variable]
> name = rte_crypto_aead_algorithm_strings
> +; Ignore updates of epoll event
> +[suppress_type]
> + type_kind = struct
> + name = rte_epoll_event
>
> ;;;;;;;;;;;;;;;;;;;;;;
> ; Temporary exceptions till DPDK 20.11
> diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
> index 773a34a..b1e8a29 100644
> --- a/lib/librte_eal/include/rte_eal_interrupts.h
> +++ b/lib/librte_eal/include/rte_eal_interrupts.h
> @@ -59,7 +59,7 @@ enum {
>
> /** interrupt epoll event obj, taken by epoll_event.ptr */
> struct rte_epoll_event {
> - volatile uint32_t status; /**< OUT: event status */
> + uint32_t status; /**< OUT: event status */
> int fd; /**< OUT: event fd */
> int epfd; /**< OUT: epoll instance the ev associated with */
> struct rte_epoll_data epdata;
> diff --git a/lib/librte_eal/linux/eal_interrupts.c b/lib/librte_eal/linux/eal_interrupts.c
> index 84eeaa1..7a50869 100644
> --- a/lib/librte_eal/linux/eal_interrupts.c
> +++ b/lib/librte_eal/linux/eal_interrupts.c
> @@ -26,7 +26,6 @@
> #include <rte_eal.h>
> #include <rte_per_lcore.h>
> #include <rte_lcore.h>
> -#include <rte_atomic.h>
> #include <rte_branch_prediction.h>
> #include <rte_debug.h>
> #include <rte_log.h>
> @@ -1221,11 +1220,18 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
> {
> unsigned int i, count = 0;
> struct rte_epoll_event *rev;
> + uint32_t valid_status;
>
> for (i = 0; i < n; i++) {
> rev = evs[i].data.ptr;
> - if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
> - RTE_EPOLL_EXEC))
> + valid_status = RTE_EPOLL_VALID;
> + /* ACQUIRE memory ordering here pairs with RELEASE
> + * ordering bellow acting as a lock to synchronize
s/bellow/below
> + * the event data updating.
> + */
> + if (!rev || !__atomic_compare_exchange_n(&rev->status,
> + &valid_status, RTE_EPOLL_EXEC, 0,
> + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
> continue;
>
> events[count].status = RTE_EPOLL_VALID;
> @@ -1237,8 +1243,11 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
> rev->epdata.cb_fun(rev->fd,
> rev->epdata.cb_arg);
>
> - rte_compiler_barrier();
> - rev->status = RTE_EPOLL_VALID;
> + /* the status update should be observed after
> + * the other fields changes.
s/fields changes/fields change/
Thanks,
Stefan.
> + */
> + __atomic_store_n(&rev->status, RTE_EPOLL_VALID,
> + __ATOMIC_RELEASE);
> count++;
> }
> return count;
> @@ -1308,10 +1317,14 @@ rte_epoll_wait(int epfd, struct rte_epoll_event *events,
> static inline void
> eal_epoll_data_safe_free(struct rte_epoll_event *ev)
> {
> - while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
> - RTE_EPOLL_INVALID))
> - while (ev->status != RTE_EPOLL_VALID)
> + uint32_t valid_status = RTE_EPOLL_VALID;
> + while (!__atomic_compare_exchange_n(&ev->status, &valid_status,
> + RTE_EPOLL_INVALID, 0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
> + while (__atomic_load_n(&ev->status,
> + __ATOMIC_RELAXED) != RTE_EPOLL_VALID)
> rte_pause();
> + valid_status = RTE_EPOLL_VALID;
> + }
> memset(&ev->epdata, 0, sizeof(ev->epdata));
> ev->fd = -1;
> ev->epfd = -1;
> @@ -1333,7 +1346,8 @@ rte_epoll_ctl(int epfd, int op, int fd,
> epfd = rte_intr_tls_epfd();
>
> if (op == EPOLL_CTL_ADD) {
> - event->status = RTE_EPOLL_VALID;
> + __atomic_store_n(&event->status, RTE_EPOLL_VALID,
> + __ATOMIC_RELAXED);
> event->fd = fd; /* ignore fd in event */
> event->epfd = epfd;
> ev.data.ptr = (void *)event;
> @@ -1345,11 +1359,13 @@ rte_epoll_ctl(int epfd, int op, int fd,
> op, fd, strerror(errno));
> if (op == EPOLL_CTL_ADD)
> /* rollback status when CTL_ADD fail */
> - event->status = RTE_EPOLL_INVALID;
> + __atomic_store_n(&event->status, RTE_EPOLL_INVALID,
> + __ATOMIC_RELAXED);
> return -1;
> }
>
> - if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
> + if (op == EPOLL_CTL_DEL && __atomic_load_n(&event->status,
> + __ATOMIC_RELAXED) != RTE_EPOLL_INVALID)
> eal_epoll_data_safe_free(event);
>
> return 0;
> @@ -1378,7 +1394,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
> case RTE_INTR_EVENT_ADD:
> epfd_op = EPOLL_CTL_ADD;
> rev = &intr_handle->elist[efd_idx];
> - if (rev->status != RTE_EPOLL_INVALID) {
> + if (__atomic_load_n(&rev->status,
> + __ATOMIC_RELAXED) != RTE_EPOLL_INVALID) {
> RTE_LOG(INFO, EAL, "Event already been added.\n");
> return -EEXIST;
> }
> @@ -1401,7 +1418,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
> case RTE_INTR_EVENT_DEL:
> epfd_op = EPOLL_CTL_DEL;
> rev = &intr_handle->elist[efd_idx];
> - if (rev->status == RTE_EPOLL_INVALID) {
> + if (__atomic_load_n(&rev->status,
> + __ATOMIC_RELAXED) == RTE_EPOLL_INVALID) {
> RTE_LOG(INFO, EAL, "Event does not exist.\n");
> return -EPERM;
> }
> @@ -1426,12 +1444,12 @@ rte_intr_free_epoll_fd(struct rte_intr_handle *intr_handle)
>
> for (i = 0; i < intr_handle->nb_efd; i++) {
> rev = &intr_handle->elist[i];
> - if (rev->status == RTE_EPOLL_INVALID)
> + if (__atomic_load_n(&rev->status,
> + __ATOMIC_RELAXED) == RTE_EPOLL_INVALID)
> continue;
> if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
> /* force free if the entry valid */
> eal_epoll_data_safe_free(rev);
> - rev->status = RTE_EPOLL_INVALID;
> }
> }
> }
> --
> 2.7.4
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH] devtools: fix ninja break under default DESTDIR path
@ 2020-07-09 6:53 4% Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-09 6:53 UTC (permalink / raw)
To: david.marchand, dev; +Cc: Honnappa.Nagarahalli, Ruifeng.Wang, nd
If DPDK_ABI_REF_DIR is not set, the default DESTDIR is a relative path.
This will break ninja in the ABI check test.
Fixes: 777014e56d07 ("devtools: add ABI checks")
Signed-off-by: Phil Yang <phil.yang@arm.com>
---
devtools/test-meson-builds.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
index a87de63..2bfcaca 100755
--- a/devtools/test-meson-builds.sh
+++ b/devtools/test-meson-builds.sh
@@ -143,7 +143,7 @@ build () # <directory> <target compiler | cross file> <meson options>
config $srcdir $builds_dir/$targetdir $cross --werror $*
compile $builds_dir/$targetdir
if [ -n "$DPDK_ABI_REF_VERSION" ]; then
- abirefdir=${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION
+ abirefdir=${DPDK_ABI_REF_DIR:-$(pwd)/reference}/$DPDK_ABI_REF_VERSION
if [ ! -d $abirefdir/$targetdir ]; then
# clone current sources
if [ ! -d $abirefdir/src ]; then
--
2.7.4
^ permalink raw reply [relevance 4%]
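The one-line fix relies on the ${VAR:-default} expansion: the default must be
an absolute path because ninja is later run from a different working
directory, where a relative reference dir no longer resolves. A small
illustration of the expansion itself (plain POSIX shell, synthetic variable
names — not the test-meson-builds.sh code):

```shell
#!/bin/sh
# Unset: the default expands to an absolute path under the current directory.
unset REF_DIR
abirefdir=${REF_DIR:-$(pwd)/reference}/v20.02
echo "$abirefdir"

# Set: the user-provided value wins and the default is ignored.
REF_DIR=/opt/abi-refs
abirefdir=${REF_DIR:-$(pwd)/reference}/v20.02
echo "$abirefdir"
```

The same expansion with a relative default ("reference/...") is what broke the
ABI check before the patch.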
* [dpdk-dev] [PATCH v2] eal: use c11 atomic built-ins for interrupt status
@ 2020-07-09 6:46 3% ` Phil Yang
2020-07-09 8:02 0% ` Stefan Puiu
2020-07-09 8:34 3% ` [dpdk-dev] [PATCH v3] " Phil Yang
1 sibling, 2 replies; 200+ results
From: Phil Yang @ 2020-07-09 6:46 UTC (permalink / raw)
To: david.marchand, dev
Cc: mdr, aconole, drc, Honnappa.Nagarahalli, Ruifeng.Wang, nd, dodji,
nhorman, hkalra
The event status is defined as a volatile variable and shared between
threads. Use c11 atomic built-ins with explicit ordering instead of
rte_atomic ops which enforce unnecessary barriers on aarch64.
The event status has been cleaned up by the compare-and-swap operation
when we free the event data, so there is no need to set it to invalid
after that.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
---
v2:
1. Fixed typo.
2. Updated libabigail.abignore to pass ABI check.
3. Merged v1 two patches into one patch.
devtools/libabigail.abignore | 4 +++
lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
lib/librte_eal/linux/eal_interrupts.c | 48 ++++++++++++++++++++---------
3 files changed, 38 insertions(+), 16 deletions(-)
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 0133f75..daa4631 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -48,6 +48,10 @@
changed_enumerators = RTE_CRYPTO_AEAD_LIST_END
[suppress_variable]
name = rte_crypto_aead_algorithm_strings
+; Ignore updates of epoll event
+[suppress_type]
+ type_kind = struct
+ name = rte_epoll_event
;;;;;;;;;;;;;;;;;;;;;;
; Temporary exceptions till DPDK 20.11
diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
index 773a34a..b1e8a29 100644
--- a/lib/librte_eal/include/rte_eal_interrupts.h
+++ b/lib/librte_eal/include/rte_eal_interrupts.h
@@ -59,7 +59,7 @@ enum {
/** interrupt epoll event obj, taken by epoll_event.ptr */
struct rte_epoll_event {
- volatile uint32_t status; /**< OUT: event status */
+ uint32_t status; /**< OUT: event status */
int fd; /**< OUT: event fd */
int epfd; /**< OUT: epoll instance the ev associated with */
struct rte_epoll_data epdata;
diff --git a/lib/librte_eal/linux/eal_interrupts.c b/lib/librte_eal/linux/eal_interrupts.c
index 84eeaa1..7a50869 100644
--- a/lib/librte_eal/linux/eal_interrupts.c
+++ b/lib/librte_eal/linux/eal_interrupts.c
@@ -26,7 +26,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_debug.h>
#include <rte_log.h>
@@ -1221,11 +1220,18 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
{
unsigned int i, count = 0;
struct rte_epoll_event *rev;
+ uint32_t valid_status;
for (i = 0; i < n; i++) {
rev = evs[i].data.ptr;
- if (!rev || !rte_atomic32_cmpset(&rev->status, RTE_EPOLL_VALID,
- RTE_EPOLL_EXEC))
+ valid_status = RTE_EPOLL_VALID;
+ /* ACQUIRE memory ordering here pairs with RELEASE
+ * ordering bellow acting as a lock to synchronize
+ * the event data updating.
+ */
+ if (!rev || !__atomic_compare_exchange_n(&rev->status,
+ &valid_status, RTE_EPOLL_EXEC, 0,
+ __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
continue;
events[count].status = RTE_EPOLL_VALID;
@@ -1237,8 +1243,11 @@ eal_epoll_process_event(struct epoll_event *evs, unsigned int n,
rev->epdata.cb_fun(rev->fd,
rev->epdata.cb_arg);
- rte_compiler_barrier();
- rev->status = RTE_EPOLL_VALID;
+ /* the status update should be observed after
+ * the other fields changes.
+ */
+ __atomic_store_n(&rev->status, RTE_EPOLL_VALID,
+ __ATOMIC_RELEASE);
count++;
}
return count;
@@ -1308,10 +1317,14 @@ rte_epoll_wait(int epfd, struct rte_epoll_event *events,
static inline void
eal_epoll_data_safe_free(struct rte_epoll_event *ev)
{
- while (!rte_atomic32_cmpset(&ev->status, RTE_EPOLL_VALID,
- RTE_EPOLL_INVALID))
- while (ev->status != RTE_EPOLL_VALID)
+ uint32_t valid_status = RTE_EPOLL_VALID;
+ while (!__atomic_compare_exchange_n(&ev->status, &valid_status,
+ RTE_EPOLL_INVALID, 0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) {
+ while (__atomic_load_n(&ev->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_VALID)
rte_pause();
+ valid_status = RTE_EPOLL_VALID;
+ }
memset(&ev->epdata, 0, sizeof(ev->epdata));
ev->fd = -1;
ev->epfd = -1;
@@ -1333,7 +1346,8 @@ rte_epoll_ctl(int epfd, int op, int fd,
epfd = rte_intr_tls_epfd();
if (op == EPOLL_CTL_ADD) {
- event->status = RTE_EPOLL_VALID;
+ __atomic_store_n(&event->status, RTE_EPOLL_VALID,
+ __ATOMIC_RELAXED);
event->fd = fd; /* ignore fd in event */
event->epfd = epfd;
ev.data.ptr = (void *)event;
@@ -1345,11 +1359,13 @@ rte_epoll_ctl(int epfd, int op, int fd,
op, fd, strerror(errno));
if (op == EPOLL_CTL_ADD)
/* rollback status when CTL_ADD fail */
- event->status = RTE_EPOLL_INVALID;
+ __atomic_store_n(&event->status, RTE_EPOLL_INVALID,
+ __ATOMIC_RELAXED);
return -1;
}
- if (op == EPOLL_CTL_DEL && event->status != RTE_EPOLL_INVALID)
+ if (op == EPOLL_CTL_DEL && __atomic_load_n(&event->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_INVALID)
eal_epoll_data_safe_free(event);
return 0;
@@ -1378,7 +1394,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
case RTE_INTR_EVENT_ADD:
epfd_op = EPOLL_CTL_ADD;
rev = &intr_handle->elist[efd_idx];
- if (rev->status != RTE_EPOLL_INVALID) {
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) != RTE_EPOLL_INVALID) {
RTE_LOG(INFO, EAL, "Event already been added.\n");
return -EEXIST;
}
@@ -1401,7 +1418,8 @@ rte_intr_rx_ctl(struct rte_intr_handle *intr_handle, int epfd,
case RTE_INTR_EVENT_DEL:
epfd_op = EPOLL_CTL_DEL;
rev = &intr_handle->elist[efd_idx];
- if (rev->status == RTE_EPOLL_INVALID) {
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) == RTE_EPOLL_INVALID) {
RTE_LOG(INFO, EAL, "Event does not exist.\n");
return -EPERM;
}
@@ -1426,12 +1444,12 @@ rte_intr_free_epoll_fd(struct rte_intr_handle *intr_handle)
for (i = 0; i < intr_handle->nb_efd; i++) {
rev = &intr_handle->elist[i];
- if (rev->status == RTE_EPOLL_INVALID)
+ if (__atomic_load_n(&rev->status,
+ __ATOMIC_RELAXED) == RTE_EPOLL_INVALID)
continue;
if (rte_epoll_ctl(rev->epfd, EPOLL_CTL_DEL, rev->fd, rev)) {
/* force free if the entry valid */
eal_epoll_data_safe_free(rev);
- rev->status = RTE_EPOLL_INVALID;
}
}
}
--
2.7.4
^ permalink raw reply [relevance 3%]
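The core of the patch is replacing rte_atomic32_cmpset() plus a compiler
barrier with an acquire CAS / release store pair that brackets the event-data
update. A standalone sketch of that pattern using the same GCC __atomic
builtins (plain C with a synthetic event struct — not the DPDK code itself):

```c
#include <assert.h>
#include <stdint.h>

enum { EV_INVALID, EV_VALID, EV_EXEC };

struct event {
	uint32_t status;
	int fd;
};

/* Try to take the event for processing: VALID -> EXEC with ACQUIRE,
 * so subsequent reads of the event data are ordered after the CAS.
 * Returns nonzero on success, zero if the event was not VALID. */
static int
event_take(struct event *ev)
{
	uint32_t expected = EV_VALID;
	return __atomic_compare_exchange_n(&ev->status, &expected, EV_EXEC,
			0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Publish the event again: RELEASE ensures prior writes to the event
 * data are visible before the status flips back to VALID. */
static void
event_release(struct event *ev)
{
	__atomic_store_n(&ev->status, EV_VALID, __ATOMIC_RELEASE);
}
```

The acquire/release pairing makes the status field act as a lock around the
event data, which is why the volatile qualifier and the extra full barriers of
the rte_atomic ops become unnecessary on aarch64.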
* [dpdk-dev] [PATCH v2 2/3] ring: remove experimental tag for ring element APIs
2020-07-09 6:12 3% ` [dpdk-dev] [PATCH v2 1/3] ring: remove experimental tag for ring reset API Feifei Wang
@ 2020-07-09 6:12 3% ` Feifei Wang
1 sibling, 0 replies; 200+ results
From: Feifei Wang @ 2020-07-09 6:12 UTC (permalink / raw)
To: Honnappa Nagarahalli, Konstantin Ananyev, Ray Kinsella, Neil Horman
Cc: dev, nd, Ruifeng.wang, Feifei Wang
Remove the experimental tag for rte_ring_xxx_elem APIs that have been
around for 2 releases.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
v2:
1. add the changed API into DPDK_21 ABI in the map file. (Ray)
lib/librte_ring/rte_ring.h | 5 +----
lib/librte_ring/rte_ring_elem.h | 8 --------
lib/librte_ring/rte_ring_version.map | 10 ++--------
3 files changed, 3 insertions(+), 20 deletions(-)
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7181c33b4..35f3f8c42 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -40,6 +40,7 @@ extern "C" {
#endif
#include <rte_ring_core.h>
+#include <rte_ring_elem.h>
/**
* Calculate the memory size needed for a ring
@@ -401,10 +402,6 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
RTE_RING_SYNC_ST, free_space);
}
-#ifdef ALLOW_EXPERIMENTAL_API
-#include <rte_ring_elem.h>
-#endif
-
/**
* Enqueue several objects on a ring.
*
diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
index 9e5192ae6..69dc51746 100644
--- a/lib/librte_ring/rte_ring_elem.h
+++ b/lib/librte_ring/rte_ring_elem.h
@@ -23,9 +23,6 @@ extern "C" {
#include <rte_ring_core.h>
/**
- * @warning
- * @b EXPERIMENTAL: this API may change without prior notice
- *
* Calculate the memory size needed for a ring with given element size
*
* This function returns the number of bytes needed for a ring, given
@@ -43,13 +40,9 @@ extern "C" {
* - -EINVAL - esize is not a multiple of 4 or count provided is not a
* power of 2.
*/
-__rte_experimental
ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
/**
- * @warning
- * @b EXPERIMENTAL: this API may change without prior notice
- *
* Create a new ring named *name* that stores elements with given size.
*
* This function uses ``memzone_reserve()`` to allocate memory. Then it
@@ -109,7 +102,6 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
* - EEXIST - a memzone with the same name already exists
* - ENOMEM - no appropriate memory area found in which to create memzone
*/
-__rte_experimental
struct rte_ring *rte_ring_create_elem(const char *name, unsigned int esize,
unsigned int count, int socket_id, unsigned int flags);
diff --git a/lib/librte_ring/rte_ring_version.map b/lib/librte_ring/rte_ring_version.map
index 9a6ce4d32..ac392f3ca 100644
--- a/lib/librte_ring/rte_ring_version.map
+++ b/lib/librte_ring/rte_ring_version.map
@@ -15,13 +15,7 @@ DPDK_20.0 {
DPDK_21 {
global:
- rte_ring_reset;
-} DPDK_20.0;
-
-EXPERIMENTAL {
- global:
-
- # added in 20.02
rte_ring_create_elem;
rte_ring_get_memsize_elem;
-};
+ rte_ring_reset;
+} DPDK_20.0;
--
2.17.1
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v2 1/3] ring: remove experimental tag for ring reset API
@ 2020-07-09 6:12 3% ` Feifei Wang
2020-07-09 6:12 3% ` [dpdk-dev] [PATCH v2 2/3] ring: remove experimental tag for ring element APIs Feifei Wang
1 sibling, 0 replies; 200+ results
From: Feifei Wang @ 2020-07-09 6:12 UTC (permalink / raw)
To: Honnappa Nagarahalli, Konstantin Ananyev, Ray Kinsella, Neil Horman
Cc: dev, nd, Ruifeng.wang, Feifei Wang
Remove the experimental tag for rte_ring_reset API that have been around
for 4 releases.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v2:
1. add the changed API into DPDK_21 ABI in the map file. (Ray)
lib/librte_ring/rte_ring.h | 3 ---
lib/librte_ring/rte_ring_version.map | 7 +++++--
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index f67141482..7181c33b4 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void **obj_p)
*
* This function flush all the elements in a ring
*
- * @b EXPERIMENTAL: this API may change without prior notice
- *
* @warning
* Make sure the ring is not in use while calling this function.
*
* @param r
* A pointer to the ring structure.
*/
-__rte_experimental
void
rte_ring_reset(struct rte_ring *r);
diff --git a/lib/librte_ring/rte_ring_version.map b/lib/librte_ring/rte_ring_version.map
index e88c143cf..9a6ce4d32 100644
--- a/lib/librte_ring/rte_ring_version.map
+++ b/lib/librte_ring/rte_ring_version.map
@@ -12,11 +12,14 @@ DPDK_20.0 {
local: *;
};
-EXPERIMENTAL {
+DPDK_21 {
global:
- # added in 19.08
rte_ring_reset;
+} DPDK_20.0;
+
+EXPERIMENTAL {
+ global:
# added in 20.02
rte_ring_create_elem;
--
2.17.1
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status
2020-07-08 15:04 0% ` Kinsella, Ray
@ 2020-07-09 5:21 0% ` Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-09 5:21 UTC (permalink / raw)
To: Kinsella, Ray, David Marchand, Aaron Conole
Cc: dev, David Christensen, Honnappa Nagarahalli, Ruifeng Wang, nd,
Dodji Seketeli, Neil Horman, Harman Kalra
> -----Original Message-----
> From: Kinsella, Ray <mdr@ashroe.eu>
> Sent: Wednesday, July 8, 2020 11:05 PM
> To: David Marchand <david.marchand@redhat.com>; Phil Yang
> <Phil.Yang@arm.com>; Aaron Conole <aconole@redhat.com>
> Cc: dev <dev@dpdk.org>; David Christensen <drc@linux.vnet.ibm.com>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; Dodji Seketeli
> <dodji@redhat.com>; Neil Horman <nhorman@tuxdriver.com>; Harman
> Kalra <hkalra@marvell.com>
> Subject: Re: [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status
>
>
>
> On 08/07/2020 13:29, David Marchand wrote:
> > On Thu, Jun 11, 2020 at 12:25 PM Phil Yang <phil.yang@arm.com> wrote:
> >>
> >> The event status is defined as a volatile variable and shared
> >> between threads. Use c11 atomics with explicit ordering instead
> >> of rte_atomic ops which enforce unnecessary barriers on aarch64.
> >>
> >> Signed-off-by: Phil Yang <phil.yang@arm.com>
> >> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >> ---
> >> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
> >> lib/librte_eal/linux/eal_interrupts.c | 47 ++++++++++++++++++++----
> -----
> >> 2 files changed, 34 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/lib/librte_eal/include/rte_eal_interrupts.h
> b/lib/librte_eal/include/rte_eal_interrupts.h
> >> index 773a34a..b1e8a29 100644
> >> --- a/lib/librte_eal/include/rte_eal_interrupts.h
> >> +++ b/lib/librte_eal/include/rte_eal_interrupts.h
> >> @@ -59,7 +59,7 @@ enum {
> >>
> >> /** interrupt epoll event obj, taken by epoll_event.ptr */
> >> struct rte_epoll_event {
> >> - volatile uint32_t status; /**< OUT: event status */
> >> + uint32_t status; /**< OUT: event status */
> >> int fd; /**< OUT: event fd */
> >> int epfd; /**< OUT: epoll instance the ev associated with */
> >> struct rte_epoll_data epdata;
> >
> > I got a reject from the ABI check in my env.
> >
> > 1 function with some indirect sub-type change:
> >
> > [C]'function int rte_pci_ioport_map(rte_pci_device*, int,
> > rte_pci_ioport*)' at pci.c:756:1 has some indirect sub-type changes:
> > parameter 1 of type 'rte_pci_device*' has sub-type changes:
> > in pointed to type 'struct rte_pci_device' at rte_bus_pci.h:57:1:
> > type size hasn't changed
> > 1 data member changes (2 filtered):
> > type of 'rte_intr_handle rte_pci_device::intr_handle' changed:
> > type size hasn't changed
> > 1 data member change:
> > type of 'rte_epoll_event rte_intr_handle::elist[512]' changed:
> > array element type 'struct rte_epoll_event' changed:
> > type size hasn't changed
> > 1 data member change:
> > type of 'volatile uint32_t rte_epoll_event::status' changed:
> > entity changed from 'volatile uint32_t' to 'typedef
> > uint32_t' at stdint-uintn.h:26:1
> > type size hasn't changed
> >
> > type size hasn't changed
> >
> >
> > This is probably harmless in our case (going from volatile to non
> > volatile), but it won't pass the check in the CI without an exception
> > rule.
> >
> > Note: checking on the test-report ml, I saw nothing, but ovsrobot did
> > catch the issue with this change too, Aaron?
> >
> >
> Agreed, probably harmless and requires something in libabigail.ignore.
OK. Will update libabigail.ignore in the next version.
Thanks,
Phil
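A suppression entry along these lines could cover the report (a hypothetical sketch; the exact file location and property names should be checked against the libabigail suppression-specification manual — the volatile-qualifier change alters neither the size nor the layout of the struct):

```
; Hypothetical entry for the libabigail ignore file: the change from
; "volatile uint32_t" to plain "uint32_t" in rte_epoll_event does not
; alter struct size or member layout, so reports on it are suppressed.
[suppress_type]
	type_kind = struct
	name = rte_epoll_event
```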
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-08 15:47 2% ` [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling Viacheslav Ovsiienko
@ 2020-07-08 16:05 0% ` Slava Ovsiienko
0 siblings, 0 replies; 200+ results
From: Slava Ovsiienko @ 2020-07-08 16:05 UTC (permalink / raw)
To: Slava Ovsiienko, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, bernard.iremonger,
thomas, mb
> promote Acked-by from the previous patch version to maintain patchwork status accordingly
Acked-by: Olivier Matz <olivier.matz@6wind.com>
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Viacheslav Ovsiienko
> Sent: Wednesday, July 8, 2020 18:47
> To: dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com;
> bernard.iremonger@intel.com; thomas@monjalon.com;
> mb@smartsharesystems.com
> Subject: [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx
> scheduling
>
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive) the
> packets at the very precisely specified moment of time provides the
> opportunity to support the connections with Time Division Multiplexing using
> the contemporary general purpose NIC without involving an auxiliary
> hardware. For example, the supporting of O-RAN Fronthaul interface is one
> of the promising features for potentially usage of the precise time
> management for the egress packets.
>
> The main objective of this RFC is to specify the way how applications can
> provide the moment of time at what the packet transmission must be started
> and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not explicitly
> defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return the current
> device timestamp. The dynamic timestamp flag tells whether the field
> contains actual timestamp value. For the packets being sent this value can be
> used by PMD to schedule packet sending.
>
> The device clock is an opaque entity; the units and frequency are vendor specific
> and might depend on hardware capabilities and configuration. It might (or
> not) be synchronized with real time via PTP, and might (or not) be synchronous
> with the CPU clock (for example, if the NIC and CPU share the same clock source
> there might be no drift between the NIC and CPU clocks), etc.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> obsoleting, these dynamic flag and field will be used to manage the
> timestamps on receiving datapath as well. Having the dedicated flags for
> Rx/Tx timestamps allows applications not to perform explicit flags reset on
> forwarding and not to promote received timestamps to the transmitting
> datapath by default. The static PKT_RX_TIMESTAMP is considered as
> candidate to become the dynamic flag.
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent it
> tries to synchronize the time of packet appearing on the wire with the
> specified packet timestamp. If the specified one is in the past it should be
> ignored, if one is in the distant future it should be capped with some
> reasonable value (in range of seconds). These specific cases ("too late" and
> "distant future") can be optionally reported via device xstats to assist
> applications to detect the time-related problems.
>
> No packet reordering according to timestamps is assumed, neither
> within a packet burst nor between packets; it is entirely the application's
> responsibility to generate packets and their timestamps in the desired order. The
> timestamp can be put only in the first packet of a burst, providing
> scheduling for the entire burst.
>
> PMD reports the ability to synchronize packet sending on timestamp with
> new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API about
> reporting/managing the supported dynamic flags and its related features.
> This API would break ABI compatibility and can't be introduced at the
> moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As a result, the bursts of packets will be transmitted with specific delays
> between the packets within a burst and a specific delay between the bursts.
> rte_eth_read_clock() is supposed to be used to get the current device
> clock value and provide the reference for the timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>
> ---
> v1->v4:
> - dedicated dynamic Tx timestamp flag instead of shared with Rx
> v4->v5:
> - elaborated commit message
> - more words about device clocks added,
> - note about dedicated Rx/Tx timestamp flags added
>
> ---
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++ lib/librte_mbuf/rte_mbuf_dyn.h |
> 31 +++++++++++++++++++++++++++++++
> 3 files changed, 36 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */ #define
> DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */ #define
> +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
> #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**< Device supports Rx queue setup after device started*/ #define
> RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --git
> a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h index
> 96c3631..8407230 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> #define RTE_MBUF_DYNFIELD_METADATA_NAME
> "rte_flow_dynfield_metadata"
> #define RTE_MBUF_DYNFLAG_METADATA_NAME
> "rte_flow_dynflag_metadata"
>
> +/**
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices
> +allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic Tx timestamp flag tells whether the field
> +contains
> + * actual timestamp value for the packets being sent, this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> + * introduced and the shared dynamic timestamp field will be used
> + * to handle the timestamps on receiving datapath as well.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> "rte_dynfield_timestamp"
> +
> +/**
> + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag
> set on the
> + * packet being sent it tries to synchronize the time of packet
> +appearing
> + * on the wire with the specified packet timestamp. If the specified
> +one
> + * is in the past it should be ignored, if one is in the distant future
> + * it should be capped with some reasonable value (in range of seconds).
> + *
> + * There is no any packet reordering according to timestamps is
> +supposed,
> + * neither for packet within the burst, nor for the whole bursts, it is
> + * an entirely application responsibility to generate packets and its
> + * timestamps in desired order. The timestamps might be put only in
> + * the first packet in the burst providing the entire burst scheduling.
> + */
> +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
> "rte_dynflag_tx_timestamp"
> +
> #endif
> --
> 1.8.3.1
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling
2020-07-08 15:27 0% ` Morten Brørup
@ 2020-07-08 15:51 0% ` Slava Ovsiienko
0 siblings, 0 replies; 200+ results
From: Slava Ovsiienko @ 2020-07-08 15:51 UTC (permalink / raw)
To: Morten Brørup, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, bernard.iremonger, thomas
Hi, Morten
Addressed most of your comments in the v5 commit message.
Header file comments are close to becoming too wordy,
and I did not dare to elaborate them further.
With best regards, Slava
> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, July 8, 2020 18:27
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com;
> bernard.iremonger@intel.com; thomas@mellanox.net
> Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet
> Txscheduling
>
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava Ovsiienko
> > Sent: Wednesday, July 8, 2020 4:54 PM
> >
> > Hi, Morten
> >
> > Thank you for the comments. Please, see below.
> >
> > > -----Original Message-----
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Wednesday, July 8, 2020 17:16
> > > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > > <rasland@mellanox.com>; olivier.matz@6wind.com;
> > > bernard.iremonger@intel.com; thomas@mellanox.net
> > > Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate
> > packet
> > > Txscheduling
> > >
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> > > > Ovsiienko
> > > > Sent: Tuesday, July 7, 2020 4:57 PM
> > > >
> > > > There is the requirement on some networks for precise traffic
> > timing
> > > > management. The ability to send (and, generally speaking, receive)
> > the
> > > > packets at the very precisely specified moment of time provides
> > > > the opportunity to support the connections with Time Division
> > Multiplexing
> > > > using the contemporary general purpose NIC without involving an
> > > > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > > > interface is one of the promising features for potentially usage
> > > > of the precise time management for the egress packets.
> > > >
> > > > The main objective of this RFC is to specify the way how
> > applications
> > > > can provide the moment of time at what the packet transmission
> > > > must
> > be
> > > > started and to describe in preliminary the supporting this feature
> > > > from
> > > > mlx5 PMD side.
> > > >
> > > > The new dynamic timestamp field is proposed, it provides some
> > timing
> > > > information, the units and time references (initial phase) are not
> > > > explicitly defined but are maintained always the same for a given
> > port.
> > > > Some devices allow to query rte_eth_read_clock() that will return
> > the
> > > > current device timestamp. The dynamic timestamp flag tells whether
> > the
> > > > field contains actual timestamp value. For the packets being sent
> > this
> > > > value can be used by PMD to schedule packet sending.
> > > >
> > > > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> > and
> > > > obsoleting, these dynamic flag and field will be used to manage
> > > > the timestamps on receiving datapath as well.
> > > >
> > > > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > > > sent it tries to synchronize the time of packet appearing on the
> > wire
> > > > with the specified packet timestamp. If the specified one is in
> > > > the past it should be ignored, if one is in the distant future it
> > should
> > > > be capped with some reasonable value (in range of seconds). These
> > > > specific cases ("too late" and "distant future") can be optionally
> > > > reported via device xstats to assist applications to detect the
> > > > time-related problems.
> > > >
> > > > There is no any packet reordering according timestamps is
> > > > supposed, neither within packet burst, nor between packets, it is
> > > > an entirely application responsibility to generate packets and its
> > > > timestamps
> > in
> > > > desired order. The timestamps can be put only in the first packet
> > in
> > > > the burst providing the entire burst scheduling.
> > > >
> > > > PMD reports the ability to synchronize packet sending on timestamp
> > > > with new offload flag:
> > > >
> > > > This is palliative and is going to be replaced with new eth_dev
> > > > API about reporting/managing the supported dynamic flags and its
> > related
> > > > features. This API would break ABI compatibility and can't be
> > > > introduced at the moment, so is postponed to 20.11.
> > > >
> > > > For testing purposes it is proposed to update testpmd "txonly"
> > > > forwarding mode routine. With this update testpmd application
> > > > generates the packets and sets the dynamic timestamps according to
> > > > specified time pattern if it sees the "rte_dynfield_timestamp" is
> > registered.
> > > >
> > > > The new testpmd command is proposed to configure sending pattern:
> > > >
> > > > set tx_times <burst_gap>,<intra_gap>
> > > >
> > > > <intra_gap> - the delay between the packets within the burst
> > > > specified in the device clock units. The number
> > > > of packets in the burst is defined by txburst
> > parameter
> > > >
> > > > <burst_gap> - the delay between the bursts in the device clock
> > units
> > > >
> > > > As the result the bursts of packet will be transmitted with
> > specific
> > > > delays between the packets within the burst and specific delay
> > between
> > > > the bursts. The rte_eth_get_clock is supposed to be engaged to get
> > the
> > > > current device clock value and provide the reference for the
> > > > timestamps.
> > > >
> > > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > > ---
> > > > v1->v4:
> > > > - dedicated dynamic Tx timestamp flag instead of shared with
> > > > Rx
> > >
> > > The detailed description above should be updated to reflect that it
> > is now
> > > two flags.
> > OK
> >
> > >
> > > > - Doxygen-style comment
> > > > - comments update
> > > >
> > > > ---
> > > > lib/librte_ethdev/rte_ethdev.c | 1 +
> > lib/librte_ethdev/rte_ethdev.h
> > > > | 4 ++++ lib/librte_mbuf/rte_mbuf_dyn.h | 31
> > > > +++++++++++++++++++++++++++++++
> > > > 3 files changed, 36 insertions(+)
> > > >
> > > > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > > > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644
> > > > --- a/lib/librte_ethdev/rte_ethdev.c
> > > > +++ b/lib/librte_ethdev/rte_ethdev.c
> > > > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> > > > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > > > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > > > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > > > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > > > };
> > > >
> > > > #undef RTE_TX_OFFLOAD_BIT2STR
> > > > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > > > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644
> > > > --- a/lib/librte_ethdev/rte_ethdev.h
> > > > +++ b/lib/librte_ethdev/rte_ethdev.h
> > > > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > > > /** Device supports outer UDP checksum */ #define
> > > > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
> > > >
> > > > +/** Device supports send on timestamp */ #define
> > > > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > > > +
> > > > +
> > > > #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP
> 0x00000001
> > > /**<
> > > > Device supports Rx queue setup after device started*/ #define
> > > > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --
> git
> > > > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > index 96c3631..7e9f7d2 100644
> > > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > > @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char
> *name,
> > > > #define RTE_MBUF_DYNFIELD_METADATA_NAME
> > > "rte_flow_dynfield_metadata"
> > > > #define RTE_MBUF_DYNFLAG_METADATA_NAME
> > > "rte_flow_dynflag_metadata"
> > > >
> > > > +/**
> > > > + * The timestamp dynamic field provides some timing information,
> > the
> > > > + * units and time references (initial phase) are not explicitly
> > > > defined
> > > > + * but are maintained always the same for a given port. Some
> > devices
> > > > allow4
> > > > + * to query rte_eth_read_clock() that will return the current
> > device
> > > > + * timestamp. The dynamic Tx timestamp flag tells whether the
> > field
> > > > contains
> > > > + * actual timestamp value. For the packets being sent this value
> > can
> > > > be
> > > > + * used by PMD to schedule packet sending.
> > > > + *
> > > > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field
> > deprecation
> > > > + * and obsoleting, the dedicated Rx timestamp flag is supposed to
> > be
> > > > + * introduced and the shared dynamic timestamp field will be used
> > > > + * to handle the timestamps on receiving datapath as well.
> > > > + */
> > > > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> > > "rte_dynfield_timestamp"
> > >
> > > The description above should not say anything about the dynamic TX
> > > timestamp flag.
> > It does not. Or do you mean RX?
> > Not sure, field and flag are tightly coupled, it is nice to mention
> > this relation for better understanding.
> > And mentioning the RX explains why it is not like this:
> > RTE_MBUF_DYNFIELD_[TX]_TIMESTAMP_NAME
>
> Sorry. I misunderstood its purpose!
> It's the name of the field, and the field will not only be used for RX, but in the
> future also for RX.
> (I thought it was the name of the RX flag, reserved for future use.)
>
> >
> > >
> > > Please elaborate "some timing information", e.g. add "... about when
> > the
> > > packet was received".
> >
> > Sorry, I do not follow, currently the dynamic field is not "about
> > when the packet was received". Now it is introduced for Tx only and
> > just the opportunity to be shared with Rx one in coming releases is
> > mentioned. "Some" means - not specified (herein) exactly.
> > And it is elaborated what Is not specified and how it is supposed to
> > use Tx timestamp.
>
> It should be described when it is valid, and how it is being used, e.g. by
> adding a reference to the "rte_dynflag_tx_timestamp" flag.
>
> > >
> > > > +
> > > > +/**
> > > > + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
> flag
> > > set on
> > > > the
> > > > + * packet being sent it tries to synchronize the time of packet
> > > > appearing
> > > > + * on the wire with the specified packet timestamp. If the
> > specified
> > > > one
> > > > + * is in the past it should be ignored, if one is in the distant
> > > > future
> > > > + * it should be capped with some reasonable value (in range of
> > > > seconds).
> > > > + *
> > > > + * There is no any packet reordering according to timestamps is
> > > > supposed,
> > > > + * neither for packet within the burst, nor for the whole bursts,
> > it
> > > > is
> > > > + * an entirely application responsibility to generate packets and
> > its
> > > > + * timestamps in desired order. The timestamps might be put only
> > in
> > > > + * the first packet in the burst providing the entire burst
> > > > scheduling.
> > > > + */
> > > > +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
> > > "rte_dynflag_tx_timestamp"
> > > > +
> > > > #endif
> > > > --
> > > > 1.8.3.1
> > > >
> > >
> > > It may be worth adding some documentation about how the clocks of
> > > the NICs are out of sync with the clock of the CPU, and are all
> > > drifting
> > relatively.
> > >
> > > And those clocks are also out of sync with the actual time (NTP
> > clock).
> >
> > IMO, It is out of scope of this very generic patch. As for mlx NICs -
> > the internal device clock might be (or might be not) synchronized with
> > PTP, it can provide timestamps in real nanoseconds in various formats
> > or just some free running counter.
>
> Cool!
>
> > On some systems the NIC and CPU might share the same clock source (for
> > their PLL inputs for example) and there will be no any drifts. As we
> > can see - it is a wide and interesting topic to discuss, but, IMO, the
> > comment in header file might be not the most relevant place to do. As
> > for mlx5 devices clock specifics - it will be documented in PMD
> > chapter.
> >
> > OK, will add few generic words, the few ones - in order not to make
> > comment wordy, just point the direction for further thinking.
>
> I agree - we don't want cookbooks in the header files. Only enough
> description to avoid the worst misunderstandings.
>
> >
> > >
> > > Preferably, some sort of cookbook for handling this should be
> > provided.
> > > PCAP could be used as an example.
> > >
> > testpmd example is included in series, mlx5 PMD patch is prepared and
> > coming soon.
>
> Great.
>
> And I suppose that the more detailed cookbook/example - regarding offset
> and drift of various clocks - is probably more relevant for the RX side (for
> various PCAP applications), and thus completely unrelated to this patch.
>
> >
> > With best regards, Slava
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling
` (3 preceding siblings ...)
2020-07-07 14:57 2% ` [dpdk-dev] [PATCH v4 " Viacheslav Ovsiienko
@ 2020-07-08 15:47 2% ` Viacheslav Ovsiienko
2020-07-08 16:05 0% ` Slava Ovsiienko
2020-07-09 12:36 2% ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
2020-07-10 12:39 2% ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
6 siblings, 1 reply; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-08 15:47 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas, mb
There is a requirement on some networks for precise traffic timing
management. The ability to send (and, generally speaking, receive)
packets at a precisely specified moment in time makes it possible
to support connections with Time Division Multiplexing using a
contemporary general-purpose NIC without involving auxiliary
hardware. For example, support for the O-RAN Fronthaul interface is
one promising use case for precise time management of egress
packets.
The main objective of this RFC is to specify how applications can
provide the moment in time at which packet transmission must start,
and to describe in preliminary form how this feature is supported
on the mlx5 PMD side.
A new dynamic timestamp field is proposed. It provides some timing
information; the units and time reference (initial phase) are not
explicitly defined but are always kept the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns
the current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For packets being sent,
this value can be used by the PMD to schedule packet sending.
The device clock is an opaque entity; the units and frequency are
vendor specific and might depend on hardware capabilities and
configuration. It might (or might not) be synchronized with real time
via PTP, and might (or might not) be synchronous with the CPU clock
(for example, if the NIC and CPU share the same clock source there
might be no drift between the NIC and CPU clocks), etc.
Once the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
deprecated and obsoleted, this dynamic flag and field will be used
to manage timestamps on the receive datapath as well. Having
dedicated flags for Rx/Tx timestamps allows applications not to
perform an explicit flag reset on forwarding and not to promote
received timestamps to the transmit datapath by default. The static
PKT_RX_TIMESTAMP is considered a candidate to become the dynamic flag.
When the PMD sees "rte_dynfield_timestamp" set on a packet being sent,
it tries to synchronize the time the packet appears on the wire with
the specified packet timestamp. If the specified timestamp is in the
past it should be ignored; if it is in the distant future it should be
capped at some reasonable value (in the range of seconds). These
specific cases ("too late" and "distant future") can optionally be
reported via device xstats to assist applications in detecting
time-related problems.
No packet reordering according to timestamps is assumed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first
packet of a burst, providing scheduling for the entire burst.
The PMD reports the ability to synchronize packet sending on a
timestamp with a new offload flag:
This is a palliative and is going to be replaced with a new eth_dev
API for reporting/managing the supported dynamic flags and their
related features. That API would break ABI compatibility and can't be
introduced at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
The new testpmd command is proposed to configure sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, bursts of packets will be transmitted with specific
delays between the packets within a burst and a specific delay between
the bursts. rte_eth_read_clock() is supposed to be used to get the
current device clock value and provide the reference for the
timestamps.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v1->v4:
- dedicated dynamic Tx timestamp flag instead of shared with Rx
v4->v5:
- elaborated commit message
- more words about device clocks added,
- note about dedicated Rx/Tx timestamp flags added
---
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
3 files changed, 36 insertions(+)
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6f..02157d5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242b..6f6454c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..8407230 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information; the
+ * units and time references (initial phase) are not explicitly defined
+ * but are always kept the same for a given port. Some devices allow
+ * querying rte_eth_read_clock(), which returns the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field contains
+ * an actual timestamp value for the packets being sent; this value can be
+ * used by the PMD to schedule packet sending.
+ *
+ * Once the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
+ * deprecated and obsoleted, a dedicated Rx timestamp flag is expected
+ * to be introduced, and the shared dynamic timestamp field will be used
+ * to handle the timestamps on the receive datapath as well.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+
+/**
+ * When the PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on
+ * a packet being sent, it tries to synchronize the packet's appearance
+ * on the wire with the specified packet timestamp. If the specified
+ * timestamp is in the past it should be ignored; if it is in the
+ * distant future it should be capped to some reasonable value (seconds).
+ *
+ * No packet reordering according to timestamps is assumed, neither
+ * for packets within a burst nor across bursts; it is entirely the
+ * application's responsibility to generate packets and their
+ * timestamps in the desired order. A timestamp might be put only in
+ * the first packet of a burst to schedule the entire burst.
+ */
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
* Re: [dpdk-dev] [PATCH v7 1/3] lib/lpm: integrate RCU QSBR
2020-07-08 14:30 2% ` David Marchand
@ 2020-07-08 15:34 5% ` Ruifeng Wang
0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-08 15:34 UTC (permalink / raw)
To: David Marchand
Cc: Bruce Richardson, Vladimir Medvedkin, John McNamara,
Marko Kovacevic, Ray Kinsella, Neil Horman, dev, Ananyev,
Konstantin, Honnappa Nagarahalli, nd, nd
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Wednesday, July 8, 2020 10:30 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: Bruce Richardson <bruce.richardson@intel.com>; Vladimir Medvedkin
> <vladimir.medvedkin@intel.com>; John McNamara
> <john.mcnamara@intel.com>; Marko Kovacevic
> <marko.kovacevic@intel.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman
> <nhorman@tuxdriver.com>; dev <dev@dpdk.org>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v7 1/3] lib/lpm: integrate RCU QSBR
>
> On Tue, Jul 7, 2020 at 5:16 PM Ruifeng Wang <ruifeng.wang@arm.com>
> wrote:
> > diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h index
> > b9d49ac87..7889f21b3 100644
> > --- a/lib/librte_lpm/rte_lpm.h
> > +++ b/lib/librte_lpm/rte_lpm.h
> > @@ -1,5 +1,6 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2020 Arm Limited
> > */
> >
> > #ifndef _RTE_LPM_H_
> > @@ -20,6 +21,7 @@
> > #include <rte_memory.h>
> > #include <rte_common.h>
> > #include <rte_vect.h>
> > +#include <rte_rcu_qsbr.h>
> >
> > #ifdef __cplusplus
> > extern "C" {
> > @@ -62,6 +64,17 @@ extern "C" {
> > /** Bitmask used to indicate successful lookup */
> > #define RTE_LPM_LOOKUP_SUCCESS 0x01000000
> >
> > +/** @internal Default RCU defer queue entries to reclaim in one go. */
> > +#define RTE_LPM_RCU_DQ_RECLAIM_MAX 16
> > +
> > +/** RCU reclamation modes */
> > +enum rte_lpm_qsbr_mode {
> > + /** Create defer queue for reclaim. */
> > + RTE_LPM_QSBR_MODE_DQ = 0,
> > + /** Use blocking mode reclaim. No defer queue created. */
> > + RTE_LPM_QSBR_MODE_SYNC
> > +};
> > +
> > #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > /** @internal Tbl24 entry structure. */ __extension__ @@ -130,6
> > +143,28 @@ struct rte_lpm {
> > __rte_cache_aligned; /**< LPM tbl24 table. */
> > struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
> > struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
> > +#ifdef ALLOW_EXPERIMENTAL_API
> > + /* RCU config. */
> > + struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
> > + enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
> > + struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
> > +#endif
> > +};
>
> I can see failures in travis reports for v7 and v6.
> I reproduced them in my env.
>
> 1 function with some indirect sub-type change:
>
> [C]'function int rte_lpm_add(rte_lpm*, uint32_t, uint8_t, uint32_t)'
> at rte_lpm.c:764:1 has some indirect sub-type changes:
> parameter 1 of type 'rte_lpm*' has sub-type changes:
> in pointed to type 'struct rte_lpm' at rte_lpm.h:134:1:
> type size hasn't changed
> 3 data member insertions:
> 'rte_rcu_qsbr* rte_lpm::v', at offset 536873600 (in bits) at
> rte_lpm.h:148:1
> 'rte_lpm_qsbr_mode rte_lpm::rcu_mode', at offset 536873664 (in bits)
> at rte_lpm.h:149:1
> 'rte_rcu_qsbr_dq* rte_lpm::dq', at offset 536873728 (in
> bits) at rte_lpm.h:150:1
>
Sorry, I thought that if the fields were guarded by ALLOW_EXPERIMENTAL_API, the ABI would be kept when experimental APIs were not allowed by the user.
ABI stability and ALLOW_EXPERIMENTAL_API are two different things.
>
> Going back to my proposal of hiding what does not need to be seen.
>
> Disclaimer, *this is quick & dirty* but it builds and passes ABI check:
>
> $ git diff
> diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
> d498ba761..7109aef6a 100644
> --- a/lib/librte_lpm/rte_lpm.c
> +++ b/lib/librte_lpm/rte_lpm.c
I understand your proposal in v5 now. A new data structure encloses rte_lpm together with the new members needed for RCU use.
In this way, the rte_lpm ABI is kept. And we can move out other members of rte_lpm that need not be exposed in the 20.11 release.
I will fix the ABI issue in next version.
> @@ -115,6 +115,15 @@ rte_lpm_find_existing(const char *name)
> return l;
> }
>
> +struct internal_lpm {
> + /* Public object */
> + struct rte_lpm lpm;
> + /* RCU config. */
> + struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
> + enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
> + struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
> +};
> +
> /*
> * Allocates memory for LPM object
> */
> @@ -123,6 +132,7 @@ rte_lpm_create(const char *name, int socket_id,
> const struct rte_lpm_config *config) {
> char mem_name[RTE_LPM_NAMESIZE];
> + struct internal_lpm *internal = NULL;
> struct rte_lpm *lpm = NULL;
> struct rte_tailq_entry *te;
> uint32_t mem_size, rules_size, tbl8s_size; @@ -141,12 +151,6 @@
> rte_lpm_create(const char *name, int socket_id,
>
> snprintf(mem_name, sizeof(mem_name), "LPM_%s", name);
>
> - /* Determine the amount of memory to allocate. */
> - mem_size = sizeof(*lpm);
> - rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
> - tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
> - RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config-
> >number_tbl8s);
> -
> rte_mcfg_tailq_write_lock();
>
> /* guarantee there's no existing */ @@ -170,16 +174,23 @@
> rte_lpm_create(const char *name, int socket_id,
> goto exit;
> }
>
> + /* Determine the amount of memory to allocate. */
> + mem_size = sizeof(*internal);
> + rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
> + tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
> + RTE_LPM_TBL8_GROUP_NUM_ENTRIES *
> + config->number_tbl8s);
> +
> /* Allocate memory to store the LPM data structures. */
> - lpm = rte_zmalloc_socket(mem_name, mem_size,
> + internal = rte_zmalloc_socket(mem_name, mem_size,
> RTE_CACHE_LINE_SIZE, socket_id);
> - if (lpm == NULL) {
> + if (internal == NULL) {
> RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
> rte_free(te);
> rte_errno = ENOMEM;
> goto exit;
> }
>
> + lpm = &internal->lpm;
> lpm->rules_tbl = rte_zmalloc_socket(NULL,
> (size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
>
> @@ -226,6 +237,7 @@ rte_lpm_create(const char *name, int socket_id,
> void rte_lpm_free(struct rte_lpm *lpm) {
> + struct internal_lpm *internal;
> struct rte_lpm_list *lpm_list;
> struct rte_tailq_entry *te;
>
> @@ -247,8 +259,9 @@ rte_lpm_free(struct rte_lpm *lpm)
>
> rte_mcfg_tailq_write_unlock();
>
> - if (lpm->dq)
> - rte_rcu_qsbr_dq_delete(lpm->dq);
> + internal = container_of(lpm, struct internal_lpm, lpm);
> + if (internal->dq != NULL)
> + rte_rcu_qsbr_dq_delete(internal->dq);
> rte_free(lpm->tbl8);
> rte_free(lpm->rules_tbl);
> rte_free(lpm);
> @@ -276,13 +289,15 @@ rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct
> rte_lpm_rcu_config *cfg, {
> char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
> struct rte_rcu_qsbr_dq_parameters params = {0};
> + struct internal_lpm *internal;
>
> - if ((lpm == NULL) || (cfg == NULL)) {
> + if (lpm == NULL || cfg == NULL) {
> rte_errno = EINVAL;
> return 1;
> }
>
> - if (lpm->v) {
> + internal = container_of(lpm, struct internal_lpm, lpm);
> + if (internal->v != NULL) {
> rte_errno = EEXIST;
> return 1;
> }
> @@ -305,20 +320,19 @@ rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct
> rte_lpm_rcu_config *cfg,
> params.free_fn = __lpm_rcu_qsbr_free_resource;
> params.p = lpm;
> params.v = cfg->v;
> - lpm->dq = rte_rcu_qsbr_dq_create(&params);
> - if (lpm->dq == NULL) {
> - RTE_LOG(ERR, LPM,
> - "LPM QS defer queue creation failed\n");
> + internal->dq = rte_rcu_qsbr_dq_create(&params);
> + if (internal->dq == NULL) {
> + RTE_LOG(ERR, LPM, "LPM QS defer queue creation
> failed\n");
> return 1;
> }
> if (dq)
> - *dq = lpm->dq;
> + *dq = internal->dq;
> } else {
> rte_errno = EINVAL;
> return 1;
> }
> - lpm->rcu_mode = cfg->mode;
> - lpm->v = cfg->v;
> + internal->rcu_mode = cfg->mode;
> + internal->v = cfg->v;
>
> return 0;
> }
> @@ -502,12 +516,13 @@ _tbl8_alloc(struct rte_lpm *lpm) static int32_t
> tbl8_alloc(struct rte_lpm *lpm) {
> + struct internal_lpm *internal = container_of(lpm, struct
> internal_lpm, lpm);
> int32_t group_idx; /* tbl8 group index. */
>
> group_idx = _tbl8_alloc(lpm);
> - if ((group_idx == -ENOSPC) && (lpm->dq != NULL)) {
> + if (group_idx == -ENOSPC && internal->dq != NULL) {
> /* If there are no tbl8 groups try to reclaim one. */
> - if (rte_rcu_qsbr_dq_reclaim(lpm->dq, 1, NULL, NULL, NULL) == 0)
> + if (rte_rcu_qsbr_dq_reclaim(internal->dq, 1, NULL,
> NULL, NULL) == 0)
> group_idx = _tbl8_alloc(lpm);
> }
>
> @@ -518,20 +533,21 @@ static void
> tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start) {
> struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
> + struct internal_lpm *internal = container_of(lpm, struct
> internal_lpm, lpm);
>
> - if (!lpm->v) {
> + if (internal->v == NULL) {
> /* Set tbl8 group invalid*/
> __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
> __ATOMIC_RELAXED);
> - } else if (lpm->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
> + } else if (internal->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
> /* Wait for quiescent state change. */
> - rte_rcu_qsbr_synchronize(lpm->v, RTE_QSBR_THRID_INVALID);
> + rte_rcu_qsbr_synchronize(internal->v,
> + RTE_QSBR_THRID_INVALID);
> /* Set tbl8 group invalid*/
> __atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
> __ATOMIC_RELAXED);
> - } else if (lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
> + } else if (internal->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
> /* Push into QSBR defer queue. */
> - rte_rcu_qsbr_dq_enqueue(lpm->dq, (void *)&tbl8_group_start);
> + rte_rcu_qsbr_dq_enqueue(internal->dq, (void
> *)&tbl8_group_start);
> }
> }
>
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h index
> 7889f21b3..a9568fcdd 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -143,12 +143,6 @@ struct rte_lpm {
> __rte_cache_aligned; /**< LPM tbl24 table. */
> struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
> struct rte_lpm_rule *rules_tbl; /**< LPM rules. */ -#ifdef
> ALLOW_EXPERIMENTAL_API
> - /* RCU config. */
> - struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
> - enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
> - struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
> -#endif
> };
>
> /** LPM RCU QSBR configuration structure. */
>
>
>
>
> --
> David Marchand
^ permalink raw reply [relevance 5%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-08 14:54 0% ` Slava Ovsiienko
@ 2020-07-08 15:27 0% ` Morten Brørup
2020-07-08 15:51 0% ` Slava Ovsiienko
0 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2020-07-08 15:27 UTC (permalink / raw)
To: Slava Ovsiienko, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, bernard.iremonger, thomas
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Slava Ovsiienko
> Sent: Wednesday, July 8, 2020 4:54 PM
>
> Hi, Morten
>
> Thank you for the comments. Please, see below.
>
> > -----Original Message-----
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, July 8, 2020 17:16
> > To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> > Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> > <rasland@mellanox.com>; olivier.matz@6wind.com;
> > bernard.iremonger@intel.com; thomas@mellanox.net
> > Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate
> packet
> > Txscheduling
> >
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> > > Ovsiienko
> > > Sent: Tuesday, July 7, 2020 4:57 PM
> > >
> > > There is the requirement on some networks for precise traffic
> timing
> > > management. The ability to send (and, generally speaking, receive)
> the
> > > packets at the very precisely specified moment of time provides the
> > > opportunity to support the connections with Time Division
> Multiplexing
> > > using the contemporary general purpose NIC without involving an
> > > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > > interface is one of the promising features for potentially usage of
> > > the precise time management for the egress packets.
> > >
> > > The main objective of this RFC is to specify the way how
> applications
> > > can provide the moment of time at what the packet transmission must
> be
> > > started and to describe in preliminary the supporting this feature
> > > from
> > > mlx5 PMD side.
> > >
> > > The new dynamic timestamp field is proposed, it provides some
> timing
> > > information, the units and time references (initial phase) are not
> > > explicitly defined but are maintained always the same for a given
> port.
> > > Some devices allow to query rte_eth_read_clock() that will return
> the
> > > current device timestamp. The dynamic timestamp flag tells whether
> the
> > > field contains actual timestamp value. For the packets being sent
> this
> > > value can be used by PMD to schedule packet sending.
> > >
> > > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and
> > > obsoleting, these dynamic flag and field will be used to manage the
> > > timestamps on receiving datapath as well.
> > >
> > > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > > sent it tries to synchronize the time of packet appearing on the
> wire
> > > with the specified packet timestamp. If the specified one is in the
> > > past it should be ignored, if one is in the distant future it
> should
> > > be capped with some reasonable value (in range of seconds). These
> > > specific cases ("too late" and "distant future") can be optionally
> > > reported via device xstats to assist applications to detect the
> > > time-related problems.
> > >
> > > There is no any packet reordering according timestamps is supposed,
> > > neither within packet burst, nor between packets, it is an entirely
> > > application responsibility to generate packets and its timestamps
> in
> > > desired order. The timestamps can be put only in the first packet
> in
> > > the burst providing the entire burst scheduling.
> > >
> > > PMD reports the ability to synchronize packet sending on timestamp
> > > with new offload flag:
> > >
> > > This is palliative and is going to be replaced with new eth_dev API
> > > about reporting/managing the supported dynamic flags and its
> related
> > > features. This API would break ABI compatibility and can't be
> > > introduced at the moment, so is postponed to 20.11.
> > >
> > > For testing purposes it is proposed to update testpmd "txonly"
> > > forwarding mode routine. With this update testpmd application
> > > generates the packets and sets the dynamic timestamps according to
> > > specified time pattern if it sees the "rte_dynfield_timestamp" is
> registered.
> > >
> > > The new testpmd command is proposed to configure sending pattern:
> > >
> > > set tx_times <burst_gap>,<intra_gap>
> > >
> > > <intra_gap> - the delay between the packets within the burst
> > > specified in the device clock units. The number
> > > of packets in the burst is defined by txburst
> parameter
> > >
> > > <burst_gap> - the delay between the bursts in the device clock
> units
> > >
> > > As the result the bursts of packet will be transmitted with
> specific
> > > delays between the packets within the burst and specific delay
> between
> > > the bursts. The rte_eth_get_clock is supposed to be engaged to get
> the
> > > current device clock value and provide the reference for the
> > > timestamps.
> > >
> > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > > ---
> > > v1->v4:
> > > - dedicated dynamic Tx timestamp flag instead of shared with Rx
> >
> > The detailed description above should be updated to reflect that it
> is now
> > two flags.
> OK
>
> >
> > > - Doxygen-style comment
> > > - comments update
> > >
> > > ---
> > > lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h
> > > | 4 ++++ lib/librte_mbuf/rte_mbuf_dyn.h | 31
> > > +++++++++++++++++++++++++++++++
> > > 3 files changed, 36 insertions(+)
> > >
> > > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.c
> > > +++ b/lib/librte_ethdev/rte_ethdev.c
> > > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> > > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > > };
> > >
> > > #undef RTE_TX_OFFLOAD_BIT2STR
> > > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644
> > > --- a/lib/librte_ethdev/rte_ethdev.h
> > > +++ b/lib/librte_ethdev/rte_ethdev.h
> > > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > > /** Device supports outer UDP checksum */ #define
> > > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
> > >
> > > +/** Device supports send on timestamp */ #define
> > > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > > +
> > > +
> > > #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> > /**<
> > > Device supports Rx queue setup after device started*/ #define
> > > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --git
> > > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > index 96c3631..7e9f7d2 100644
> > > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > > @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > > #define RTE_MBUF_DYNFIELD_METADATA_NAME
> > "rte_flow_dynfield_metadata"
> > > #define RTE_MBUF_DYNFLAG_METADATA_NAME
> > "rte_flow_dynflag_metadata"
> > >
> > > +/**
> > > + * The timestamp dynamic field provides some timing information,
> the
> > > + * units and time references (initial phase) are not explicitly
> > > defined
> > > + * but are maintained always the same for a given port. Some
> devices
> > > allow
> > > + * to query rte_eth_read_clock() that will return the current
> device
> > > + * timestamp. The dynamic Tx timestamp flag tells whether the
> field
> > > contains
> > > + * actual timestamp value. For the packets being sent this value
> can
> > > be
> > > + * used by PMD to schedule packet sending.
> > > + *
> > > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field
> deprecation
> > > + * and obsoleting, the dedicated Rx timestamp flag is supposed to
> be
> > > + * introduced and the shared dynamic timestamp field will be used
> > > + * to handle the timestamps on receiving datapath as well.
> > > + */
> > > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> > "rte_dynfield_timestamp"
> >
> > The description above should not say anything about the dynamic TX
> > timestamp flag.
> It does not. Or do you mean RX?
> Not sure, field and flag are tightly coupled,
> it is nice to mention this relation for better understanding.
> And mentioning the RX explains why it is not like this:
> RTE_MBUF_DYNFIELD_[TX]_TIMESTAMP_NAME
Sorry. I misunderstood its purpose!
It's the name of the field, and the field will not only be used for TX, but in the future also for RX.
(I thought it was the name of the RX flag, reserved for future use.)
>
> >
> > Please elaborate "some timing information", e.g. add "... about when
> the
> > packet was received".
>
> Sorry, I do not follow, currently the dynamic field is not
> "about when the packet was received". Now it is introduced for Tx
> only and just the opportunity to be shared with Rx one in coming
> releases
> is mentioned. "Some" means - not specified (herein) exactly.
> And it is elaborated what Is not specified and how it is supposed
> to use Tx timestamp.
It should be described when it is valid, and how it is being used, e.g. by adding a reference to the "rte_dynflag_tx_timestamp" flag.
> >
> > > +
> > > +/**
> > > + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag
> > set on
> > > the
> > > + * packet being sent it tries to synchronize the time of packet
> > > appearing
> > > + * on the wire with the specified packet timestamp. If the
> specified
> > > one
> > > + * is in the past it should be ignored, if one is in the distant
> > > future
> > > + * it should be capped with some reasonable value (in range of
> > > seconds).
> > > + *
> > > + * There is no any packet reordering according to timestamps is
> > > supposed,
> > > + * neither for packet within the burst, nor for the whole bursts,
> it
> > > is
> > > + * an entirely application responsibility to generate packets and
> its
> > > + * timestamps in desired order. The timestamps might be put only
> in
> > > + * the first packet in the burst providing the entire burst
> > > scheduling.
> > > + */
> > > +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
> > "rte_dynflag_tx_timestamp"
> > > +
> > > #endif
> > > --
> > > 1.8.3.1
> > >
> >
> > It may be worth adding some documentation about how the clocks of the
> > NICs are out of sync with the clock of the CPU, and are all drifting
> relatively.
> >
> > And those clocks are also out of sync with the actual time (NTP
> clock).
>
> IMO, It is out of scope of this very generic patch. As for mlx NICs -
> the internal device
> clock might be (or might be not) synchronized with PTP, it can provide
> timestamps
> in real nanoseconds in various formats or just some free running
> counter.
Cool!
> On some systems the NIC and CPU might share the same clock source (for
> their PLL inputs,
> for example) and there will be no drift at all. As we can see, it is a
> wide and interesting
> topic to discuss, but, IMO, a comment in the header file might not be
> the most relevant
> place to do so. As for the mlx5 device clock specifics - they will be
> documented in the PMD chapter.
>
> OK, will add few generic words, the few ones - in order not to make
> comment wordy, just
> point the direction for further thinking.
I agree - we don't want cookbooks in the header files. Only enough description to avoid the worst misunderstandings.
>
> >
> > Preferably, some sort of cookbook for handling this should be
> provided.
> > PCAP could be used as an example.
> >
> testpmd example is included in series, mlx5 PMD patch is prepared and
> coming soon.
Great.
And I suppose that the more detailed cookbook/example - regarding offset and drift of various clocks - is probably more relevant for the RX side (for various PCAP applications), and thus completely unrelated to this patch.
>
> With best regards, Slava
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status
2020-07-08 12:29 3% ` David Marchand
2020-07-08 13:43 0% ` Aaron Conole
@ 2020-07-08 15:04 0% ` Kinsella, Ray
2020-07-09 5:21 0% ` Phil Yang
1 sibling, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-08 15:04 UTC (permalink / raw)
To: David Marchand, Phil Yang, Aaron Conole
Cc: dev, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli, Neil Horman, Harman Kalra
On 08/07/2020 13:29, David Marchand wrote:
> On Thu, Jun 11, 2020 at 12:25 PM Phil Yang <phil.yang@arm.com> wrote:
>>
>> The event status is defined as a volatile variable and shared
>> between threads. Use c11 atomics with explicit ordering instead
>> of rte_atomic ops which enforce unnecessary barriers on aarch64.
>>
>> Signed-off-by: Phil Yang <phil.yang@arm.com>
>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>> ---
>> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
>> lib/librte_eal/linux/eal_interrupts.c | 47 ++++++++++++++++++++---------
>> 2 files changed, 34 insertions(+), 15 deletions(-)
>>
>> diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
>> index 773a34a..b1e8a29 100644
>> --- a/lib/librte_eal/include/rte_eal_interrupts.h
>> +++ b/lib/librte_eal/include/rte_eal_interrupts.h
>> @@ -59,7 +59,7 @@ enum {
>>
>> /** interrupt epoll event obj, taken by epoll_event.ptr */
>> struct rte_epoll_event {
>> - volatile uint32_t status; /**< OUT: event status */
>> + uint32_t status; /**< OUT: event status */
>> int fd; /**< OUT: event fd */
>> int epfd; /**< OUT: epoll instance the ev associated with */
>> struct rte_epoll_data epdata;
>
> I got a reject from the ABI check in my env.
>
> 1 function with some indirect sub-type change:
>
> [C]'function int rte_pci_ioport_map(rte_pci_device*, int,
> rte_pci_ioport*)' at pci.c:756:1 has some indirect sub-type changes:
> parameter 1 of type 'rte_pci_device*' has sub-type changes:
> in pointed to type 'struct rte_pci_device' at rte_bus_pci.h:57:1:
> type size hasn't changed
> 1 data member changes (2 filtered):
> type of 'rte_intr_handle rte_pci_device::intr_handle' changed:
> type size hasn't changed
> 1 data member change:
> type of 'rte_epoll_event rte_intr_handle::elist[512]' changed:
> array element type 'struct rte_epoll_event' changed:
> type size hasn't changed
> 1 data member change:
> type of 'volatile uint32_t rte_epoll_event::status' changed:
> entity changed from 'volatile uint32_t' to 'typedef
> uint32_t' at stdint-uintn.h:26:1
> type size hasn't changed
>
> type size hasn't changed
>
>
> This is probably harmless in our case (going from volatile to non
> volatile), but it won't pass the check in the CI without an exception
> rule.
>
> Note: checking on the test-report ml, I saw nothing, but ovsrobot did
> catch the issue with this change too, Aaron?
>
>
Agreed, probably harmless and requires something in libabigail.ignore.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter
2020-07-08 13:30 4% ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
@ 2020-07-08 15:01 0% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-08 15:01 UTC (permalink / raw)
To: Phil Yang, Jerin Jacob
Cc: dev, Erik Gabriel Carrillo, Honnappa Nagarahalli,
David Christensen, Ruifeng Wang (Arm Technology China),
Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
dodji, dpdk stable, Jerin Jacob
08/07/2020 15:30, Jerin Jacob:
> On Tue, Jul 7, 2020 at 9:25 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > The n_poll_lcores counter and poll_lcore array are shared between lcores
> > and the update of these variables are out of the protection of spinlock
> > on each lcore timer list. The read-modify-write operations of the counter
> > are not atomic, so it has the potential of race condition between lcores.
> >
> > Use c11 atomics with RELAXED ordering to prevent confliction.
> >
> > Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> > Cc: erik.g.carrillo@intel.com
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
>
> Hi Thomas,
>
> The latest version does not have ABI breakage issue.
>
> I have added the ABI verifier in my local patch verification setup.
>
> Series applied to dpdk-next-eventdev/master.
>
> Please pull this series from dpdk-next-eventdev/master. Thanks.
>
> I am marking this patch series as "Awaiting Upstream" in patchwork
> status to reflect the actual status.
OK, pulled and marked as Accepted in patchwork.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-08 14:16 0% ` [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling Morten Brørup
@ 2020-07-08 14:54 0% ` Slava Ovsiienko
2020-07-08 15:27 0% ` Morten Brørup
0 siblings, 1 reply; 200+ results
From: Slava Ovsiienko @ 2020-07-08 14:54 UTC (permalink / raw)
To: Morten Brørup, dev
Cc: Matan Azrad, Raslan Darawsheh, olivier.matz, bernard.iremonger, thomas
Hi, Morten
Thank you for the comments. Please, see below.
> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, July 8, 2020 17:16
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; olivier.matz@6wind.com;
> bernard.iremonger@intel.com; thomas@mellanox.net
> Subject: RE: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet
> Txscheduling
>
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> > Ovsiienko
> > Sent: Tuesday, July 7, 2020 4:57 PM
> >
> > There is the requirement on some networks for precise traffic timing
> > management. The ability to send (and, generally speaking, receive) the
> > packets at the very precisely specified moment of time provides the
> > opportunity to support the connections with Time Division Multiplexing
> > using the contemporary general purpose NIC without involving an
> > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > interface is one of the promising features for potentially usage of
> > the precise time management for the egress packets.
> >
> > The main objective of this RFC is to specify the way how applications
> > can provide the moment of time at what the packet transmission must be
> > started and to describe in preliminary the supporting this feature
> > from
> > mlx5 PMD side.
> >
> > The new dynamic timestamp field is proposed, it provides some timing
> > information, the units and time references (initial phase) are not
> > explicitly defined but are maintained always the same for a given port.
> > Some devices allow to query rte_eth_read_clock() that will return the
> > current device timestamp. The dynamic timestamp flag tells whether the
> > field contains actual timestamp value. For the packets being sent this
> > value can be used by PMD to schedule packet sending.
> >
> > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> > obsoleting, these dynamic flag and field will be used to manage the
> > timestamps on receiving datapath as well.
> >
> > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > sent it tries to synchronize the time of packet appearing on the wire
> > with the specified packet timestamp. If the specified one is in the
> > past it should be ignored, if one is in the distant future it should
> > be capped with some reasonable value (in range of seconds). These
> > specific cases ("too late" and "distant future") can be optionally
> > reported via device xstats to assist applications to detect the
> > time-related problems.
> >
> > No packet reordering according to timestamps is supposed, neither
> > within a packet burst, nor between packets; it is entirely the
> > application's responsibility to generate packets and their timestamps
> > in the desired order. The timestamps can be put only in the first
> > packet in the burst, providing the entire burst scheduling.
> >
> > PMD reports the ability to synchronize packet sending on timestamp
> > with new offload flag:
> >
> > This is palliative and is going to be replaced with new eth_dev API
> > about reporting/managing the supported dynamic flags and its related
> > features. This API would break ABI compatibility and can't be
> > introduced at the moment, so is postponed to 20.11.
> >
> > For testing purposes it is proposed to update testpmd "txonly"
> > forwarding mode routine. With this update testpmd application
> > generates the packets and sets the dynamic timestamps according to
> > specified time pattern if it sees the "rte_dynfield_timestamp" is registered.
> >
> > The new testpmd command is proposed to configure sending pattern:
> >
> > set tx_times <burst_gap>,<intra_gap>
> >
> > <intra_gap> - the delay between the packets within the burst
> > specified in the device clock units. The number
> > of packets in the burst is defined by txburst parameter
> >
> > <burst_gap> - the delay between the bursts in the device clock units
> >
> > As the result the bursts of packet will be transmitted with specific
> > delays between the packets within the burst and specific delay between
> > the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> > current device clock value and provide the reference for the
> > timestamps.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> > v1->v4:
> > - dedicated dynamic Tx timestamp flag instead of shared with Rx
>
> The detailed description above should be updated to reflect that it is now
> two flags.
OK
>
> > - Doxygen-style comment
> > - comments update
> >
> > ---
> > lib/librte_ethdev/rte_ethdev.c | 1 +
> > lib/librte_ethdev/rte_ethdev.h | 4 ++++
> > lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
> > 3 files changed, 36 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> > index 8e10a6f..02157d5 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > };
> >
> > #undef RTE_TX_OFFLOAD_BIT2STR
> > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> > index a49242b..6f6454c 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > /** Device supports outer UDP checksum */
> > #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
> >
> > +/** Device supports send on timestamp */
> > +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > +
> > +
> > #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> > /**< Device supports Rx queue setup after device started*/
> > #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> > diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > index 96c3631..7e9f7d2 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> > #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
> >
> > +/**
> > + * The timestamp dynamic field provides some timing information, the
> > + * units and time references (initial phase) are not explicitly defined
> > + * but are maintained always the same for a given port. Some devices allow
> > + * to query rte_eth_read_clock() that will return the current device
> > + * timestamp. The dynamic Tx timestamp flag tells whether the field contains
> > + * actual timestamp value. For the packets being sent this value can be
> > + * used by PMD to schedule packet sending.
> > + *
> > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> > + * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> > + * introduced and the shared dynamic timestamp field will be used
> > + * to handle the timestamps on receiving datapath as well.
> > + */
> > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
>
> The description above should not say anything about the dynamic TX
> timestamp flag.
It does not. Or do you mean RX?
Not sure, field and flag are tightly coupled,
it is nice to mention this relation for better understanding.
And mentioning the RX explains why it is not like this:
RTE_MBUF_DYNFIELD_[TX]_TIMESTAMP_NAME
>
> Please elaborate "some timing information", e.g. add "... about when the
> packet was received".
Sorry, I do not follow, currently the dynamic field is not
"about when the packet was received". Now it is introduced for Tx
only and just the opportunity to be shared with Rx one in coming releases
is mentioned. "Some" means - not specified (herein) exactly.
And it is elaborated what is not specified and how the Tx timestamp
is supposed to be used.
>
> > +
> > +/**
> > + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
> > + * packet being sent it tries to synchronize the time of packet appearing
> > + * on the wire with the specified packet timestamp. If the specified one
> > + * is in the past it should be ignored, if one is in the distant future
> > + * it should be capped with some reasonable value (in range of seconds).
> > + *
> > + * There is no any packet reordering according to timestamps is supposed,
> > + * neither for packet within the burst, nor for the whole bursts, it is
> > + * an entirely application responsibility to generate packets and its
> > + * timestamps in desired order. The timestamps might be put only in
> > + * the first packet in the burst providing the entire burst scheduling.
> > + */
> > +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
> > +
> > #endif
> > --
> > 1.8.3.1
> >
>
> It may be worth adding some documentation about how the clocks of the
> NICs are out of sync with the clock of the CPU, and are all drifting relatively.
>
> And those clocks are also out of sync with the actual time (NTP clock).
IMO, it is out of scope of this very generic patch. As for mlx NICs - the internal device
clock might be (or might be not) synchronized with PTP, it can provide timestamps
in real nanoseconds in various formats or just some free running counter.
On some systems the NIC and CPU might share the same clock source (for their PLL inputs
for example) and there will be no drift at all. As we can see, it is a wide and interesting
topic to discuss, but, IMO, the comment in a header file might not be the most relevant
place to do so. As for mlx5 device clock specifics - they will be documented in the PMD chapter.
OK, will add a few generic words - only a few, in order not to make the comment wordy -
just to point the direction for further thinking.
>
> Preferably, some sort of cookbook for handling this should be provided.
> PCAP could be used as an example.
>
testpmd example is included in series, mlx5 PMD patch is prepared and coming soon.
With best regards, Slava
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v7 1/3] lib/lpm: integrate RCU QSBR
@ 2020-07-08 14:30 2% ` David Marchand
2020-07-08 15:34 5% ` Ruifeng Wang
0 siblings, 1 reply; 200+ results
From: David Marchand @ 2020-07-08 14:30 UTC (permalink / raw)
To: Ruifeng Wang
Cc: Bruce Richardson, Vladimir Medvedkin, John McNamara,
Marko Kovacevic, Ray Kinsella, Neil Horman, dev, Ananyev,
Konstantin, Honnappa Nagarahalli, nd
On Tue, Jul 7, 2020 at 5:16 PM Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
> index b9d49ac87..7889f21b3 100644
> --- a/lib/librte_lpm/rte_lpm.h
> +++ b/lib/librte_lpm/rte_lpm.h
> @@ -1,5 +1,6 @@
> /* SPDX-License-Identifier: BSD-3-Clause
> * Copyright(c) 2010-2014 Intel Corporation
> + * Copyright(c) 2020 Arm Limited
> */
>
> #ifndef _RTE_LPM_H_
> @@ -20,6 +21,7 @@
> #include <rte_memory.h>
> #include <rte_common.h>
> #include <rte_vect.h>
> +#include <rte_rcu_qsbr.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -62,6 +64,17 @@ extern "C" {
> /** Bitmask used to indicate successful lookup */
> #define RTE_LPM_LOOKUP_SUCCESS 0x01000000
>
> +/** @internal Default RCU defer queue entries to reclaim in one go. */
> +#define RTE_LPM_RCU_DQ_RECLAIM_MAX 16
> +
> +/** RCU reclamation modes */
> +enum rte_lpm_qsbr_mode {
> + /** Create defer queue for reclaim. */
> + RTE_LPM_QSBR_MODE_DQ = 0,
> + /** Use blocking mode reclaim. No defer queue created. */
> + RTE_LPM_QSBR_MODE_SYNC
> +};
> +
> #if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> /** @internal Tbl24 entry structure. */
> __extension__
> @@ -130,6 +143,28 @@ struct rte_lpm {
> __rte_cache_aligned; /**< LPM tbl24 table. */
> struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
> struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
> +#ifdef ALLOW_EXPERIMENTAL_API
> + /* RCU config. */
> + struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
> + enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
> + struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
> +#endif
> +};
I can see failures in travis reports for v7 and v6.
I reproduced them in my env.
1 function with some indirect sub-type change:
[C]'function int rte_lpm_add(rte_lpm*, uint32_t, uint8_t, uint32_t)'
at rte_lpm.c:764:1 has some indirect sub-type changes:
parameter 1 of type 'rte_lpm*' has sub-type changes:
in pointed to type 'struct rte_lpm' at rte_lpm.h:134:1:
type size hasn't changed
3 data member insertions:
'rte_rcu_qsbr* rte_lpm::v', at offset 536873600 (in bits) at rte_lpm.h:148:1
'rte_lpm_qsbr_mode rte_lpm::rcu_mode', at offset 536873664 (in bits) at rte_lpm.h:149:1
'rte_rcu_qsbr_dq* rte_lpm::dq', at offset 536873728 (in bits) at rte_lpm.h:150:1
Going back to my proposal of hiding what does not need to be seen.
Disclaimer, *this is quick & dirty* but it builds and passes ABI check:
$ git diff
diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index d498ba761..7109aef6a 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -115,6 +115,15 @@ rte_lpm_find_existing(const char *name)
return l;
}
+struct internal_lpm {
+ /* Public object */
+ struct rte_lpm lpm;
+ /* RCU config. */
+ struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
+ enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
+ struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
+};
+
/*
* Allocates memory for LPM object
*/
@@ -123,6 +132,7 @@ rte_lpm_create(const char *name, int socket_id,
const struct rte_lpm_config *config)
{
char mem_name[RTE_LPM_NAMESIZE];
+ struct internal_lpm *internal = NULL;
struct rte_lpm *lpm = NULL;
struct rte_tailq_entry *te;
uint32_t mem_size, rules_size, tbl8s_size;
@@ -141,12 +151,6 @@ rte_lpm_create(const char *name, int socket_id,
snprintf(mem_name, sizeof(mem_name), "LPM_%s", name);
- /* Determine the amount of memory to allocate. */
- mem_size = sizeof(*lpm);
- rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
- tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
- RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
-
rte_mcfg_tailq_write_lock();
/* guarantee there's no existing */
@@ -170,16 +174,23 @@ rte_lpm_create(const char *name, int socket_id,
goto exit;
}
+ /* Determine the amount of memory to allocate. */
+ mem_size = sizeof(*internal);
+ rules_size = sizeof(struct rte_lpm_rule) * config->max_rules;
+ tbl8s_size = (sizeof(struct rte_lpm_tbl_entry) *
+ RTE_LPM_TBL8_GROUP_NUM_ENTRIES * config->number_tbl8s);
+
/* Allocate memory to store the LPM data structures. */
- lpm = rte_zmalloc_socket(mem_name, mem_size,
+ internal = rte_zmalloc_socket(mem_name, mem_size,
RTE_CACHE_LINE_SIZE, socket_id);
- if (lpm == NULL) {
+ if (internal == NULL) {
RTE_LOG(ERR, LPM, "LPM memory allocation failed\n");
rte_free(te);
rte_errno = ENOMEM;
goto exit;
}
+ lpm = &internal->lpm;
lpm->rules_tbl = rte_zmalloc_socket(NULL,
(size_t)rules_size, RTE_CACHE_LINE_SIZE, socket_id);
@@ -226,6 +237,7 @@ rte_lpm_create(const char *name, int socket_id,
void
rte_lpm_free(struct rte_lpm *lpm)
{
+ struct internal_lpm *internal;
struct rte_lpm_list *lpm_list;
struct rte_tailq_entry *te;
@@ -247,8 +259,9 @@ rte_lpm_free(struct rte_lpm *lpm)
rte_mcfg_tailq_write_unlock();
- if (lpm->dq)
- rte_rcu_qsbr_dq_delete(lpm->dq);
+ internal = container_of(lpm, struct internal_lpm, lpm);
+ if (internal->dq != NULL)
+ rte_rcu_qsbr_dq_delete(internal->dq);
rte_free(lpm->tbl8);
rte_free(lpm->rules_tbl);
rte_free(lpm);
@@ -276,13 +289,15 @@ rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
{
char rcu_dq_name[RTE_RCU_QSBR_DQ_NAMESIZE];
struct rte_rcu_qsbr_dq_parameters params = {0};
+ struct internal_lpm *internal;
- if ((lpm == NULL) || (cfg == NULL)) {
+ if (lpm == NULL || cfg == NULL) {
rte_errno = EINVAL;
return 1;
}
- if (lpm->v) {
+ internal = container_of(lpm, struct internal_lpm, lpm);
+ if (internal->v != NULL) {
rte_errno = EEXIST;
return 1;
}
@@ -305,20 +320,19 @@ rte_lpm_rcu_qsbr_add(struct rte_lpm *lpm, struct rte_lpm_rcu_config *cfg,
params.free_fn = __lpm_rcu_qsbr_free_resource;
params.p = lpm;
params.v = cfg->v;
- lpm->dq = rte_rcu_qsbr_dq_create(&params);
- if (lpm->dq == NULL) {
- RTE_LOG(ERR, LPM,
- "LPM QS defer queue creation failed\n");
+ internal->dq = rte_rcu_qsbr_dq_create(&params);
+ if (internal->dq == NULL) {
+ RTE_LOG(ERR, LPM, "LPM QS defer queue creation failed\n");
return 1;
}
if (dq)
- *dq = lpm->dq;
+ *dq = internal->dq;
} else {
rte_errno = EINVAL;
return 1;
}
- lpm->rcu_mode = cfg->mode;
- lpm->v = cfg->v;
+ internal->rcu_mode = cfg->mode;
+ internal->v = cfg->v;
return 0;
}
@@ -502,12 +516,13 @@ _tbl8_alloc(struct rte_lpm *lpm)
static int32_t
tbl8_alloc(struct rte_lpm *lpm)
{
+ struct internal_lpm *internal = container_of(lpm, struct internal_lpm, lpm);
int32_t group_idx; /* tbl8 group index. */
group_idx = _tbl8_alloc(lpm);
- if ((group_idx == -ENOSPC) && (lpm->dq != NULL)) {
+ if (group_idx == -ENOSPC && internal->dq != NULL) {
/* If there are no tbl8 groups try to reclaim one. */
- if (rte_rcu_qsbr_dq_reclaim(lpm->dq, 1, NULL, NULL, NULL) == 0)
+ if (rte_rcu_qsbr_dq_reclaim(internal->dq, 1, NULL, NULL, NULL) == 0)
group_idx = _tbl8_alloc(lpm);
}
@@ -518,20 +533,21 @@ static void
tbl8_free(struct rte_lpm *lpm, uint32_t tbl8_group_start)
{
struct rte_lpm_tbl_entry zero_tbl8_entry = {0};
+ struct internal_lpm *internal = container_of(lpm, struct internal_lpm, lpm);
- if (!lpm->v) {
+ if (internal->v == NULL) {
/* Set tbl8 group invalid*/
__atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
__ATOMIC_RELAXED);
- } else if (lpm->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
+ } else if (internal->rcu_mode == RTE_LPM_QSBR_MODE_SYNC) {
/* Wait for quiescent state change. */
- rte_rcu_qsbr_synchronize(lpm->v, RTE_QSBR_THRID_INVALID);
+ rte_rcu_qsbr_synchronize(internal->v, RTE_QSBR_THRID_INVALID);
/* Set tbl8 group invalid*/
__atomic_store(&lpm->tbl8[tbl8_group_start], &zero_tbl8_entry,
__ATOMIC_RELAXED);
- } else if (lpm->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
+ } else if (internal->rcu_mode == RTE_LPM_QSBR_MODE_DQ) {
/* Push into QSBR defer queue. */
- rte_rcu_qsbr_dq_enqueue(lpm->dq, (void *)&tbl8_group_start);
+ rte_rcu_qsbr_dq_enqueue(internal->dq, (void *)&tbl8_group_start);
}
}
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index 7889f21b3..a9568fcdd 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -143,12 +143,6 @@ struct rte_lpm {
__rte_cache_aligned; /**< LPM tbl24 table. */
struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
-#ifdef ALLOW_EXPERIMENTAL_API
- /* RCU config. */
- struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
- enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
- struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
-#endif
};
/** LPM RCU QSBR configuration structure. */
--
David Marchand
^ permalink raw reply [relevance 2%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling
2020-07-07 14:57 2% ` [dpdk-dev] [PATCH v4 " Viacheslav Ovsiienko
2020-07-07 15:23 0% ` Olivier Matz
@ 2020-07-08 14:16 0% ` Morten Brørup
2020-07-08 14:54 0% ` Slava Ovsiienko
1 sibling, 1 reply; 200+ results
From: Morten Brørup @ 2020-07-08 14:16 UTC (permalink / raw)
To: Viacheslav Ovsiienko, dev
Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> Ovsiienko
> Sent: Tuesday, July 7, 2020 4:57 PM
>
> There is a requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> packets at a precisely specified moment in time makes it possible to
> support Time Division Multiplexing connections on a contemporary
> general-purpose NIC without auxiliary hardware. For example, support
> for the O-RAN Fronthaul interface is one promising use of precise
> time management for egress packets.
>
> The main objective of this RFC is to specify how applications can
> provide the moment in time at which packet transmission must start,
> and to give a preliminary description of how the mlx5 PMD supports
> this feature.
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well.
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. If the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.
>
> No packet reordering according to timestamps is supposed, neither
> within a packet burst, nor between packets; it is entirely the
> application's responsibility to generate packets and their timestamps
> in the desired order. The timestamps can be put only in the first
> packet in the burst, providing the entire burst scheduling.
>
> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be
> introduced
> at the moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> current device clock value and provide the reference for the
> timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> v1->v4:
> - dedicated dynamic Tx timestamp flag instead of shared with Rx
The detailed description above should be updated to reflect that it is now two flags.
> - Doxygen-style comment
> - comments update
>
> ---
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
> 3 files changed, 36 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */
> #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
> #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**< Device supports Rx queue setup after device started*/
> #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 96c3631..7e9f7d2 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>
> +/**
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic Tx timestamp flag tells whether the field contains
> + * actual timestamp value. For the packets being sent this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> + * introduced and the shared dynamic timestamp field will be used
> + * to handle the timestamps on receiving datapath as well.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
The description above should not say anything about the dynamic TX timestamp flag.
Please elaborate "some timing information", e.g. add "... about when the packet was received".
> +
> +/**
> + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
> + * packet being sent it tries to synchronize the time of packet appearing
> + * on the wire with the specified packet timestamp. If the specified one
> + * is in the past it should be ignored, if one is in the distant future
> + * it should be capped with some reasonable value (in range of seconds).
> + *
> + * There is no any packet reordering according to timestamps is supposed,
> + * neither for packet within the burst, nor for the whole bursts, it is
> + * an entirely application responsibility to generate packets and its
> + * timestamps in desired order. The timestamps might be put only in
> + * the first packet in the burst providing the entire burst scheduling.
> + */
> +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
> +
> #endif
> --
> 1.8.3.1
>
It may be worth adding some documentation about how the clocks of the NICs are out of sync with the clock of the CPU, and are all drifting relatively.
And those clocks are also out of sync with the actual time (NTP clock).
Preferably, some sort of cookbook for handling this should be provided. PCAP could be used as an example.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 13:45 7% ` Aaron Conole
@ 2020-07-08 14:01 4% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-08 14:01 UTC (permalink / raw)
To: Aaron Conole; +Cc: David Marchand, dev, thomas, dodji, Neil Horman
On 08/07/2020 14:45, Aaron Conole wrote:
> "Kinsella, Ray" <mdr@ashroe.eu> writes:
>
>> + Aaron
>>
>> On 08/07/2020 11:22, David Marchand wrote:
>>> abidiff can provide some more information about the ABI difference it
>>> detected.
>>> In all cases, a discussion on the mailing must happen but we can give
>>> some hints to know if this is a problem with the script calling abidiff,
>>> a potential ABI breakage or an unambiguous ABI breakage.
>>>
>>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>>> ---
>>> devtools/check-abi.sh | 16 ++++++++++++++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
>>> index e17fedbd9f..521e2cce7c 100755
>>> --- a/devtools/check-abi.sh
>>> +++ b/devtools/check-abi.sh
>>> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
>>> error=1
>>> continue
>>> fi
>>> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
>>> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
>>> + abiret=$?
>>> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
>>> error=1
>>> - fi
>>> + echo
>>> + if [ $(($abiret & 3)) != 0 ]; then
>>> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
>>> + fi
>>> + if [ $(($abiret & 4)) != 0 ]; then
>>> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
>>> + fi
>>> + if [ $(($abiret & 8)) != 0 ]; then
>>> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
>>> + fi
>>> + echo
>>> + }
>>> done
>>>
>>> [ -z "$error" ] || [ -n "$warnonly" ]
>>>
>>
>> This look good to me, my only thought was can we do anything to help the ABI checks play nice with Travis.
>> At the moment it takes time to find the failure reason in the Travis log.
>
> That's a problem even for non-ABI failures. I was considering pulling
> the travis log for each failed build and attaching it, but even that
> isn't a great solution (very large emails aren't much easier to search).
>
> I'm open to suggestions.
For me the problem arises when you log on to the Travis interface:
you need to search for ERROR etc. ... there must be a better way.
>
>> Ray K
>
^ permalink raw reply [relevance 4%]
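The exit status decoded by the patch above is a bitfield defined by libabigail (ABIDIFF_ERROR=1, ABIDIFF_USAGE_ERROR=2, ABIDIFF_ABI_CHANGE=4, ABIDIFF_ABI_INCOMPATIBLE_CHANGE=8 per the abidiff manual); the decoding logic can be exercised on its own with a sample value:

```shell
#!/bin/sh
# Decode a sample abidiff exit status.
# 12 = ABIDIFF_ABI_CHANGE | ABIDIFF_ABI_INCOMPATIBLE_CHANGE.
abiret=12
out=""
if [ $((abiret & 3)) -ne 0 ]; then out="$out tool-error"; fi
if [ $((abiret & 4)) -ne 0 ]; then out="$out abi-change"; fi
if [ $((abiret & 8)) -ne 0 ]; then out="$out incompatible"; fi
echo "$out"
```

Note that an incompatible change (bit 8) is always reported together with bit 4 by abidiff itself, which is why the script prints both hints in that case.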
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 13:09 7% ` Kinsella, Ray
2020-07-08 13:15 4% ` David Marchand
@ 2020-07-08 13:45 7% ` Aaron Conole
2020-07-08 14:01 4% ` Kinsella, Ray
1 sibling, 1 reply; 200+ results
From: Aaron Conole @ 2020-07-08 13:45 UTC (permalink / raw)
To: Kinsella, Ray; +Cc: David Marchand, dev, thomas, dodji, Neil Horman
"Kinsella, Ray" <mdr@ashroe.eu> writes:
> + Aaron
>
> On 08/07/2020 11:22, David Marchand wrote:
>> abidiff can provide some more information about the ABI difference it
>> detected.
>> In all cases, a discussion on the mailing must happen but we can give
>> some hints to know if this is a problem with the script calling abidiff,
>> a potential ABI breakage or an unambiguous ABI breakage.
>>
>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>> ---
>> devtools/check-abi.sh | 16 ++++++++++++++--
>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
>> index e17fedbd9f..521e2cce7c 100755
>> --- a/devtools/check-abi.sh
>> +++ b/devtools/check-abi.sh
>> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
>> error=1
>> continue
>> fi
>> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
>> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
>> + abiret=$?
>> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
>> error=1
>> - fi
>> + echo
>> + if [ $(($abiret & 3)) != 0 ]; then
>> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
>> + fi
>> + if [ $(($abiret & 4)) != 0 ]; then
>> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
>> + fi
>> + if [ $(($abiret & 8)) != 0 ]; then
>> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
>> + fi
>> + echo
>> + }
>> done
>>
>> [ -z "$error" ] || [ -n "$warnonly" ]
>>
>
> This look good to me, my only thought was can we do anything to help the ABI checks play nice with Travis.
> At the moment it takes time to find the failure reason in the Travis log.
That's a problem even for non-ABI failures. I was considering pulling
the travis log for each failed build and attaching it, but even that
isn't a great solution (very large emails aren't much easier to search).
I'm open to suggestions.
> Ray K
^ permalink raw reply [relevance 7%]
* Re: [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status
2020-07-08 12:29 3% ` David Marchand
@ 2020-07-08 13:43 0% ` Aaron Conole
2020-07-08 15:04 0% ` Kinsella, Ray
1 sibling, 0 replies; 200+ results
From: Aaron Conole @ 2020-07-08 13:43 UTC (permalink / raw)
To: David Marchand
Cc: Phil Yang, dev, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli, Neil Horman, Ray Kinsella, Harman Kalra
David Marchand <david.marchand@redhat.com> writes:
> On Thu, Jun 11, 2020 at 12:25 PM Phil Yang <phil.yang@arm.com> wrote:
>>
>> The event status is defined as a volatile variable and shared
>> between threads. Use c11 atomics with explicit ordering instead
>> of rte_atomic ops which enforce unnecessary barriers on aarch64.
>>
>> Signed-off-by: Phil Yang <phil.yang@arm.com>
>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>> ---
>> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
>> lib/librte_eal/linux/eal_interrupts.c | 47 ++++++++++++++++++++---------
>> 2 files changed, 34 insertions(+), 15 deletions(-)
>>
>> diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
>> index 773a34a..b1e8a29 100644
>> --- a/lib/librte_eal/include/rte_eal_interrupts.h
>> +++ b/lib/librte_eal/include/rte_eal_interrupts.h
>> @@ -59,7 +59,7 @@ enum {
>>
>> /** interrupt epoll event obj, taken by epoll_event.ptr */
>> struct rte_epoll_event {
>> - volatile uint32_t status; /**< OUT: event status */
>> + uint32_t status; /**< OUT: event status */
>> int fd; /**< OUT: event fd */
>> int epfd; /**< OUT: epoll instance the ev associated with */
>> struct rte_epoll_data epdata;
>
> I got a reject from the ABI check in my env.
>
> 1 function with some indirect sub-type change:
>
> [C]'function int rte_pci_ioport_map(rte_pci_device*, int,
> rte_pci_ioport*)' at pci.c:756:1 has some indirect sub-type changes:
> parameter 1 of type 'rte_pci_device*' has sub-type changes:
> in pointed to type 'struct rte_pci_device' at rte_bus_pci.h:57:1:
> type size hasn't changed
> 1 data member changes (2 filtered):
> type of 'rte_intr_handle rte_pci_device::intr_handle' changed:
> type size hasn't changed
> 1 data member change:
> type of 'rte_epoll_event rte_intr_handle::elist[512]' changed:
> array element type 'struct rte_epoll_event' changed:
> type size hasn't changed
> 1 data member change:
> type of 'volatile uint32_t rte_epoll_event::status' changed:
> entity changed from 'volatile uint32_t' to 'typedef
> uint32_t' at stdint-uintn.h:26:1
> type size hasn't changed
>
> type size hasn't changed
>
>
> This is probably harmless in our case (going from volatile to
> non-volatile), but it won't pass the check in the CI without an exception
> rule.
>
> Note: checking on the test-report ml, I saw nothing, but ovsrobot did
> catch the issue with this change too, Aaron?
I don't have archives back to Jun 11 on the robot server. I think it
doesn't preserve forever (and the archives seem to go back only until
Jul 03). I will update it.
I do see that we have a failed travis job:
https://travis-ci.org/github/ovsrobot/dpdk/builds/697180855
I'm surprised this didn't go out. Have we seen other failures go unreported
by the ovs robot recently? I can double-check the job config.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter
2020-07-07 15:54 4% ` [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
@ 2020-07-08 13:30 4% ` Jerin Jacob
2020-07-08 15:01 0% ` Thomas Monjalon
1 sibling, 1 reply; 200+ results
From: Jerin Jacob @ 2020-07-08 13:30 UTC (permalink / raw)
To: Phil Yang
Cc: Jerin Jacob, dpdk-dev, Thomas Monjalon, Erik Gabriel Carrillo,
Honnappa Nagarahalli, David Christensen,
Ruifeng Wang (Arm Technology China),
Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
dodji, dpdk stable
On Tue, Jul 7, 2020 at 9:25 PM Phil Yang <phil.yang@arm.com> wrote:
>
> The n_poll_lcores counter and poll_lcore array are shared between lcores
> and the update of these variables are out of the protection of spinlock
> on each lcore timer list. The read-modify-write operations of the counter
> are not atomic, so it has the potential of race condition between lcores.
>
> Use C11 atomics with RELAXED ordering to prevent conflicts.
>
> Fixes: cc7b73ea9e3b ("eventdev: add new software timer adapter")
> Cc: erik.g.carrillo@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Hi Thomas,
The latest version does not have the ABI breakage issue.
I have added the ABI verifier in my local patch verification setup.
Series applied to dpdk-next-eventdev/master.
Please pull this series from dpdk-next-eventdev/master. Thanks.
I am marking this patch series as "Awaiting Upstream" in patchwork
status to reflect the actual status.
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 13:15 4% ` David Marchand
@ 2020-07-08 13:22 4% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-08 13:22 UTC (permalink / raw)
To: David Marchand
Cc: dev, Thomas Monjalon, Dodji Seketeli, Neil Horman, Aaron Conole
On 08/07/2020 14:15, David Marchand wrote:
> On Wed, Jul 8, 2020 at 3:09 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>
>> + Aaron
>>
>> On 08/07/2020 11:22, David Marchand wrote:
>>> abidiff can provide some more information about the ABI difference it
>>> detected.
>>> In all cases, a discussion on the mailing list must happen but we can give
>>> some hints to know if this is a problem with the script calling abidiff,
>>> a potential ABI breakage or an unambiguous ABI breakage.
>>>
>>> Signed-off-by: David Marchand <david.marchand@redhat.com>
>>> ---
>>> devtools/check-abi.sh | 16 ++++++++++++++--
>>> 1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
>>> index e17fedbd9f..521e2cce7c 100755
>>> --- a/devtools/check-abi.sh
>>> +++ b/devtools/check-abi.sh
>>> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
>>> error=1
>>> continue
>>> fi
>>> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
>>> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
>>> + abiret=$?
>>> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
>>> error=1
>>> - fi
>>> + echo
>>> + if [ $(($abiret & 3)) != 0 ]; then
>>> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
>
> Forgot to --amend.
> Hopefully yes, this will be reported to dev@dpdk.org... I wanted to
> highlight this could be a script or env issue.
>
>
>>> + fi
>>> + if [ $(($abiret & 4)) != 0 ]; then
>>> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
>>> + fi
>>> + if [ $(($abiret & 8)) != 0 ]; then
>>> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
>>> + fi
>>> + echo
>>> + }
>>> done
>>>
>>> [ -z "$error" ] || [ -n "$warnonly" ]
>>>
>>
>> This looks good to me; my only thought was whether we can do anything to help the ABI checks play nice with Travis.
>> At the moment it takes time to find the failure reason in the Travis log.
>
> I usually look for "FILES_TO" to get to the last error.
>
Right, but there is hopefully a better way to give Travis some clues ...
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 13:09 7% ` Kinsella, Ray
@ 2020-07-08 13:15 4% ` David Marchand
2020-07-08 13:22 4% ` Kinsella, Ray
2020-07-08 13:45 7% ` Aaron Conole
1 sibling, 1 reply; 200+ results
From: David Marchand @ 2020-07-08 13:15 UTC (permalink / raw)
To: Kinsella, Ray
Cc: dev, Thomas Monjalon, Dodji Seketeli, Neil Horman, Aaron Conole
On Wed, Jul 8, 2020 at 3:09 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>
> + Aaron
>
> On 08/07/2020 11:22, David Marchand wrote:
> > abidiff can provide some more information about the ABI difference it
> > detected.
> > In all cases, a discussion on the mailing list must happen but we can give
> > some hints to know if this is a problem with the script calling abidiff,
> > a potential ABI breakage or an unambiguous ABI breakage.
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > ---
> > devtools/check-abi.sh | 16 ++++++++++++++--
> > 1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> > index e17fedbd9f..521e2cce7c 100755
> > --- a/devtools/check-abi.sh
> > +++ b/devtools/check-abi.sh
> > @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
> > error=1
> > continue
> > fi
> > - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> > + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> > + abiret=$?
> > echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
> > error=1
> > - fi
> > + echo
> > + if [ $(($abiret & 3)) != 0 ]; then
> > + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
Forgot to --amend.
Hopefully yes, this will be reported to dev@dpdk.org... I wanted to
highlight this could be a script or env issue.
> > + fi
> > + if [ $(($abiret & 4)) != 0 ]; then
> > + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
> > + fi
> > + if [ $(($abiret & 8)) != 0 ]; then
> > + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
> > + fi
> > + echo
> > + }
> > done
> >
> > [ -z "$error" ] || [ -n "$warnonly" ]
> >
>
> This looks good to me; my only thought was whether we can do anything to help the ABI checks play nice with Travis.
> At the moment it takes time to find the failure reason in the Travis log.
I usually look for "FILES_TO" to get to the last error.
--
David Marchand
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
@ 2020-07-08 13:09 7% ` Kinsella, Ray
2020-07-08 13:15 4% ` David Marchand
2020-07-08 13:45 7% ` Aaron Conole
2020-07-09 15:52 4% ` Dodji Seketeli
` (3 subsequent siblings)
4 siblings, 2 replies; 200+ results
From: Kinsella, Ray @ 2020-07-08 13:09 UTC (permalink / raw)
To: David Marchand, dev; +Cc: thomas, dodji, Neil Horman, Aaron Conole
+ Aaron
On 08/07/2020 11:22, David Marchand wrote:
> abidiff can provide some more information about the ABI difference it
> detected.
> In all cases, a discussion on the mailing list must happen but we can give
> some hints to know if this is a problem with the script calling abidiff,
> a potential ABI breakage or an unambiguous ABI breakage.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> devtools/check-abi.sh | 16 ++++++++++++++--
> 1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
> index e17fedbd9f..521e2cce7c 100755
> --- a/devtools/check-abi.sh
> +++ b/devtools/check-abi.sh
> @@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
> error=1
> continue
> fi
> - if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
> + abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
> + abiret=$?
> echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
> error=1
> - fi
> + echo
> + if [ $(($abiret & 3)) != 0 ]; then
> + echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
> + fi
> + if [ $(($abiret & 4)) != 0 ]; then
> + echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
> + fi
> + if [ $(($abiret & 8)) != 0 ]; then
> + echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
> + fi
> + echo
> + }
> done
>
> [ -z "$error" ] || [ -n "$warnonly" ]
>
This looks good to me; my only thought was whether we can do anything to help the ABI checks play nice with Travis.
At the moment it takes time to find the failure reason in the Travis log.
Ray K
^ permalink raw reply [relevance 7%]
* Re: [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status
@ 2020-07-08 12:29 3% ` David Marchand
2020-07-08 13:43 0% ` Aaron Conole
2020-07-08 15:04 0% ` Kinsella, Ray
0 siblings, 2 replies; 200+ results
From: David Marchand @ 2020-07-08 12:29 UTC (permalink / raw)
To: Phil Yang, Aaron Conole
Cc: dev, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd, Dodji Seketeli, Neil Horman, Ray Kinsella, Harman Kalra
On Thu, Jun 11, 2020 at 12:25 PM Phil Yang <phil.yang@arm.com> wrote:
>
> The event status is defined as a volatile variable and shared
> between threads. Use c11 atomics with explicit ordering instead
> of rte_atomic ops which enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> lib/librte_eal/include/rte_eal_interrupts.h | 2 +-
> lib/librte_eal/linux/eal_interrupts.c | 47 ++++++++++++++++++++---------
> 2 files changed, 34 insertions(+), 15 deletions(-)
>
> diff --git a/lib/librte_eal/include/rte_eal_interrupts.h b/lib/librte_eal/include/rte_eal_interrupts.h
> index 773a34a..b1e8a29 100644
> --- a/lib/librte_eal/include/rte_eal_interrupts.h
> +++ b/lib/librte_eal/include/rte_eal_interrupts.h
> @@ -59,7 +59,7 @@ enum {
>
> /** interrupt epoll event obj, taken by epoll_event.ptr */
> struct rte_epoll_event {
> - volatile uint32_t status; /**< OUT: event status */
> + uint32_t status; /**< OUT: event status */
> int fd; /**< OUT: event fd */
> int epfd; /**< OUT: epoll instance the ev associated with */
> struct rte_epoll_data epdata;
I got a reject from the ABI check in my env.
1 function with some indirect sub-type change:
[C]'function int rte_pci_ioport_map(rte_pci_device*, int,
rte_pci_ioport*)' at pci.c:756:1 has some indirect sub-type changes:
parameter 1 of type 'rte_pci_device*' has sub-type changes:
in pointed to type 'struct rte_pci_device' at rte_bus_pci.h:57:1:
type size hasn't changed
1 data member changes (2 filtered):
type of 'rte_intr_handle rte_pci_device::intr_handle' changed:
type size hasn't changed
1 data member change:
type of 'rte_epoll_event rte_intr_handle::elist[512]' changed:
array element type 'struct rte_epoll_event' changed:
type size hasn't changed
1 data member change:
type of 'volatile uint32_t rte_epoll_event::status' changed:
entity changed from 'volatile uint32_t' to 'typedef
uint32_t' at stdint-uintn.h:26:1
type size hasn't changed
type size hasn't changed
This is probably harmless in our case (going from volatile to
non-volatile), but it won't pass the check in the CI without an exception
rule.
Note: checking on the test-report ml, I saw nothing, but ovsrobot did
catch the issue with this change too, Aaron?
--
David Marchand
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes
2020-07-08 10:32 7% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Thomas Monjalon
@ 2020-07-08 12:02 4% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-08 12:02 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon
On 08/07/2020 11:32, Thomas Monjalon wrote:
> 07/07/2020 19:50, Ray Kinsella:
>> A few documentation fixes, clarifying the Windows ABI policy and aliases to
>> experimental mode.
>>
>> Ray Kinsella (2):
>> doc: reword abi policy for windows
>> doc: clarify alias to experimental period
>>
>> v2:
>> Addressed feedback from Thomas Monjalon.
>
> One more sentence needs to start on its own line,
> to avoid splitting a link across two lines.
Ah yes, missed that one, sorry.
>
> Reworded titles with uppercases as well:
> doc: reword ABI policy for Windows
> doc: clarify period of alias to experimental symbol
>
> Applied with above changes, thanks
>
>
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v2] mbuf: use C11 atomics for refcnt operations
2020-07-07 10:10 3% ` [dpdk-dev] [PATCH v2] mbuf: use C11 " Phil Yang
2020-07-08 5:11 3% ` Phil Yang
@ 2020-07-08 11:44 0% ` Olivier Matz
2020-07-09 10:00 3% ` Phil Yang
2020-07-09 10:10 4% ` [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins " Phil Yang
2 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-08 11:44 UTC (permalink / raw)
To: Phil Yang
Cc: david.marchand, dev, drc, Honnappa.Nagarahalli, ruifeng.wang, nd
Hi,
On Tue, Jul 07, 2020 at 06:10:33PM +0800, Phil Yang wrote:
> Use C11 atomics with explicit ordering instead of rte_atomic ops which
> enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> v2:
> Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
> to refcnt_atomic.
>
> lib/librte_mbuf/rte_mbuf.c | 1 -
> lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
> lib/librte_mbuf/rte_mbuf_core.h | 11 +++--------
> 3 files changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> index ae91ae2..8a456e5 100644
> --- a/lib/librte_mbuf/rte_mbuf.c
> +++ b/lib/librte_mbuf/rte_mbuf.c
> @@ -22,7 +22,6 @@
> #include <rte_eal.h>
> #include <rte_per_lcore.h>
> #include <rte_lcore.h>
> -#include <rte_atomic.h>
> #include <rte_branch_prediction.h>
> #include <rte_mempool.h>
> #include <rte_mbuf.h>
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index f8e492e..4a7a98c 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -37,7 +37,6 @@
> #include <rte_config.h>
> #include <rte_mempool.h>
> #include <rte_memory.h>
> -#include <rte_atomic.h>
> #include <rte_prefetch.h>
> #include <rte_branch_prediction.h>
> #include <rte_byteorder.h>
> @@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
> static inline uint16_t
> rte_mbuf_refcnt_read(const struct rte_mbuf *m)
> {
> - return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
> + return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
> static inline void
> rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
> {
> - rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
> + __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
> }
>
> /* internal */
> static inline uint16_t
> __rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
> {
> - return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
> + return (uint16_t)(__atomic_add_fetch((int16_t *)&m->refcnt, value,
> + __ATOMIC_ACQ_REL));
> }
>
> /**
> @@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
> static inline uint16_t
> rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
> {
> - return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
> + return __atomic_load_n(&shinfo->refcnt_atomic, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -481,7 +481,7 @@ static inline void
> rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
> uint16_t new_value)
> {
> - rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
> + __atomic_store_n(&shinfo->refcnt_atomic, new_value, __ATOMIC_RELAXED);
> }
>
> /**
> @@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
> return (uint16_t)value;
> }
>
> - return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
> + return (uint16_t)(__atomic_add_fetch((int16_t *)&shinfo->refcnt_atomic,
> + value, __ATOMIC_ACQ_REL));
> }
>
> /** Mbuf prefetch */
> @@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
> * Direct usage of add primitive to avoid
> * duplication of comparing with one.
> */
> - if (likely(rte_atomic16_add_return
> - (&shinfo->refcnt_atomic, -1)))
> + if (likely(__atomic_add_fetch((int *)&shinfo->refcnt_atomic, -1,
> + __ATOMIC_ACQ_REL)))
> return 1;
>
> /* Reinitialize counter before mbuf freeing. */
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> index 16600f1..806313a 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -18,7 +18,6 @@
>
> #include <stdint.h>
> #include <rte_compat.h>
> -#include <generic/rte_atomic.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -495,12 +494,8 @@ struct rte_mbuf {
> * or non-atomic) is controlled by the CONFIG_RTE_MBUF_REFCNT_ATOMIC
> * config option.
> */
> - RTE_STD_C11
> - union {
> - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> - /** Non-atomically accessed refcnt */
> - uint16_t refcnt;
> - };
> + uint16_t refcnt;
> +
It seems this patch does 2 things:
- remove refcnt_atomic
- use C11 atomics
The first change is an API break. I think it should be announced in a deprecation
notice; the existing notice about atomics does not mention it.
So I suggest keeping refcnt_atomic until the next version.
> uint16_t nb_segs; /**< Number of segments. */
>
> /** Input port (16 bits to support more than 256 virtual ports).
> @@ -679,7 +674,7 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> struct rte_mbuf_ext_shared_info {
> rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
> void *fcb_opaque; /**< Free callback argument */
> - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> + uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
> };
>
> /**< Maximum number of nb_segs allowed. */
> --
> 2.7.4
>
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 17:51 24% ` [dpdk-dev] [PATCH v2 1/2] doc: reword abi policy for windows Ray Kinsella
2020-07-07 17:51 12% ` [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period Ray Kinsella
@ 2020-07-08 10:32 7% ` Thomas Monjalon
2020-07-08 12:02 4% ` Kinsella, Ray
2 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-08 10:32 UTC (permalink / raw)
To: Ray Kinsella
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon
07/07/2020 19:50, Ray Kinsella:
> A few documentation fixes, clarifying the Windows ABI policy and aliases to
> experimental mode.
>
> Ray Kinsella (2):
> doc: reword abi policy for windows
> doc: clarify alias to experimental period
>
> v2:
> Addressed feedback from Thomas Monjalon.
One more sentence needs to start on its own line,
to avoid splitting a link across two lines.
Reworded titles with uppercases as well:
doc: reword ABI policy for Windows
doc: clarify period of alias to experimental symbol
Applied with above changes, thanks
^ permalink raw reply [relevance 7%]
* [dpdk-dev] [PATCH] devtools: give some hints for ABI errors
@ 2020-07-08 10:22 25% David Marchand
2020-07-08 13:09 7% ` Kinsella, Ray
` (4 more replies)
0 siblings, 5 replies; 200+ results
From: David Marchand @ 2020-07-08 10:22 UTC (permalink / raw)
To: dev; +Cc: thomas, dodji, Ray Kinsella, Neil Horman
abidiff can provide some more information about the ABI difference it
detected.
In all cases, a discussion on the mailing list must happen but we can give
some hints to know if this is a problem with the script calling abidiff,
a potential ABI breakage or an unambiguous ABI breakage.
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
devtools/check-abi.sh | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/devtools/check-abi.sh b/devtools/check-abi.sh
index e17fedbd9f..521e2cce7c 100755
--- a/devtools/check-abi.sh
+++ b/devtools/check-abi.sh
@@ -50,10 +50,22 @@ for dump in $(find $refdir -name "*.dump"); do
error=1
continue
fi
- if ! abidiff $ABIDIFF_OPTIONS $dump $dump2; then
+ abidiff $ABIDIFF_OPTIONS $dump $dump2 || {
+ abiret=$?
echo "Error: ABI issue reported for 'abidiff $ABIDIFF_OPTIONS $dump $dump2'"
error=1
- fi
+ echo
+ if [ $(($abiret & 3)) != 0 ]; then
+ echo "ABIDIFF_ERROR|ABIDIFF_USAGE_ERROR, please report this to dev@dpdk.org."
+ fi
+ if [ $(($abiret & 4)) != 0 ]; then
+ echo "ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue)."
+ fi
+ if [ $(($abiret & 8)) != 0 ]; then
+ echo "ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI."
+ fi
+ echo
+ }
done
[ -z "$error" ] || [ -n "$warnonly" ]
--
2.23.0
^ permalink raw reply [relevance 25%]
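The bit values decoded by the patch above come from abidiff's exit status: bits 1 and 2 flag tool or usage errors, bit 4 an ABI change needing review, bit 8 an incompatible change. A standalone sketch of the same decoding (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch of the exit-status decoding added to check-abi.sh above.
# Bit values are abidiff's: 1 ABIDIFF_ERROR, 2 ABIDIFF_USAGE_ERROR,
# 4 ABIDIFF_ABI_CHANGE, 8 ABIDIFF_ABI_INCOMPATIBLE_CHANGE.
decode_abidiff_ret() {
	abiret=$1
	msg=''
	if [ $(($abiret & 3)) != 0 ]; then
		msg="$msg tool-error"
	fi
	if [ $(($abiret & 4)) != 0 ]; then
		msg="$msg abi-change"
	fi
	if [ $(($abiret & 8)) != 0 ]; then
		msg="$msg abi-incompatible"
	fi
	# trim the leading space before printing
	printf '%s\n' "${msg# }"
}

decode_abidiff_ret 12   # -> abi-change abi-incompatible
```

Because the flags are independent bits, a single run can report several conditions at once (e.g. 12 = change + incompatible change), which is why the script tests each bit separately rather than comparing against a single value.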
* Re: [dpdk-dev] [PATCH v2] mbuf: use C11 atomics for refcnt operations
2020-07-07 10:10 3% ` [dpdk-dev] [PATCH v2] mbuf: use C11 " Phil Yang
@ 2020-07-08 5:11 3% ` Phil Yang
2020-07-08 11:44 0% ` Olivier Matz
2020-07-09 10:10 4% ` [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins " Phil Yang
2 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-08 5:11 UTC (permalink / raw)
To: Phil Yang, david.marchand, dev
Cc: drc, Honnappa Nagarahalli, olivier.matz, Ruifeng Wang, nd
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Phil Yang
> Sent: Tuesday, July 7, 2020 6:11 PM
> To: david.marchand@redhat.com; dev@dpdk.org
> Cc: drc@linux.vnet.ibm.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; olivier.matz@6wind.com; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: [dpdk-dev] [PATCH v2] mbuf: use C11 atomics for refcnt operations
>
> Use C11 atomics with explicit ordering instead of rte_atomic ops which
> enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> v2:
> Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
> to refcnt_atomic.
>
<snip>
> diff --git a/lib/librte_mbuf/rte_mbuf_core.h
> b/lib/librte_mbuf/rte_mbuf_core.h
> index 16600f1..806313a 100644
> --- a/lib/librte_mbuf/rte_mbuf_core.h
> +++ b/lib/librte_mbuf/rte_mbuf_core.h
> @@ -18,7 +18,6 @@
>
> #include <stdint.h>
> #include <rte_compat.h>
> -#include <generic/rte_atomic.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -495,12 +494,8 @@ struct rte_mbuf {
> * or non-atomic) is controlled by the
> CONFIG_RTE_MBUF_REFCNT_ATOMIC
> * config option.
> */
> - RTE_STD_C11
> - union {
> - rte_atomic16_t refcnt_atomic; /**< Atomically accessed
> refcnt */
> - /** Non-atomically accessed refcnt */
> - uint16_t refcnt;
> - };
> + uint16_t refcnt;
> +
> uint16_t nb_segs; /**< Number of segments. */
>
> /** Input port (16 bits to support more than 256 virtual ports).
> @@ -679,7 +674,7 @@ typedef void
> (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
> struct rte_mbuf_ext_shared_info {
> rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback
> function */
> void *fcb_opaque; /**< Free callback argument */
> - rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
> + uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
It still causes an ABI check failure in Travis CI on this type change.
I think we need an exception in libabigail.abignore for this.
Thanks,
Phil
> };
>
> /**< Maximum number of nb_segs allowed. */
> --
> 2.7.4
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 6/7] cmdline: support Windows
@ 2020-07-08 1:09 0% ` Dmitry Kozlyuk
0 siblings, 0 replies; 200+ results
From: Dmitry Kozlyuk @ 2020-07-08 1:09 UTC (permalink / raw)
To: Tal Shnaiderman
Cc: Ranjit Menon, Fady Bader, dev, Dmitry Malloy,
Narcisa Ana Maria Vasile, Thomas Monjalon, Olivier Matz
On Tue, 30 Jun 2020 02:56:20 +0300, Dmitry Kozlyuk wrote:
> On Mon, 29 Jun 2020 08:12:51 +0000, Tal Shnaiderman wrote:
> > > From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> > > Subject: Re: [dpdk-dev] [PATCH 6/7] cmdline: support Windows
> > >
> > > On Sun, 28 Jun 2020 23:23:11 -0700, Ranjit Menon wrote:
[snip]
> > > > The issue is that UINT8, UINT16, INT32, INT64 etc. are reserved types
> > > > in Windows headers for integer types. We found that it is easier to
> > > > change the enum in cmdline_parse_num.h than try to play with the
> > > > include order of headers. AFAIK, the enums were only used to determine
> > > > the type in a series of switch() statements in librte_cmdline, so we
> > > > simply renamed the enums. Not sure, if that will be acceptable here.
> > >
> > > +1 for renaming enum values. It's not a problem of librte_cmdline itself
> > > but a problem of its consumption on Windows; however, renaming enum values
> > > doesn't break ABI and will make the librte_cmdline API "namespaced".
> > >
[snip]
> >
> > test_pmd redefines BOOLEAN and PATTERN in the index enum; I'm not sure how many more conflicts we will face because of this huge include.
> >
> > Also, DPDK applications will inherit it unknowingly; I'm not sure if this is common for Windows libraries.
>
> I never hit these particular conflicts, but you're right that there will be
> more, e.g. I remember particularly nasty clashes in failsafe PMD, unrelated
> to cmdline token names.
Still, I'd go for renaming, with or without additional steps to hide
<windows.h>. Although I wouldn't include it in this series: renaming will
touch numerous places and require many more reviewers.
> We could take the same approach as with networking headers: copy required
> declarations instead of including them from SDK. Here's a list of what
> pthread.h uses:
While this will resolve the issue for DPDK code, applications using DPDK
headers can easily hit it by including <windows.h> on their own. On the other
hand, they can always split translation units and I don't know how practical
it is to use system and DPDK networking headers at the same time.
--
Dmitry Kozlyuk
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period
2020-07-07 17:51 12% ` [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period Ray Kinsella
@ 2020-07-07 18:44 0% ` Honnappa Nagarahalli
0 siblings, 0 replies; 200+ results
From: Honnappa Nagarahalli @ 2020-07-07 18:44 UTC (permalink / raw)
To: Ray Kinsella, dev
Cc: fady, thomas, Neil Horman, John McNamara, Marko Kovacevic,
Harini Ramakrishnan, Omar Cardona, Pallavi Kadam, Ranjit Menon,
Honnappa Nagarahalli, nd, nd
<snip>
> Subject: [PATCH v2 2/2] doc: clarify alias to experimental period
>
> Clarify retention period for aliases to experimental.
>
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> doc/guides/contributing/abi_versioning.rst | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/doc/guides/contributing/abi_versioning.rst
> b/doc/guides/contributing/abi_versioning.rst
> index 31a9205..b1d09c7 100644
> --- a/doc/guides/contributing/abi_versioning.rst
> +++ b/doc/guides/contributing/abi_versioning.rst
> @@ -158,7 +158,7 @@ The macros exported are:
> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table
> entry
> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function
> ``be``.
> The macro is used when a symbol matures to become part of the stable ABI,
> to
> - provide an alias to experimental for some time.
> + provide an alias to experimental until the next major ABI version.
>
> .. _example_abi_macro_usage:
>
> @@ -428,8 +428,9 @@ _____________________________
>
> In situations in which an ``experimental`` symbol has been stable for some
> time, and it becomes a candidate for promotion to the stable ABI. At this
> time, when -promoting the symbol, maintainer may choose to provide an
> alias to the
> +promoting the symbol, the maintainer may choose to provide an alias to
> +the
> ``experimental`` symbol version, so as not to break consuming applications.
> +This alias is then dropped in the next major ABI version.
>
> The process to provide an alias to ``experimental`` is similar to that,
> of :ref:`symbol versioning <example_abi_macro_usage>` described above.
> --
> 2.7.4
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 17:51 24% ` [dpdk-dev] [PATCH v2 1/2] doc: reword abi policy for windows Ray Kinsella
@ 2020-07-07 17:51 12% ` Ray Kinsella
2020-07-07 18:44 0% ` Honnappa Nagarahalli
2020-07-08 10:32 7% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Thomas Monjalon
2 siblings, 1 reply; 200+ results
From: Ray Kinsella @ 2020-07-07 17:51 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
Clarify retention period for aliases to experimental.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
doc/guides/contributing/abi_versioning.rst | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 31a9205..b1d09c7 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -158,7 +158,7 @@ The macros exported are:
* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
The macro is used when a symbol matures to become part of the stable ABI, to
- provide an alias to experimental for some time.
+ provide an alias to experimental until the next major ABI version.
.. _example_abi_macro_usage:
@@ -428,8 +428,9 @@ _____________________________
In situations in which an ``experimental`` symbol has been stable for some time,
and it becomes a candidate for promotion to the stable ABI. At this time, when
-promoting the symbol, maintainer may choose to provide an alias to the
+promoting the symbol, the maintainer may choose to provide an alias to the
``experimental`` symbol version, so as not to break consuming applications.
+This alias is then dropped in the next major ABI version.
The process to provide an alias to ``experimental`` is similar to that, of
:ref:`symbol versioning <example_abi_macro_usage>` described above.
--
2.7.4
^ permalink raw reply [relevance 12%]
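[Editor's note] The `VERSION_SYMBOL_EXPERIMENTAL(b, e)` usage the patch above documents can be sketched as follows. This is an illustrative sketch only, not runnable standalone (it needs the DPDK build system and a linker version script); the function name `rte_foo_init` and the `_v21` suffix are hypothetical, following the macro description in the patch ("binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``"):

```c
/* foo.c: the symbol has matured; _v21 is its first stable version. */
#include <rte_function_versioning.h>

int rte_foo_init_v21(void);
int
rte_foo_init_v21(void)
{
	/* ... implementation ... */
	return 0;
}

/* Bind rte_foo_init@DPDK_21 as the default version of the symbol... */
BIND_DEFAULT_SYMBOL(rte_foo_init, _v21, 21);

/* ...and additionally bind rte_foo_init@EXPERIMENTAL to the same
 * internal function, so applications built against the experimental
 * symbol keep working. Per this patch, the alias is then dropped in
 * the next major ABI version. */
VERSION_SYMBOL_EXPERIMENTAL(rte_foo_init, _v21);
```

The corresponding version map would list the symbol under both the stable (e.g. DPDK_21) and EXPERIMENTAL nodes until the alias is dropped.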
* [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes
2020-07-07 14:45 8% [dpdk-dev] [PATCH v1 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 14:45 24% ` [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows Ray Kinsella
2020-07-07 14:45 12% ` [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period Ray Kinsella
@ 2020-07-07 17:50 8% ` Ray Kinsella
2020-07-07 17:51 24% ` [dpdk-dev] [PATCH v2 1/2] doc: reword abi policy for windows Ray Kinsella
` (2 more replies)
2 siblings, 3 replies; 200+ results
From: Ray Kinsella @ 2020-07-07 17:50 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
A few documentation fixes, clarifying the Windows ABI policy and aliases to
experimental mode.
Ray Kinsella (2):
doc: reword abi policy for windows
doc: clarify alias to experimental period
v2:
Addressed feedback from Thomas Monjalon.
doc/guides/contributing/abi_policy.rst | 4 +++-
doc/guides/contributing/abi_versioning.rst | 5 +++--
doc/guides/windows_gsg/intro.rst | 6 +++---
3 files changed, 9 insertions(+), 6 deletions(-)
--
2.7.4
^ permalink raw reply [relevance 8%]
* [dpdk-dev] [PATCH v2 1/2] doc: reword abi policy for windows
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
@ 2020-07-07 17:51 24% ` Ray Kinsella
2020-07-07 17:51 12% ` [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period Ray Kinsella
2020-07-08 10:32 7% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Thomas Monjalon
2 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2020-07-07 17:51 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
Minor changes to the abi policy for windows.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
doc/guides/contributing/abi_policy.rst | 4 +++-
doc/guides/windows_gsg/intro.rst | 6 +++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index d0affa9..4452362 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -40,7 +40,9 @@ General Guidelines
maintaining ABI stability through one year of DPDK releases starting from
DPDK 19.11. This policy will be reviewed in 2020, with intention of
lengthening the stability period. Additional implementation detail can be
- found in the :ref:`release notes <20_02_abi_changes>`.
+ found in the :ref:`release notes <20_02_abi_changes>`. Please note that this
+ policy does not currently apply to the :doc:`Windows build
+ <../windows_gsg/intro>`.
What is an ABI?
~~~~~~~~~~~~~~~
diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
index 58c6246..4ac7f97 100644
--- a/doc/guides/windows_gsg/intro.rst
+++ b/doc/guides/windows_gsg/intro.rst
@@ -19,6 +19,6 @@ compile. Support is being added in pieces so as to limit the overall scope
of any individual patch series. The goal is to be able to run any DPDK
application natively on Windows.
-The :doc:`../contributing/abi_policy` cannot be respected for Windows.
-Minor ABI versions may be incompatible
-because function versioning is not supported on Windows.
+The :doc:`../contributing/abi_policy` does not apply to the Windows build,
+as function versioning is not supported on Windows,
+therefore minor ABI versions may be incompatible.
--
2.7.4
^ permalink raw reply [relevance 24%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 17:01 4% ` Kinsella, Ray
@ 2020-07-07 17:08 0% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-07 17:08 UTC (permalink / raw)
To: Kinsella, Ray
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
07/07/2020 19:01, Kinsella, Ray:
> On 07/07/2020 17:57, Thomas Monjalon wrote:
> > 07/07/2020 18:37, Kinsella, Ray:
> >> On 07/07/2020 17:36, Thomas Monjalon wrote:
> >>> 07/07/2020 18:35, Kinsella, Ray:
> >>>> On 07/07/2020 16:26, Thomas Monjalon wrote:
> >>>>> 07/07/2020 16:45, Ray Kinsella:
> >>>>>> Clarify retention period for aliases to experimental.
> >>>>>>
> >>>>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> >>>>>> ---
> >>>>>> --- a/doc/guides/contributing/abi_versioning.rst
> >>>>>> +++ b/doc/guides/contributing/abi_versioning.rst
> >>>>>> @@ -158,7 +158,7 @@ The macros exported are:
> >>>>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> >>>>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> >>>>>> The macro is used when a symbol matures to become part of the stable ABI, to
> >>>>>> - provide an alias to experimental for some time.
> >>>>>> + provide an alias to experimental until the next major ABI version.
> >>>>>
> >>>>> Why limiting the period for experimental status?
> >>>>> Some API want to remain experimental longer.
> >>>>>
> >>>>> [...]
> >>>>>> +alias will then typically be dropped in the next major ABI version.
> >>>>>
> >>>>> I don't see the need for the time estimation.
> >>>>
> >>>> Will reword to ...
> >>>>
> >>>> "This alias will then be dropped in the next major ABI version."
> >>>
> >>> It is not addressing my first comment. Please see above.
> >>
> >> Thank you, I don't necessarily agree with the first comment :-)
> >
> > You don't have to agree. But in this case we must discuss :-)
> >
> >> We need to say when the alias should be dropped no?
> >
> > I don't think so.
> > Until now, it is let to the appreciation of the maintainer.
> > If we want to change the rule, especially for experimental period,
> > it must be said clearly and debated.
>
> It doesn't make _any_ sense to maintain an alias after the new ABI.
>
> The alias is there to maintain ABI compatibility,
> there is no reason to maintain compatibility in the new ABI - so it should be dropped
Yes I was wrong, sorry.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 17:00 0% ` Thomas Monjalon
@ 2020-07-07 17:01 0% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-07 17:01 UTC (permalink / raw)
To: Thomas Monjalon, Honnappa Nagarahalli
Cc: dev, fady, Neil Horman, John McNamara, Marko Kovacevic,
Harini Ramakrishnan, Omar Cardona, Pallavi Kadam, Ranjit Menon,
david.marchand, bruce.richardson, nd
On 07/07/2020 18:00, Thomas Monjalon wrote:
> 07/07/2020 18:55, Honnappa Nagarahalli:
>>> On 07/07/2020 17:36, Thomas Monjalon wrote:
>>>> 07/07/2020 18:35, Kinsella, Ray:
>>>>> On 07/07/2020 16:26, Thomas Monjalon wrote:
>>>>>> 07/07/2020 16:45, Ray Kinsella:
>>>>>>> Clarify retention period for aliases to experimental.
>>>>>>>
>>>>>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>>> ---
>>>>>>> --- a/doc/guides/contributing/abi_versioning.rst
>>>>>>> +++ b/doc/guides/contributing/abi_versioning.rst
>>>>>>> @@ -158,7 +158,7 @@ The macros exported are:
>>>>>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version
>>> table entry
>>>>>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal
>>> function ``be``.
>>>>>>> The macro is used when a symbol matures to become part of the
>>>>>>> stable ABI, to
>>>>>>> - provide an alias to experimental for some time.
>>>>>>> + provide an alias to experimental until the next major ABI version.
>>>>>>
>>>>>> Why limiting the period for experimental status?
>>>>>> Some API want to remain experimental longer.
>>
>> This is not limiting the period.
>> This is about how long VERSION_SYMBOL_EXPERIMENTAL should be in place
>> for a symbol after the experimental tag is removed for the symbol.
>
> Oh wait, I was wrong. It is only about the alias which is set
> AFTER the experimental period.
>
>>>>>> [...]
>>>>>>> In situations in which an ``experimental`` symbol has been stable
>>>>>>> for some time, and it becomes a candidate for promotion to the
>>>>>>> stable ABI. At this time, when -promoting the symbol, maintainer
>>>>>>> may choose to provide an alias to the -``experimental`` symbol version,
>>> so as not to break consuming applications.
>>>>>>> +promoting the symbol, the maintainer may choose to provide an
>>>>>>> +alias to the ``experimental`` symbol version, so as not to break
>>>>>>> +consuming applications. This
>>>>>>
>>>>>> Please start a sentence on a new line.
>>>>>
>>>>> ACK
>>>>>
>>>>>>
>>>>>>> +alias will then typically be dropped in the next major ABI version.
>>>>>>
>>>>>> I don't see the need for the time estimation.
>>
>> I prefer this wording as it clarifies what should be done while creating a patch.
>
> Yes, after a second read, I am OK.
>
perfect, I will sort out the other bits.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:57 0% ` Thomas Monjalon
@ 2020-07-07 17:01 4% ` Kinsella, Ray
2020-07-07 17:08 0% ` Thomas Monjalon
0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-07 17:01 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
On 07/07/2020 17:57, Thomas Monjalon wrote:
> 07/07/2020 18:37, Kinsella, Ray:
>>
>> On 07/07/2020 17:36, Thomas Monjalon wrote:
>>> 07/07/2020 18:35, Kinsella, Ray:
>>>> On 07/07/2020 16:26, Thomas Monjalon wrote:
>>>>> 07/07/2020 16:45, Ray Kinsella:
>>>>>> Clarify retention period for aliases to experimental.
>>>>>>
>>>>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>>>>>> ---
>>>>>> --- a/doc/guides/contributing/abi_versioning.rst
>>>>>> +++ b/doc/guides/contributing/abi_versioning.rst
>>>>>> @@ -158,7 +158,7 @@ The macros exported are:
>>>>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
>>>>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
>>>>>> The macro is used when a symbol matures to become part of the stable ABI, to
>>>>>> - provide an alias to experimental for some time.
>>>>>> + provide an alias to experimental until the next major ABI version.
>>>>>
>>>>> Why limiting the period for experimental status?
>>>>> Some API want to remain experimental longer.
>>>>>
>>>>> [...]
>>>>>> +alias will then typically be dropped in the next major ABI version.
>>>>>
>>>>> I don't see the need for the time estimation.
>>>>
>>>> Will reword to ...
>>>>
>>>> "This alias will then be dropped in the next major ABI version."
>>>
>>> It is not addressing my first comment. Please see above.
>>
>> Thank you, I don't necessarily agree with the first comment :-)
>
> You don't have to agree. But in this case we must discuss :-)
>
>> We need to say when the alias should be dropped no?
>
> I don't think so.
> Until now, it is let to the appreciation of the maintainer.
> If we want to change the rule, especially for experimental period,
> it must be said clearly and debated.
It doesn't make _any_ sense to maintain an alias after the new ABI.
The alias is there to maintain ABI compatibility,
there is no reason to maintain compatibility in the new ABI - so it should be dropped
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:55 0% ` Honnappa Nagarahalli
@ 2020-07-07 17:00 0% ` Thomas Monjalon
2020-07-07 17:01 0% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-07 17:00 UTC (permalink / raw)
To: Kinsella, Ray, Honnappa Nagarahalli
Cc: dev, fady, Neil Horman, John McNamara, Marko Kovacevic,
Harini Ramakrishnan, Omar Cardona, Pallavi Kadam, Ranjit Menon,
david.marchand, bruce.richardson, nd
07/07/2020 18:55, Honnappa Nagarahalli:
> > On 07/07/2020 17:36, Thomas Monjalon wrote:
> > > 07/07/2020 18:35, Kinsella, Ray:
> > >> On 07/07/2020 16:26, Thomas Monjalon wrote:
> > >>> 07/07/2020 16:45, Ray Kinsella:
> > >>>> Clarify retention period for aliases to experimental.
> > >>>>
> > >>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> > >>>> ---
> > >>>> --- a/doc/guides/contributing/abi_versioning.rst
> > >>>> +++ b/doc/guides/contributing/abi_versioning.rst
> > >>>> @@ -158,7 +158,7 @@ The macros exported are:
> > >>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version
> > table entry
> > >>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal
> > function ``be``.
> > >>>> The macro is used when a symbol matures to become part of the
> > >>>> stable ABI, to
> > >>>> - provide an alias to experimental for some time.
> > >>>> + provide an alias to experimental until the next major ABI version.
> > >>>
> > >>> Why limiting the period for experimental status?
> > >>> Some API want to remain experimental longer.
>
> This is not limiting the period.
> This is about how long VERSION_SYMBOL_EXPERIMENTAL should be in place
> for a symbol after the experimental tag is removed for the symbol.
Oh wait, I was wrong. It is only about the alias which is set
AFTER the experimental period.
> > >>> [...]
> > >>>> In situations in which an ``experimental`` symbol has been stable
> > >>>> for some time, and it becomes a candidate for promotion to the
> > >>>> stable ABI. At this time, when -promoting the symbol, maintainer
> > >>>> may choose to provide an alias to the -``experimental`` symbol version,
> > so as not to break consuming applications.
> > >>>> +promoting the symbol, the maintainer may choose to provide an
> > >>>> +alias to the ``experimental`` symbol version, so as not to break
> > >>>> +consuming applications. This
> > >>>
> > >>> Please start a sentence on a new line.
> > >>
> > >> ACK
> > >>
> > >>>
> > >>>> +alias will then typically be dropped in the next major ABI version.
> > >>>
> > >>> I don't see the need for the time estimation.
>
> I prefer this wording as it clarifying what should be done while creating a patch.
Yes, after a second read, I am OK.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:37 0% ` Kinsella, Ray
2020-07-07 16:55 0% ` Honnappa Nagarahalli
@ 2020-07-07 16:57 0% ` Thomas Monjalon
2020-07-07 17:01 4% ` Kinsella, Ray
1 sibling, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-07 16:57 UTC (permalink / raw)
To: Kinsella, Ray
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
07/07/2020 18:37, Kinsella, Ray:
>
> On 07/07/2020 17:36, Thomas Monjalon wrote:
> > 07/07/2020 18:35, Kinsella, Ray:
> >> On 07/07/2020 16:26, Thomas Monjalon wrote:
> >>> 07/07/2020 16:45, Ray Kinsella:
> >>>> Clarify retention period for aliases to experimental.
> >>>>
> >>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> >>>> ---
> >>>> --- a/doc/guides/contributing/abi_versioning.rst
> >>>> +++ b/doc/guides/contributing/abi_versioning.rst
> >>>> @@ -158,7 +158,7 @@ The macros exported are:
> >>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> >>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> >>>> The macro is used when a symbol matures to become part of the stable ABI, to
> >>>> - provide an alias to experimental for some time.
> >>>> + provide an alias to experimental until the next major ABI version.
> >>>
> >>> Why limiting the period for experimental status?
> >>> Some API want to remain experimental longer.
> >>>
> >>> [...]
> >>>> +alias will then typically be dropped in the next major ABI version.
> >>>
> >>> I don't see the need for the time estimation.
> >>
> >> Will reword to ...
> >>
> >> "This alias will then be dropped in the next major ABI version."
> >
> > It is not addressing my first comment. Please see above.
>
> Thank you, I don't necessarily agree with the first comment :-)
You don't have to agree. But in this case we must discuss :-)
> We need to say when the alias should be dropped no?
I don't think so.
Until now, it is let to the appreciation of the maintainer.
If we want to change the rule, especially for experimental period,
it must be said clearly and debated.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:37 0% ` Kinsella, Ray
@ 2020-07-07 16:55 0% ` Honnappa Nagarahalli
2020-07-07 17:00 0% ` Thomas Monjalon
2020-07-07 16:57 0% ` Thomas Monjalon
1 sibling, 1 reply; 200+ results
From: Honnappa Nagarahalli @ 2020-07-07 16:55 UTC (permalink / raw)
To: Kinsella, Ray, thomas
Cc: dev, fady, Neil Horman, John McNamara, Marko Kovacevic,
Harini Ramakrishnan, Omar Cardona, Pallavi Kadam, Ranjit Menon,
david.marchand, bruce.richardson, Honnappa Nagarahalli, nd, nd
<snip>
> Subject: Re: [PATCH v1 2/2] doc: clarify alias to experimental period
>
>
>
> On 07/07/2020 17:36, Thomas Monjalon wrote:
> > 07/07/2020 18:35, Kinsella, Ray:
> >> On 07/07/2020 16:26, Thomas Monjalon wrote:
> >>> 07/07/2020 16:45, Ray Kinsella:
> >>>> Clarify retention period for aliases to experimental.
> >>>>
> >>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> >>>> ---
> >>>> --- a/doc/guides/contributing/abi_versioning.rst
> >>>> +++ b/doc/guides/contributing/abi_versioning.rst
> >>>> @@ -158,7 +158,7 @@ The macros exported are:
> >>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version
> table entry
> >>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal
> function ``be``.
> >>>> The macro is used when a symbol matures to become part of the
> >>>> stable ABI, to
> >>>> - provide an alias to experimental for some time.
> >>>> + provide an alias to experimental until the next major ABI version.
> >>>
> >>> Why limiting the period for experimental status?
> >>> Some API want to remain experimental longer.
This is not limiting the period. This is about how long VERSION_SYMBOL_EXPERIMENTAL should be in place for a symbol after the experimental tag is removed for the symbol.
> >>>
> >>> [...]
> >>>> In situations in which an ``experimental`` symbol has been stable
> >>>> for some time, and it becomes a candidate for promotion to the
> >>>> stable ABI. At this time, when -promoting the symbol, maintainer
> >>>> may choose to provide an alias to the -``experimental`` symbol version,
> so as not to break consuming applications.
> >>>> +promoting the symbol, the maintainer may choose to provide an
> >>>> +alias to the ``experimental`` symbol version, so as not to break
> >>>> +consuming applications. This
> >>>
> >>> Please start a sentence on a new line.
> >>
> >> ACK
> >>
> >>>
> >>>> +alias will then typically be dropped in the next major ABI version.
> >>>
> >>> I don't see the need for the time estimation.
I prefer this wording as it clarifies what should be done while creating a patch.
> >>>
> >>>
> >>
> >> Will reword to ...
> >>
> >> "This alias will then be dropped in the next major ABI version."
> >
> > It is not addressing my first comment. Please see above.
> >
>
> Thank you, I don't necessarily agree with the first comment :-) We need to say
> when the alias should be dropped no?
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:36 0% ` Thomas Monjalon
@ 2020-07-07 16:37 0% ` Kinsella, Ray
2020-07-07 16:55 0% ` Honnappa Nagarahalli
2020-07-07 16:57 0% ` Thomas Monjalon
0 siblings, 2 replies; 200+ results
From: Kinsella, Ray @ 2020-07-07 16:37 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
On 07/07/2020 17:36, Thomas Monjalon wrote:
> 07/07/2020 18:35, Kinsella, Ray:
>> On 07/07/2020 16:26, Thomas Monjalon wrote:
>>> 07/07/2020 16:45, Ray Kinsella:
>>>> Clarify retention period for aliases to experimental.
>>>>
>>>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>>>> ---
>>>> --- a/doc/guides/contributing/abi_versioning.rst
>>>> +++ b/doc/guides/contributing/abi_versioning.rst
>>>> @@ -158,7 +158,7 @@ The macros exported are:
>>>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
>>>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
>>>> The macro is used when a symbol matures to become part of the stable ABI, to
>>>> - provide an alias to experimental for some time.
>>>> + provide an alias to experimental until the next major ABI version.
>>>
>>> Why limiting the period for experimental status?
>>> Some API want to remain experimental longer.
>>>
>>> [...]
>>>> In situations in which an ``experimental`` symbol has been stable for some time,
>>>> and it becomes a candidate for promotion to the stable ABI. At this time, when
>>>> -promoting the symbol, maintainer may choose to provide an alias to the
>>>> -``experimental`` symbol version, so as not to break consuming applications.
>>>> +promoting the symbol, the maintainer may choose to provide an alias to the
>>>> +``experimental`` symbol version, so as not to break consuming applications. This
>>>
>>> Please start a sentence on a new line.
>>
>> ACK
>>
>>>
>>>> +alias will then typically be dropped in the next major ABI version.
>>>
>>> I don't see the need for the time estimation.
>>>
>>>
>>
>> Will reword to ...
>>
>> "This alias will then be dropped in the next major ABI version."
>
> It is not addressing my first comment. Please see above.
>
Thank you, I don't necessarily agree with the first comment :-)
We need to say when the alias should be dropped no?
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 16:35 3% ` Kinsella, Ray
@ 2020-07-07 16:36 0% ` Thomas Monjalon
2020-07-07 16:37 0% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-07 16:36 UTC (permalink / raw)
To: Kinsella, Ray
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
07/07/2020 18:35, Kinsella, Ray:
> On 07/07/2020 16:26, Thomas Monjalon wrote:
> > 07/07/2020 16:45, Ray Kinsella:
> >> Clarify retention period for aliases to experimental.
> >>
> >> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> >> ---
> >> --- a/doc/guides/contributing/abi_versioning.rst
> >> +++ b/doc/guides/contributing/abi_versioning.rst
> >> @@ -158,7 +158,7 @@ The macros exported are:
> >> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> >> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> >> The macro is used when a symbol matures to become part of the stable ABI, to
> >> - provide an alias to experimental for some time.
> >> + provide an alias to experimental until the next major ABI version.
> >
> > Why limiting the period for experimental status?
> > Some API want to remain experimental longer.
> >
> > [...]
> >> In situations in which an ``experimental`` symbol has been stable for some time,
> >> and it becomes a candidate for promotion to the stable ABI. At this time, when
> >> -promoting the symbol, maintainer may choose to provide an alias to the
> >> -``experimental`` symbol version, so as not to break consuming applications.
> >> +promoting the symbol, the maintainer may choose to provide an alias to the
> >> +``experimental`` symbol version, so as not to break consuming applications. This
> >
> > Please start a sentence on a new line.
>
> ACK
>
> >
> >> +alias will then typically be dropped in the next major ABI version.
> >
> > I don't see the need for the time estimation.
> >
> >
>
> Will reword to ...
>
> "This alias will then be dropped in the next major ABI version."
It is not addressing my first comment. Please see above.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 15:26 0% ` Thomas Monjalon
@ 2020-07-07 16:35 3% ` Kinsella, Ray
2020-07-07 16:36 0% ` Thomas Monjalon
0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-07 16:35 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, david.marchand, bruce.richardson
On 07/07/2020 16:26, Thomas Monjalon wrote:
> 07/07/2020 16:45, Ray Kinsella:
>> Clarify retention period for aliases to experimental.
>>
>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>> ---
>> --- a/doc/guides/contributing/abi_versioning.rst
>> +++ b/doc/guides/contributing/abi_versioning.rst
>> @@ -158,7 +158,7 @@ The macros exported are:
>> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
>> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
>> The macro is used when a symbol matures to become part of the stable ABI, to
>> - provide an alias to experimental for some time.
>> + provide an alias to experimental until the next major ABI version.
>
> Why limiting the period for experimental status?
> Some API want to remain experimental longer.
>
> [...]
>> In situations in which an ``experimental`` symbol has been stable for some time,
>> and it becomes a candidate for promotion to the stable ABI. At this time, when
>> -promoting the symbol, maintainer may choose to provide an alias to the
>> -``experimental`` symbol version, so as not to break consuming applications.
>> +promoting the symbol, the maintainer may choose to provide an alias to the
>> +``experimental`` symbol version, so as not to break consuming applications. This
>
> Please start a sentence on a new line.
ACK
>
>> +alias will then typically be dropped in the next major ABI version.
>
> I don't see the need for the time estimation.
>
>
Will reword to ...
"This alias will then be dropped in the next major ABI version."
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows
2020-07-07 15:23 7% ` Thomas Monjalon
@ 2020-07-07 16:33 4% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-07 16:33 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, fady, Honnappa.Nagarahalli, Neil Horman, John McNamara,
Marko Kovacevic, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon, talshn
On 07/07/2020 16:23, Thomas Monjalon wrote:
> 07/07/2020 16:45, Ray Kinsella:
>> Minor changes to the abi policy for windows.
>
> It looks like you were not fast enough to comment
> in the original thread :)
> Please add a Fixes line to reference the original commit.
>
>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>> ---
>> doc/guides/contributing/abi_policy.rst | 4 +++-
>> doc/guides/windows_gsg/intro.rst | 6 +++---
>> 2 files changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
>> index d0affa9..8e70b45 100644
>> --- a/doc/guides/contributing/abi_policy.rst
>> +++ b/doc/guides/contributing/abi_policy.rst
>> @@ -40,7 +40,9 @@ General Guidelines
>> maintaining ABI stability through one year of DPDK releases starting from
>> DPDK 19.11. This policy will be reviewed in 2020, with intention of
>> lengthening the stability period. Additional implementation detail can be
>> - found in the :ref:`release notes <20_02_abi_changes>`.
>> + found in the :ref:`release notes <20_02_abi_changes>`. Please note that this
>> + policy does not currently apply to the :doc:`Window build
>
> Window -> Windows
ACK
>
>> + <../windows_gsg/intro>`.
>>
>> What is an ABI?
>> ~~~~~~~~~~~~~~~
>> diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
>> index 58c6246..707afd3 100644
>> --- a/doc/guides/windows_gsg/intro.rst
>> +++ b/doc/guides/windows_gsg/intro.rst
>> @@ -19,6 +19,6 @@ compile. Support is being added in pieces so as to limit the overall scope
>> of any individual patch series. The goal is to be able to run any DPDK
>> application natively on Windows.
>>
>> -The :doc:`../contributing/abi_policy` cannot be respected for Windows.
>> -Minor ABI versions may be incompatible
>> -because function versioning is not supported on Windows.
>> +The :doc:`../contributing/abi_policy` does not apply to the Windows build, as
>> +function versioning is not supported on Windows, therefore minor ABI versions
>> +may be incompatible.
>
> Please I really prefer we split lines logically rather than filling the space:
> The :doc:`../contributing/abi_policy` does not apply to the Windows build,
> as function versioning is not supported on Windows,
> therefore minor ABI versions may be incompatible.
>
That is a single line though :-)
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
2020-07-07 14:29 0% ` Jerin Jacob
@ 2020-07-07 15:56 0% ` Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-07 15:56 UTC (permalink / raw)
To: Jerin Jacob
Cc: thomas, Erik Gabriel Carrillo, dpdk-dev, jerinj,
Honnappa Nagarahalli, David Christensen, Ruifeng Wang,
Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
dodji
> -----Original Message-----
> From: Jerin Jacob <jerinjacobk@gmail.com>
> Sent: Tuesday, July 7, 2020 10:30 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: thomas@monjalon.net; Erik Gabriel Carrillo <erik.g.carrillo@intel.com>;
> dpdk-dev <dev@dpdk.org>; jerinj@marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; David Christensen
> <drc@linux.vnet.ibm.com>; Ruifeng Wang <Ruifeng.Wang@arm.com>;
> Dharmik Thakkar <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>; David
> Marchand <david.marchand@redhat.com>; Ray Kinsella <mdr@ashroe.eu>;
> Neil Horman <nhorman@tuxdriver.com>; dodji@redhat.com
> Subject: Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with
> C11 atomics
>
> On Tue, Jul 7, 2020 at 4:45 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > The impl_opaque field is shared between the timer arm and cancel
> > operations. Meanwhile, the state flag acts as a guard variable to
> > make sure the update of impl_opaque is synchronized. The original
> > code uses rte_smp barriers to achieve that. This patch uses C11
> > atomics with an explicit one-way memory barrier instead of full
> > barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
> >
> > Since compilers can generate the same instructions for volatile and
> > non-volatile variables with C11 __atomic built-ins, the volatile
> > keyword is retained in front of the state enum to avoid an ABI break.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
>
>
> Could you fix the following:
>
> WARNING:TYPO_SPELLING: 'opague' may be misspelled - perhaps 'opaque'?
> #184: FILE: lib/librte_eventdev/rte_event_timer_adapter.c:1161:
> + * specific opague data under the correct state.
Done.
Thanks,
Phil
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics
@ 2020-07-07 15:54 4% ` Phil Yang
2020-07-08 13:30 4% ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
1 sibling, 0 replies; 200+ results
From: Phil Yang @ 2020-07-07 15:54 UTC (permalink / raw)
To: jerinj, dev
Cc: thomas, erik.g.carrillo, Honnappa.Nagarahalli, drc, Ruifeng.Wang,
Dharmik.Thakkar, nd, david.marchand, mdr, nhorman, dodji, stable
The impl_opaque field is shared between the timer arm and cancel
operations. Meanwhile, the state flag acts as a guard variable to
make sure the update of impl_opaque is synchronized. The original
code uses rte_smp barriers to achieve that. This patch uses C11
atomics with an explicit one-way memory barrier instead of full
barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
Since compilers can generate the same instructions for volatile and
non-volatile variables with the C11 __atomic built-ins, the volatile
keyword is kept in front of the state enum to avoid an ABI break.
Cc: stable@dpdk.org
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v4:
1. Fix typo.
2. Cc to stable release. (Honnappa)
v3:
Fix ABI issue: revert to 'volatile enum rte_event_timer_state type state'.
v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQUIRE/RELEASE ordering on state access.
lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index aa01b4d..4c5e49e 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
sw->expired_timers[sw->n_expired_timers++] = tim;
sw->stats.evtim_exp_count++;
- evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+ __atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+ __ATOMIC_RELEASE);
}
if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
int n_lcores;
/* Timer list for this lcore is not in use. */
uint16_t exp_state = 0;
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
}
for (i = 0; i < nb_evtims; i++) {
- /* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_ARMED) {
rte_errno = EALREADY;
break;
- } else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
- evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+ } else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+ n_state == RTE_EVENT_TIMER_CANCELED)) {
rte_errno = EINVAL;
break;
}
ret = check_timeout(evtims[i], adapter);
if (unlikely(ret == -1)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOLATE,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
} else if (unlikely(ret == -2)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOEARLY,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
if (unlikely(check_destination_event_queue(evtims[i],
adapter) < 0)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
SINGLE, lcore_id, NULL, evtims[i]);
if (ret < 0) {
/* tim was in RUNNING or CONFIG state */
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELEASE);
break;
}
- rte_smp_wmb();
EVTIM_LOG_DBG("armed an event timer");
- evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+ /* RELEASE ordering guarantees the adapter specific value
+ * changes observed before the update of state.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+ __ATOMIC_RELEASE);
}
if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
struct rte_timer *timp;
uint64_t opaque;
struct swtim *sw = swtim_pmd_priv(adapter);
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
for (i = 0; i < nb_evtims; i++) {
/* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+ /* ACQUIRE ordering guarantees the access of implementation
+ * specific opaque data under the correct state.
+ */
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_CANCELED) {
rte_errno = EALREADY;
break;
- } else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+ } else if (n_state != RTE_EVENT_TIMER_ARMED) {
rte_errno = EINVAL;
break;
}
- rte_smp_rmb();
-
opaque = evtims[i]->impl_opaque[0];
timp = (struct rte_timer *)(uintptr_t)opaque;
RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
rte_mempool_put(sw->tim_pool, (void **)timp);
- evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
- rte_smp_wmb();
+ /* The RELEASE ordering here pairs with atomic ordering
+ * to make sure the state update data observed between
+ * threads.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+ __ATOMIC_RELEASE);
}
return i;
--
2.7.4
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 14:45 12% ` [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period Ray Kinsella
@ 2020-07-07 15:26 0% ` Thomas Monjalon
2020-07-07 16:35 3% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-07 15:26 UTC (permalink / raw)
To: Ray Kinsella
Cc: dev, fady, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon, david.marchand,
nhorman, bruce.richardson
07/07/2020 16:45, Ray Kinsella:
> Clarify retention period for aliases to experimental.
>
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> ---
> --- a/doc/guides/contributing/abi_versioning.rst
> +++ b/doc/guides/contributing/abi_versioning.rst
> @@ -158,7 +158,7 @@ The macros exported are:
> * ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
> binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
> The macro is used when a symbol matures to become part of the stable ABI, to
> - provide an alias to experimental for some time.
> + provide an alias to experimental until the next major ABI version.
Why limiting the period for experimental status?
Some API want to remain experimental longer.
[...]
> In situations in which an ``experimental`` symbol has been stable for some time,
> and it becomes a candidate for promotion to the stable ABI. At this time, when
> -promoting the symbol, maintainer may choose to provide an alias to the
> -``experimental`` symbol version, so as not to break consuming applications.
> +promoting the symbol, the maintainer may choose to provide an alias to the
> +``experimental`` symbol version, so as not to break consuming applications. This
Please start a sentence on a new line.
> +alias will then typically be dropped in the next major ABI version.
I don't see the need for the time estimation.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows
2020-07-07 14:45 24% ` [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows Ray Kinsella
@ 2020-07-07 15:23 7% ` Thomas Monjalon
2020-07-07 16:33 4% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-07 15:23 UTC (permalink / raw)
To: Ray Kinsella
Cc: dev, fady, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon, talshn
07/07/2020 16:45, Ray Kinsella:
> Minor changes to the abi policy for windows.
It looks like you were not fast enough to comment
in the original thread :)
Please add a Fixes line to reference the original commit.
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> ---
> doc/guides/contributing/abi_policy.rst | 4 +++-
> doc/guides/windows_gsg/intro.rst | 6 +++---
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
> index d0affa9..8e70b45 100644
> --- a/doc/guides/contributing/abi_policy.rst
> +++ b/doc/guides/contributing/abi_policy.rst
> @@ -40,7 +40,9 @@ General Guidelines
> maintaining ABI stability through one year of DPDK releases starting from
> DPDK 19.11. This policy will be reviewed in 2020, with intention of
> lengthening the stability period. Additional implementation detail can be
> - found in the :ref:`release notes <20_02_abi_changes>`.
> + found in the :ref:`release notes <20_02_abi_changes>`. Please note that this
> + policy does not currently apply to the :doc:`Window build
Window -> Windows
> + <../windows_gsg/intro>`.
>
> What is an ABI?
> ~~~~~~~~~~~~~~~
> diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
> index 58c6246..707afd3 100644
> --- a/doc/guides/windows_gsg/intro.rst
> +++ b/doc/guides/windows_gsg/intro.rst
> @@ -19,6 +19,6 @@ compile. Support is being added in pieces so as to limit the overall scope
> of any individual patch series. The goal is to be able to run any DPDK
> application natively on Windows.
>
> -The :doc:`../contributing/abi_policy` cannot be respected for Windows.
> -Minor ABI versions may be incompatible
> -because function versioning is not supported on Windows.
> +The :doc:`../contributing/abi_policy` does not apply to the Windows build, as
> +function versioning is not supported on Windows, therefore minor ABI versions
> +may be incompatible.
Please I really prefer we split lines logically rather than filling the space:
The :doc:`../contributing/abi_policy` does not apply to the Windows build,
as function versioning is not supported on Windows,
therefore minor ABI versions may be incompatible.
^ permalink raw reply [relevance 7%]
* Re: [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-07 14:57 2% ` [dpdk-dev] [PATCH v4 " Viacheslav Ovsiienko
@ 2020-07-07 15:23 0% ` Olivier Matz
2020-07-08 14:16 0% ` [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling Morten Brørup
1 sibling, 0 replies; 200+ results
From: Olivier Matz @ 2020-07-07 15:23 UTC (permalink / raw)
To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, bernard.iremonger, thomas
On Tue, Jul 07, 2020 at 02:57:11PM +0000, Viacheslav Ovsiienko wrote:
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> the packets at the very precisely specified moment of time provides
> the opportunity to support the connections with Time Division
> Multiplexing using the contemporary general purpose NIC without involving
> an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> interface is one of the promising features for potentially usage of the
> precise time management for the egress packets.
>
> The main objective of this RFC is to specify the way how applications
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well.
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. If the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.
>
> There is no any packet reordering according timestamps is supposed,
> neither within packet burst, nor between packets, it is an entirely
> application responsibility to generate packets and its timestamps
> in desired order. The timestamps can be put only in the first packet
> in the burst providing the entire burst scheduling.
>
> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be introduced
> at the moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> current device clock value and provide the reference for the timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> v1->v4:
> - dedicated dynamic Tx timestamp flag instead of shared with Rx
> - Doxygen-style comment
> - comments update
>
> ---
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
> 3 files changed, 36 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */
> #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
> #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**< Device supports Rx queue setup after device started*/
> #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 96c3631..7e9f7d2 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
> #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>
> +/**
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices allow4
allow4 -> allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic Tx timestamp flag tells whether the field contains
> + * actual timestamp value. For the packets being sent this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> + * introduced and the shared dynamic timestamp field will be used
> + * to handle the timestamps on receiving datapath as well.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
> +
> +/**
> + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
> + * packet being sent it tries to synchronize the time of packet appearing
> + * on the wire with the specified packet timestamp. If the specified one
> + * is in the past it should be ignored, if one is in the distant future
> + * it should be capped with some reasonable value (in range of seconds).
> + *
> + * There is no any packet reordering according to timestamps is supposed,
> + * neither for packet within the burst, nor for the whole bursts, it is
> + * an entirely application responsibility to generate packets and its
> + * timestamps in desired order. The timestamps might be put only in
> + * the first packet in the burst providing the entire burst scheduling.
> + */
> +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
> +
Acked-by: Olivier Matz <olivier.matz@6wind.com>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v7 0/3] RCU integration with LPM library
2020-07-07 14:40 3% ` [dpdk-dev] [PATCH v6 0/3] RCU integration with LPM library Ruifeng Wang
@ 2020-07-07 15:15 3% ` Ruifeng Wang
2020-07-09 8:02 4% ` [dpdk-dev] [PATCH v8 0/3] RCU integration with LPM library Ruifeng Wang
` (2 subsequent siblings)
5 siblings, 1 reply; 200+ results
From: Ruifeng Wang @ 2020-07-07 15:15 UTC (permalink / raw)
Cc: dev, mdr, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
This patchset integrates RCU QSBR support with the LPM library.
The resource reclamation implementation was split from the original
series and is already part of the RCU library. The series is reworked
to base the LPM integration on the RCU reclamation APIs.
A new API, rte_lpm_rcu_qsbr_add, is introduced for applications to
register an RCU variable that the LPM library will use. This gives
the user a handle to enable the RCU support integrated in the LPM
library.
Functional tests and performance tests are added to cover the
integration with RCU.
---
v7:
Fixed typos in document.
v6:
Remove ALLOW_EXPERIMENTAL_API from rte_lpm.c.
v5:
No default value for reclaim_thd. This allows reclamation triggering with every call.
Pass LPM pointer instead of tbl8 as argument of reclaim callback free function.
Updated group_idx check at tbl8 allocation.
Use enums instead of defines for different reclamation modes.
RCU QSBR integrated path is inside ALLOW_EXPERIMENTAL_API to avoid ABI change.
v4:
Allow user to configure defer queue: size, reclaim threshold, max entries.
Return the defer queue handle so the user can manually trigger reclamation.
Add blocking mode support. Defer queue will not be created.
Honnappa Nagarahalli (1):
test/lpm: add RCU integration performance tests
Ruifeng Wang (2):
lib/lpm: integrate RCU QSBR
test/lpm: add LPM RCU integration functional tests
app/test/test_lpm.c | 291 ++++++++++++++++-
app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++-
doc/guides/prog_guide/lpm_lib.rst | 32 ++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 120 ++++++-
lib/librte_lpm/rte_lpm.h | 59 ++++
lib/librte_lpm/rte_lpm_version.map | 6 +
8 files changed, 987 insertions(+), 16 deletions(-)
--
2.17.1
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Tx scheduling
` (2 preceding siblings ...)
2020-07-07 13:08 2% ` [dpdk-dev] [PATCH v3 " Viacheslav Ovsiienko
@ 2020-07-07 14:57 2% ` Viacheslav Ovsiienko
2020-07-07 15:23 0% ` Olivier Matz
2020-07-08 14:16 0% ` [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling Morten Brørup
2020-07-08 15:47 2% ` [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling Viacheslav Ovsiienko
` (2 subsequent siblings)
6 siblings, 2 replies; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-07 14:57 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas
Some networks require precise traffic timing management. The ability
to send (and, generally speaking, receive) packets at a precisely
specified moment in time makes it possible to support Time Division
Multiplexing connections with a contemporary general-purpose NIC,
without involving auxiliary hardware. For example, support for the
O-RAN Fronthaul interface is one promising use of precise time
management for egress packets.
The main objective of this RFC is to specify how applications can
provide the moment in time at which packet transmission must start,
and to give a preliminary description of how the mlx5 PMD supports
this feature.
A new dynamic timestamp field is proposed. It provides some timing
information; the units and time reference (initial phase) are not
explicitly defined but are always kept the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns
the current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For packets being sent,
this value can be used by the PMD to schedule packet sending.
After the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
deprecated and obsoleted, this dynamic flag and field will be used to
manage timestamps on the receiving datapath as well.
When the PMD sees "rte_dynfield_timestamp" set on a packet being sent,
it tries to synchronize the time the packet appears on the wire with
the specified packet timestamp. If the specified timestamp is in the
past it should be ignored; if it is in the distant future it should be
capped at some reasonable value (in the range of seconds). These
specific cases ("too late" and "distant future") can optionally be
reported via device xstats to assist applications in detecting
time-related problems.
No packet reordering according to timestamps is performed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first
packet of the burst, scheduling the entire burst.
The PMD reports the ability to synchronize packet sending on timestamp
with a new offload flag:
This is a palliative and is going to be replaced with a new eth_dev
API for reporting/managing the supported dynamic flags and their
related features. This API would break ABI compatibility and can't be
introduced at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
The new testpmd command is proposed to configure sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, bursts of packets will be transmitted with specific
delays between the packets within a burst and a specific delay between
the bursts. rte_eth_read_clock() is supposed to be used to get the
current device clock value and provide the reference for the
timestamps.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
v1->v4:
- dedicated dynamic Tx timestamp flag instead of shared with Rx
- Doxygen-style comment
- comments update
---
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 31 +++++++++++++++++++++++++++++++
3 files changed, 36 insertions(+)
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6f..02157d5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242b..6f6454c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..7e9f7d2 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,35 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information, the
+ * units and time references (initial phase) are not explicitly defined
+ * but are maintained always the same for a given port. Some devices allow4
+ * to query rte_eth_read_clock() that will return the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field contains
+ * actual timestamp value. For the packets being sent this value can be
+ * used by PMD to schedule packet sending.
+ *
+ * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
+ * and obsoleting, the dedicated Rx timestamp flag is supposed to be
+ * introduced and the shared dynamic timestamp field will be used
+ * to handle the timestamps on receiving datapath as well.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+
+/**
+ * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
+ * packet being sent it tries to synchronize the time of packet appearing
+ * on the wire with the specified packet timestamp. If the specified one
+ * is in the past it should be ignored, if one is in the distant future
+ * it should be capped with some reasonable value (in range of seconds).
+ *
+ * There is no any packet reordering according to timestamps is supposed,
+ * neither for packet within the burst, nor for the whole bursts, it is
+ * an entirely application responsibility to generate packets and its
+ * timestamps in desired order. The timestamps might be put only in
+ * the first packet in the burst providing the entire burst scheduling.
+ */
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
* [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period
2020-07-07 14:45 8% [dpdk-dev] [PATCH v1 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 14:45 24% ` [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows Ray Kinsella
@ 2020-07-07 14:45 12% ` Ray Kinsella
2020-07-07 15:26 0% ` Thomas Monjalon
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
2 siblings, 1 reply; 200+ results
From: Ray Kinsella @ 2020-07-07 14:45 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
Clarify retention period for aliases to experimental.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
doc/guides/contributing/abi_versioning.rst | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/doc/guides/contributing/abi_versioning.rst b/doc/guides/contributing/abi_versioning.rst
index 31a9205..e00dfa8 100644
--- a/doc/guides/contributing/abi_versioning.rst
+++ b/doc/guides/contributing/abi_versioning.rst
@@ -158,7 +158,7 @@ The macros exported are:
* ``VERSION_SYMBOL_EXPERIMENTAL(b, e)``: Creates a symbol version table entry
binding versioned symbol ``b@EXPERIMENTAL`` to the internal function ``be``.
The macro is used when a symbol matures to become part of the stable ABI, to
- provide an alias to experimental for some time.
+ provide an alias to experimental until the next major ABI version.
.. _example_abi_macro_usage:
@@ -428,8 +428,9 @@ _____________________________
In situations in which an ``experimental`` symbol has been stable for some time,
and it becomes a candidate for promotion to the stable ABI. At this time, when
-promoting the symbol, maintainer may choose to provide an alias to the
-``experimental`` symbol version, so as not to break consuming applications.
+promoting the symbol, the maintainer may choose to provide an alias to the
+``experimental`` symbol version, so as not to break consuming applications. This
+alias will then typically be dropped in the next major ABI version.
The process to provide an alias to ``experimental`` is similar to that, of
:ref:`symbol versioning <example_abi_macro_usage>` described above.
--
2.7.4
^ permalink raw reply [relevance 12%]
* [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows
2020-07-07 14:45 8% [dpdk-dev] [PATCH v1 0/2] doc: minor abi policy fixes Ray Kinsella
@ 2020-07-07 14:45 24% ` Ray Kinsella
2020-07-07 15:23 7% ` Thomas Monjalon
2020-07-07 14:45 12% ` [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period Ray Kinsella
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
2 siblings, 1 reply; 200+ results
From: Ray Kinsella @ 2020-07-07 14:45 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
Minor changes to the abi policy for windows.
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
doc/guides/contributing/abi_policy.rst | 4 +++-
doc/guides/windows_gsg/intro.rst | 6 +++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index d0affa9..8e70b45 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -40,7 +40,9 @@ General Guidelines
maintaining ABI stability through one year of DPDK releases starting from
DPDK 19.11. This policy will be reviewed in 2020, with intention of
lengthening the stability period. Additional implementation detail can be
- found in the :ref:`release notes <20_02_abi_changes>`.
+ found in the :ref:`release notes <20_02_abi_changes>`. Please note that this
+ policy does not currently apply to the :doc:`Window build
+ <../windows_gsg/intro>`.
What is an ABI?
~~~~~~~~~~~~~~~
diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
index 58c6246..707afd3 100644
--- a/doc/guides/windows_gsg/intro.rst
+++ b/doc/guides/windows_gsg/intro.rst
@@ -19,6 +19,6 @@ compile. Support is being added in pieces so as to limit the overall scope
of any individual patch series. The goal is to be able to run any DPDK
application natively on Windows.
-The :doc:`../contributing/abi_policy` cannot be respected for Windows.
-Minor ABI versions may be incompatible
-because function versioning is not supported on Windows.
+The :doc:`../contributing/abi_policy` does not apply to the Windows build, as
+function versioning is not supported on Windows, therefore minor ABI versions
+may be incompatible.
--
2.7.4
^ permalink raw reply [relevance 24%]
* [dpdk-dev] [PATCH v1 0/2] doc: minor abi policy fixes
@ 2020-07-07 14:45 8% Ray Kinsella
2020-07-07 14:45 24% ` [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows Ray Kinsella
` (2 more replies)
0 siblings, 3 replies; 200+ results
From: Ray Kinsella @ 2020-07-07 14:45 UTC (permalink / raw)
To: dev
Cc: fady, thomas, Honnappa.Nagarahalli, Ray Kinsella, Neil Horman,
John McNamara, Marko Kovacevic, Harini Ramakrishnan,
Omar Cardona, Pallavi Kadam, Ranjit Menon
A few documentation fixes, clarifying the Windows ABI policy and aliases to
the experimental mode.
Ray Kinsella (2):
doc: reword abi policy for windows
doc: clarify alias to experimental period
doc/guides/contributing/abi_policy.rst | 4 +++-
doc/guides/contributing/abi_versioning.rst | 7 ++++---
doc/guides/windows_gsg/intro.rst | 6 +++---
3 files changed, 10 insertions(+), 7 deletions(-)
--
2.7.4
^ permalink raw reply [relevance 8%]
* [dpdk-dev] [PATCH v6 0/3] RCU integration with LPM library
@ 2020-07-07 14:40 3% ` Ruifeng Wang
2020-07-07 15:15 3% ` [dpdk-dev] [PATCH v7 " Ruifeng Wang
` (3 subsequent siblings)
5 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-07 14:40 UTC (permalink / raw)
Cc: dev, mdr, konstantin.ananyev, honnappa.nagarahalli, nd, Ruifeng Wang
This patchset integrates RCU QSBR support with the LPM library.
The resource reclamation implementation was split from the original
series and is already part of the RCU library. The series is reworked
to base the LPM integration on the RCU reclamation APIs.
A new API, rte_lpm_rcu_qsbr_add, is introduced for an application to
register an RCU variable that the LPM library will use. This gives
the user a handle to enable the RCU support integrated in the LPM
library.
Functional tests and performance tests are added to cover the
integration with RCU.
---
v6:
Remove ALLOW_EXPERIMENTAL_API from rte_lpm.c.
v5:
No default value for reclaim_thd. This allows reclamation triggering with every call.
Pass LPM pointer instead of tbl8 as argument of reclaim callback free function.
Updated group_idx check at tbl8 allocation.
Use enums instead of defines for different reclamation modes.
RCU QSBR integrated path is inside ALLOW_EXPERIMENTAL_API to avoid ABI change.
v4:
Allow user to configure defer queue: size, reclaim threshold, max entries.
Return the defer queue handle so the user can manually trigger reclamation.
Add blocking mode support. Defer queue will not be created.
Honnappa Nagarahalli (1):
test/lpm: add RCU integration performance tests
Ruifeng Wang (2):
lib/lpm: integrate RCU QSBR
test/lpm: add LPM RCU integration functional tests
app/test/test_lpm.c | 291 ++++++++++++++++-
app/test/test_lpm_perf.c | 492 ++++++++++++++++++++++++++++-
doc/guides/prog_guide/lpm_lib.rst | 32 ++
lib/librte_lpm/Makefile | 2 +-
lib/librte_lpm/meson.build | 1 +
lib/librte_lpm/rte_lpm.c | 120 ++++++-
lib/librte_lpm/rte_lpm.h | 59 ++++
lib/librte_lpm/rte_lpm_version.map | 6 +
8 files changed, 987 insertions(+), 16 deletions(-)
--
2.17.1
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v3 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-07 13:08 2% ` [dpdk-dev] [PATCH v3 " Viacheslav Ovsiienko
@ 2020-07-07 14:32 0% ` Olivier Matz
0 siblings, 0 replies; 200+ results
From: Olivier Matz @ 2020-07-07 14:32 UTC (permalink / raw)
To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, bernard.iremonger, thomas
On Tue, Jul 07, 2020 at 01:08:02PM +0000, Viacheslav Ovsiienko wrote:
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> the packets at the very precisely specified moment of time provides
> the opportunity to support the connections with Time Division
> Multiplexing using the contemporary general purpose NIC without involving
> an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> interface is one of the promising features for potentially usage of the
> precise time management for the egress packets.
>
> The main objective of this RFC is to specify the way how applications
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well.
>
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. If the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.
>
> There is no any packet reordering according timestamps is supposed,
> neither within packet burst, nor between packets, it is an entirely
> application responsibility to generate packets and its timestamps
> in desired order. The timestamps can be put only in the first packet
> in the burst providing the entire burst scheduling.
>
> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be introduced
> at the moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> current device clock value and provide the reference for the timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 32 ++++++++++++++++++++++++++++++++
> 3 files changed, 37 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */
> #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
> #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**< Device supports Rx queue setup after device started*/
> #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 96c3631..5b2f3da 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,36 @@ int rte_mbuf_dynflag_lookup(const char *name,
> #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>
> +/**
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic Tx timestamp flag tells whether the field contains
> + * actual timestamp value. For the packets being sent this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, the dedicated Rx timestamp flag is supposed to be
> + * introduced and the shared timestamp field will be used to handle the
> + * timestamps on receiving datapath as well. Having the dedicated flags
> + * for Rx/Tx timstamps allows applications not to perform explicit flags
> + * reset on forwarding and not to promote received timestamps to the
> + * transmitting datapath by default.
> + *
> + * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
> + * packet being sent it tries to synchronize the time of packet appearing
> + * on the wire with the specified packet timestamp. If the specified one
> + * is in the past it should be ignored, if one is in the distant future
> + * it should be capped with some reasonable value (in range of seconds).
> + *
> + * There is no any packet reordering according timestamps is supposed,
I think there is a typo here
> + * neither within packet burst, nor between packets, it is an entirely
> + * application responsibility to generate packets and its timestamps in
> + * desired order. The timestamps might be put only in the first packet in
> + * the burst providing the entire burst scheduling.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
> +#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
Is it possible to split the comment, to document both
RTE_MBUF_DYNFIELD_TIMESTAMP_NAME and RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME ? I
didn't try to generate the documentation, but I think, like this, that
RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME will be undocumented.
Apart from that, it looks good to me.
> +
> #endif
> --
> 1.8.3.1
>
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
2020-07-07 11:13 4% ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
@ 2020-07-07 14:29 0% ` Jerin Jacob
2020-07-07 15:56 0% ` Phil Yang
0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2020-07-07 14:29 UTC (permalink / raw)
To: Phil Yang
Cc: Thomas Monjalon, Erik Gabriel Carrillo, dpdk-dev, Jerin Jacob,
Honnappa Nagarahalli, David Christensen,
Ruifeng Wang (Arm Technology China),
Dharmik Thakkar, nd, David Marchand, Ray Kinsella, Neil Horman,
dodji
On Tue, Jul 7, 2020 at 4:45 PM Phil Yang <phil.yang@arm.com> wrote:
>
> The impl_opaque field is shared between the timer arm and cancel
> operations. Meanwhile, the state flag acts as a guard variable to
> make sure the update of impl_opaque is synchronized. The original
> code uses rte_smp barriers to achieve that. This patch uses C11
> atomics with an explicit one-way memory barrier instead of full
> barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
>
> Since compilers can generate the same instructions for volatile and
> non-volatile variable in C11 __atomics built-ins, so remain the volatile
> keyword in front of state enum to avoid the ABI break issue.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Could you fix the following:
WARNING:TYPO_SPELLING: 'opague' may be misspelled - perhaps 'opaque'?
#184: FILE: lib/librte_eventdev/rte_event_timer_adapter.c:1161:
+ * specific opague data under the correct state.
total: 0 errors, 1 warnings, 124 lines checked
> ---
> v3:
> Fix ABI issue: revert to 'volatile enum rte_event_timer_state type state'.
>
> v2:
> 1. Removed implementation-specific opaque data cleanup code.
> 2. Replaced thread fence with atomic ACQURE/RELEASE ordering on state access.
>
> lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
> 1 file changed, 37 insertions(+), 18 deletions(-)
>
> diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
> index d75415c..eb2c93a 100644
> --- a/lib/librte_eventdev/rte_event_timer_adapter.c
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.c
> @@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
> sw->expired_timers[sw->n_expired_timers++] = tim;
> sw->stats.evtim_exp_count++;
>
> - evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
> + __atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
> + __ATOMIC_RELEASE);
> }
>
> if (event_buffer_batch_ready(&sw->buffer)) {
> @@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> int n_lcores;
> /* Timer list for this lcore is not in use. */
> uint16_t exp_state = 0;
> + enum rte_event_timer_state n_state;
>
> #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> /* Check that the service is running. */
> @@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> }
>
> for (i = 0; i < nb_evtims; i++) {
> - /* Don't modify the event timer state in these cases */
> - if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
> + n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
> + if (n_state == RTE_EVENT_TIMER_ARMED) {
> rte_errno = EALREADY;
> break;
> - } else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
> - evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
> + } else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
> + n_state == RTE_EVENT_TIMER_CANCELED)) {
> rte_errno = EINVAL;
> break;
> }
>
> ret = check_timeout(evtims[i], adapter);
> if (unlikely(ret == -1)) {
> - evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
> + __atomic_store_n(&evtims[i]->state,
> + RTE_EVENT_TIMER_ERROR_TOOLATE,
> + __ATOMIC_RELAXED);
> rte_errno = EINVAL;
> break;
> } else if (unlikely(ret == -2)) {
> - evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
> + __atomic_store_n(&evtims[i]->state,
> + RTE_EVENT_TIMER_ERROR_TOOEARLY,
> + __ATOMIC_RELAXED);
> rte_errno = EINVAL;
> break;
> }
>
> if (unlikely(check_destination_event_queue(evtims[i],
> adapter) < 0)) {
> - evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> + __atomic_store_n(&evtims[i]->state,
> + RTE_EVENT_TIMER_ERROR,
> + __ATOMIC_RELAXED);
> rte_errno = EINVAL;
> break;
> }
> @@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
> SINGLE, lcore_id, NULL, evtims[i]);
> if (ret < 0) {
> /* tim was in RUNNING or CONFIG state */
> - evtims[i]->state = RTE_EVENT_TIMER_ERROR;
> + __atomic_store_n(&evtims[i]->state,
> + RTE_EVENT_TIMER_ERROR,
> + __ATOMIC_RELEASE);
> break;
> }
>
> - rte_smp_wmb();
> EVTIM_LOG_DBG("armed an event timer");
> - evtims[i]->state = RTE_EVENT_TIMER_ARMED;
> + /* RELEASE ordering guarantees the adapter specific value
> + * changes observed before the update of state.
> + */
> + __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
> + __ATOMIC_RELEASE);
> }
>
> if (i < nb_evtims)
> @@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
> struct rte_timer *timp;
> uint64_t opaque;
> struct swtim *sw = swtim_pmd_priv(adapter);
> + enum rte_event_timer_state n_state;
>
> #ifdef RTE_LIBRTE_EVENTDEV_DEBUG
> /* Check that the service is running. */
> @@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
>
> for (i = 0; i < nb_evtims; i++) {
> /* Don't modify the event timer state in these cases */
> - if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
> + /* ACQUIRE ordering guarantees the access of implementation
> + * specific opague data under the correct state.
> + */
> + n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
> + if (n_state == RTE_EVENT_TIMER_CANCELED) {
> rte_errno = EALREADY;
> break;
> - } else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
> + } else if (n_state != RTE_EVENT_TIMER_ARMED) {
> rte_errno = EINVAL;
> break;
> }
>
> - rte_smp_rmb();
> -
> opaque = evtims[i]->impl_opaque[0];
> timp = (struct rte_timer *)(uintptr_t)opaque;
> RTE_ASSERT(timp != NULL);
> @@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
>
> rte_mempool_put(sw->tim_pool, (void **)timp);
>
> - evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
> -
> - rte_smp_wmb();
> + /* The RELEASE ordering here pairs with atomic ordering
> + * to make sure the state update data observed between
> + * threads.
> + */
> + __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
> + __ATOMIC_RELEASE);
> }
>
> return i;
> --
> 2.7.4
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v3 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-07 12:59 2% ` [dpdk-dev] [PATCH v2 " Viacheslav Ovsiienko
@ 2020-07-07 13:08 2% ` Viacheslav Ovsiienko
2020-07-07 14:32 0% ` Olivier Matz
2020-07-07 14:57 2% ` [dpdk-dev] [PATCH v4 " Viacheslav Ovsiienko
` (3 subsequent siblings)
6 siblings, 1 reply; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-07 13:08 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas
Some networks require precise traffic timing management. The ability
to send (and, generally speaking, receive) packets at a precisely
specified moment of time makes it possible to support connections with
Time Division Multiplexing using a contemporary general-purpose NIC
without involving auxiliary hardware. For example, supporting the
O-RAN Fronthaul interface is one of the promising applications of
precise time management for egress packets.
The main objective of this patch is to specify how applications can
provide the moment of time at which packet transmission must start,
and to give a preliminary description of how this feature is supported
on the mlx5 PMD side.
A new dynamic timestamp field is proposed. It provides some timing
information; the units and time reference (initial phase) are not
explicitly defined, but are always kept the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns the
current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For packets being sent,
this value can be used by the PMD to schedule packet sending.
Once the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
deprecated and obsoleted, this dynamic flag and field will be used to
manage timestamps on the receiving datapath as well.
When the PMD sees "rte_dynfield_timestamp" set on a packet being sent,
it tries to synchronize the time the packet appears on the wire with
the specified packet timestamp. If the specified time is in the past
it should be ignored; if it is in the distant future it should be
capped to some reasonable value (in the range of seconds). These
specific cases ("too late" and "distant future") can optionally be
reported via device xstats to help applications detect time-related
problems.
No packet reordering according to the timestamps is assumed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first
packet of a burst, providing scheduling for the entire burst.
The PMD reports the ability to synchronize packet sending on a
timestamp with the new offload flag DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
added in this patch.
This is a palliative and is going to be replaced with a new eth_dev
API for reporting/managing the supported dynamic flags and their
related features. That API would break ABI compatibility and can't be
introduced at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
The new testpmd command is proposed to configure sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, bursts of packets will be transmitted with the specified
delay between the packets within a burst and the specified delay
between the bursts. rte_eth_read_clock() is supposed to be engaged to
get the current device clock value and provide the reference for the
timestamps.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 32 ++++++++++++++++++++++++++++++++
3 files changed, 37 insertions(+)
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6f..02157d5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242b..6f6454c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..5b2f3da 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,36 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information, the
+ * units and time references (initial phase) are not explicitly defined
+ * but are maintained always the same for a given port. Some devices allow
+ * to query rte_eth_read_clock() that will return the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field contains
+ * actual timestamp value. For the packets being sent this value can be
+ * used by PMD to schedule packet sending.
+ *
+ * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
+ * and obsoleting, the dedicated Rx timestamp flag is supposed to be
+ * introduced and the shared timestamp field will be used to handle the
+ * timestamps on receiving datapath as well. Having the dedicated flags
+ * for Rx/Tx timstamps allows applications not to perform explicit flags
+ * reset on forwarding and not to promote received timestamps to the
+ * transmitting datapath by default.
+ *
+ * When PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on the
+ * packet being sent it tries to synchronize the time of packet appearing
+ * on the wire with the specified packet timestamp. If the specified one
+ * is in the past it should be ignored, if one is in the distant future
+ * it should be capped with some reasonable value (in range of seconds).
+ *
+ * There is no any packet reordering according timestamps is supposed,
+ * neither within packet burst, nor between packets, it is an entirely
+ * application responsibility to generate packets and its timestamps in
+ * desired order. The timestamps might be put only in the first packet in
+ * the burst providing the entire burst scheduling.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
* [dpdk-dev] [PATCH v2 1/2] mbuf: introduce accurate packet Tx scheduling
@ 2020-07-07 12:59 2% ` Viacheslav Ovsiienko
2020-07-07 13:08 2% ` [dpdk-dev] [PATCH v3 " Viacheslav Ovsiienko
` (4 subsequent siblings)
6 siblings, 0 replies; 200+ results
From: Viacheslav Ovsiienko @ 2020-07-07 12:59 UTC (permalink / raw)
To: dev; +Cc: matan, rasland, olivier.matz, bernard.iremonger, thomas
Some networks require precise traffic timing management. The ability
to send (and, generally speaking, receive) packets at a precisely
specified moment of time makes it possible to support connections with
Time Division Multiplexing using a contemporary general-purpose NIC
without involving auxiliary hardware. For example, supporting the
O-RAN Fronthaul interface is one of the promising applications of
precise time management for egress packets.
The main objective of this patch is to specify how applications can
provide the moment of time at which packet transmission must start,
and to give a preliminary description of how this feature is supported
on the mlx5 PMD side.
A new dynamic timestamp field is proposed. It provides some timing
information; the units and time reference (initial phase) are not
explicitly defined, but are always kept the same for a given port.
Some devices allow querying rte_eth_read_clock(), which returns the
current device timestamp. The dynamic timestamp flag tells whether
the field contains an actual timestamp value. For packets being sent,
this value can be used by the PMD to schedule packet sending.
Once the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
deprecated and obsoleted, this dynamic flag and field will be used to
manage timestamps on the receiving datapath as well.
When the PMD sees "rte_dynfield_timestamp" set on a packet being sent,
it tries to synchronize the time the packet appears on the wire with
the specified packet timestamp. If the specified time is in the past
it should be ignored; if it is in the distant future it should be
capped to some reasonable value (in the range of seconds). These
specific cases ("too late" and "distant future") can optionally be
reported via device xstats to help applications detect time-related
problems.
No packet reordering according to the timestamps is assumed, neither
within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first
packet of a burst, providing scheduling for the entire burst.
The PMD reports the ability to synchronize packet sending on a
timestamp with the new offload flag DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP
added in this patch.
This is a palliative and is going to be replaced with a new eth_dev
API for reporting/managing the supported dynamic flags and their
related features. That API would break ABI compatibility and can't be
introduced at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates packets and sets the dynamic timestamps according to the
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
The new testpmd command is proposed to configure sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
specified in the device clock units. The number
of packets in the burst is defined by txburst parameter
<burst_gap> - the delay between the bursts in the device clock units
As a result, bursts of packets will be transmitted with the specified
delay between the packets within a burst and the specified delay
between the bursts. rte_eth_read_clock() is supposed to be engaged to
get the current device clock value and provide the reference for the
timestamps.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
---
lib/librte_ethdev/rte_ethdev.c | 1 +
lib/librte_ethdev/rte_ethdev.h | 4 ++++
lib/librte_mbuf/rte_mbuf_dyn.h | 32 ++++++++++++++++++++++++++++++++
3 files changed, 37 insertions(+)
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 8e10a6f..02157d5 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
+ RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
};
#undef RTE_TX_OFFLOAD_BIT2STR
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index a49242b..6f6454c 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1178,6 +1178,10 @@ struct rte_eth_conf {
/** Device supports outer UDP checksum */
#define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
+/** Device supports send on timestamp */
+#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
+
+
#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
/**< Device supports Rx queue setup after device started*/
#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
index 96c3631..834acdd 100644
--- a/lib/librte_mbuf/rte_mbuf_dyn.h
+++ b/lib/librte_mbuf/rte_mbuf_dyn.h
@@ -250,4 +250,36 @@ int rte_mbuf_dynflag_lookup(const char *name,
#define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
#define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
+/**
+ * The timestamp dynamic field provides some timing information, the
+ * units and time references (initial phase) are not explicitly defined
+ * but are maintained always the same for a given port. Some devices allow
+ * to query rte_eth_read_clock() that will return the current device
+ * timestamp. The dynamic Tx timestamp flag tells whether the field contains
+ * actual timestamp value. For the packets being sent this value can be
+ * used by PMD to schedule packet sending.
+ *
+ * After the PKT_RX_TIMESTAMP flag and the fixed timestamp field are
+ * deprecated and obsoleted, a dedicated Rx timestamp flag is expected to
+ * be introduced and the shared timestamp field will be used to handle the
+ * timestamps on the receive datapath as well. Having dedicated flags
+ * for Rx/Tx timestamps allows applications not to perform explicit flag
+ * resets on forwarding and not to promote received timestamps to the
+ * transmit datapath by default.
+ *
+ * When the PMD sees the RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME flag set on a
+ * packet being sent, it tries to synchronize the time the packet appears
+ * on the wire with the specified packet timestamp. If the specified one
+ * is in the past it should be ignored; if it is in the distant future
+ * it should be capped with some reasonable value (in the range of seconds).
+ *
+ * No packet reordering according to timestamps is assumed, neither
+ * within a packet burst nor between packets; it is entirely the
+ * application's responsibility to generate packets and their timestamps
+ * in the desired order. The timestamp might be put only in the first
+ * packet of the burst, providing scheduling for the entire burst.
+ */
+#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
+#define RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME "rte_dynflag_tx_timestamp"
+
#endif
--
1.8.3.1
^ permalink raw reply [relevance 2%]
* Re: [dpdk-dev] [PATCH 1/2] mbuf: introduce accurate packet Tx scheduling
2020-07-07 11:50 0% ` Olivier Matz
@ 2020-07-07 12:46 0% ` Slava Ovsiienko
0 siblings, 0 replies; 200+ results
From: Slava Ovsiienko @ 2020-07-07 12:46 UTC (permalink / raw)
To: Olivier Matz
Cc: dev, Matan Azrad, Raslan Darawsheh, bernard.iremonger, thomas
Hi, Olivier
Thanks a lot for the review.
> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, July 7, 2020 14:51
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan
> Darawsheh <rasland@mellanox.com>; bernard.iremonger@intel.com;
> thomas@mellanox.net
> Subject: Re: [PATCH 1/2] mbuf: introduce accurate packet Tx scheduling
>
> Hi Slava,
>
> Few question/comments below.
>
> On Wed, Jul 01, 2020 at 03:36:26PM +0000, Viacheslav Ovsiienko wrote:
> > There is the requirement on some networks for precise traffic timing
> > management. The ability to send (and, generally speaking, receive) the
> > packets at the very precisely specified moment of time provides the
> > opportunity to support the connections with Time Division Multiplexing
> > using the contemporary general purpose NIC without involving an
> > auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> > interface is one of the promising features for potentially usage of
> > the precise time management for the egress packets.
> >
> > The main objective of this RFC is to specify the way how applications
> > can provide the moment of time at what the packet transmission must be
> > started and to describe in preliminary the supporting this feature
> > from
> > mlx5 PMD side.
> >
> > The new dynamic timestamp field is proposed, it provides some timing
> > information, the units and time references (initial phase) are not
> > explicitly defined but are maintained always the same for a given port.
> > Some devices allow to query rte_eth_read_clock() that will return the
> > current device timestamp. The dynamic timestamp flag tells whether the
> > field contains actual timestamp value. For the packets being sent this
> > value can be used by PMD to schedule packet sending.
> >
> > After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation and
> > obsoleting, these dynamic flag and field will be used to manage the
> > timestamps on receiving datapath as well.
>
> Do you mean the same flag will be used for both Rx and Tx? I wonder if it's a
> good idea: if you enable the timestamp on Rx, the packets will be flagged
> and it will impact Tx, except if the application explicitly resets the flag in all
> mbufs. Wouldn't it be safer to have an Rx flag and a Tx flag?
It is a bit difficult to answer unambiguously; I thought about it and did not make a strong decision.
We have flag sharing for the Rx/Tx metadata and just followed the same approach.
OK, I will promote these to:
RTE_MBUF_DYNFLAG_TX_TIMESTAMP_NAME
RTE_MBUF_DYNFIELD_TX_TIMESTAMP_NAME
And, possibly, we should reconsider the metadata dynamic flags.
>
> > When PMD sees the "rte_dynfield_timestamp" set on the packet being
> > sent it tries to synchronize the time of packet appearing on the wire
> > with the specified packet timestamp. If the specified one is in the
> > past it should be ignored, if one is in the distant future it should
> > be capped with some reasonable value (in range of seconds). These
> > specific cases ("too late" and "distant future") can be optionally
> > reported via device xstats to assist applications to detect the
> > time-related problems.
>
> I think what to do with packets to be sent in the "past" could be configurable
> through an ethdev API in the future (drop or send).
Yes, currently there is no complete understanding of how to handle packets outside of the time slot.
In 20.11 we are going to introduce the time-based rte flow API to manage out-of-window packets.
>
> > There is no any packet reordering according timestamps is supposed,
> > neither within packet burst, nor between packets, it is an entirely
> > application responsibility to generate packets and its timestamps in
> > desired order. The timestamps can be put only in the first packet in
> > the burst providing the entire burst scheduling.
"can" should be replaced with "might". The current mlx5 implementation
checks each packet in the burst for the presence of a timestamp.
>
> This constraint makes sense. At first glance, it looks it is imposed by a PMD or
> hw limitation, but thinking more about it, I think it is the correct behavior to
> have. Packets are ordered within a PMD queue, and the ability to set the
> timestamp for one packet to delay the subsequent ones looks useful.
>
> Should this behavior be documented somewhere? Maybe in the API
> comment documenting the dynamic flag?
It is documented in mlx5.rst (coming soon); do you think it should be
in a more common place? OK, I will update.
>
> > PMD reports the ability to synchronize packet sending on timestamp
> > with new offload flag:
> >
> > This is palliative and is going to be replaced with new eth_dev API
> > about reporting/managing the supported dynamic flags and its related
> > features. This API would break ABI compatibility and can't be
> > introduced at the moment, so is postponed to 20.11.
> >
> > For testing purposes it is proposed to update testpmd "txonly"
> > forwarding mode routine. With this update testpmd application
> > generates the packets and sets the dynamic timestamps according to
> > specified time pattern if it sees the "rte_dynfield_timestamp" is registered.
> >
> > The new testpmd command is proposed to configure sending pattern:
> >
> > set tx_times <burst_gap>,<intra_gap>
> >
> > <intra_gap> - the delay between the packets within the burst
> > specified in the device clock units. The number
> > of packets in the burst is defined by txburst parameter
> >
> > <burst_gap> - the delay between the bursts in the device clock units
> >
> > As the result the bursts of packet will be transmitted with specific
> > delays between the packets within the burst and specific delay between
> > the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> > current device clock value and provide the reference for the timestamps.
> >
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> > lib/librte_ethdev/rte_ethdev.c | 1 + lib/librte_ethdev/rte_ethdev.h
> > | 4 ++++ lib/librte_mbuf/rte_mbuf_dyn.h | 16 ++++++++++++++++
> > 3 files changed, 21 insertions(+)
> >
> > diff --git a/lib/librte_ethdev/rte_ethdev.c
> > b/lib/librte_ethdev/rte_ethdev.c index 8e10a6f..02157d5 100644
> > --- a/lib/librte_ethdev/rte_ethdev.c
> > +++ b/lib/librte_ethdev/rte_ethdev.c
> > @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> > RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> > RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> > + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> > };
> >
> > #undef RTE_TX_OFFLOAD_BIT2STR
> > diff --git a/lib/librte_ethdev/rte_ethdev.h
> > b/lib/librte_ethdev/rte_ethdev.h index a49242b..6f6454c 100644
> > --- a/lib/librte_ethdev/rte_ethdev.h
> > +++ b/lib/librte_ethdev/rte_ethdev.h
> > @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> > /** Device supports outer UDP checksum */ #define
> > DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
> >
> > +/** Device supports send on timestamp */ #define
> > +DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> > +
> > +
> > #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**<
> > Device supports Rx queue setup after device started*/ #define
> > RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002 diff --git
> > a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> > index 96c3631..fb5477c 100644
> > --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> > +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> > @@ -250,4 +250,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
> > #define RTE_MBUF_DYNFIELD_METADATA_NAME
> "rte_flow_dynfield_metadata"
> > #define RTE_MBUF_DYNFLAG_METADATA_NAME
> "rte_flow_dynflag_metadata"
> >
> > +/*
> > + * The timestamp dynamic field provides some timing information, the
> > + * units and time references (initial phase) are not explicitly
> > +defined
> > + * but are maintained always the same for a given port. Some devices
> > +allow
> > + * to query rte_eth_read_clock() that will return the current device
> > + * timestamp. The dynamic timestamp flag tells whether the field
> > +contains
> > + * actual timestamp value. For the packets being sent this value can
> > +be
> > + * used by PMD to schedule packet sending.
> > + *
> > + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> > + * and obsoleting, these dynamic flag and field will be used to
> > +manage
> > + * the timestamps on receiving datapath as well.
> > + */
> > +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME
> "rte_dynfield_timestamp"
> > +#define RTE_MBUF_DYNFLAG_TIMESTAMP_NAME
> "rte_dynflag_timestamp"
> > +
>
> I realize that's not the case for rte_flow_dynfield_metadata, but I think it
> would be good to have a doxygen-like comment.
OK, will extend comment with expected PMD behavior and replace "/*" with "/**"
>
>
>
> Regards,
> Olivier
With best regards, Slava
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 1/2] mbuf: introduce accurate packet Tx scheduling
@ 2020-07-07 11:50 0% ` Olivier Matz
2020-07-07 12:46 0% ` Slava Ovsiienko
0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2020-07-07 11:50 UTC (permalink / raw)
To: Viacheslav Ovsiienko; +Cc: dev, matan, rasland, bernard.iremonger, thomas
Hi Slava,
Few question/comments below.
On Wed, Jul 01, 2020 at 03:36:26PM +0000, Viacheslav Ovsiienko wrote:
> There is the requirement on some networks for precise traffic timing
> management. The ability to send (and, generally speaking, receive)
> the packets at the very precisely specified moment of time provides
> the opportunity to support the connections with Time Division
> Multiplexing using the contemporary general purpose NIC without involving
> an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
> interface is one of the promising features for potentially usage of the
> precise time management for the egress packets.
>
> The main objective of this RFC is to specify the way how applications
> can provide the moment of time at what the packet transmission must be
> started and to describe in preliminary the supporting this feature from
> mlx5 PMD side.
>
> The new dynamic timestamp field is proposed, it provides some timing
> information, the units and time references (initial phase) are not
> explicitly defined but are maintained always the same for a given port.
> Some devices allow to query rte_eth_read_clock() that will return
> the current device timestamp. The dynamic timestamp flag tells whether
> the field contains actual timestamp value. For the packets being sent
> this value can be used by PMD to schedule packet sending.
>
> After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> and obsoleting, these dynamic flag and field will be used to manage
> the timestamps on receiving datapath as well.
Do you mean the same flag will be used for both Rx and Tx? I wonder if
it's a good idea: if you enable the timestamp on Rx, the packets will be
flagged and it will impact Tx, except if the application explicitly
resets the flag in all mbufs. Wouldn't it be safer to have an Rx flag
and a Tx flag?
> When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
> it tries to synchronize the time of packet appearing on the wire with
> the specified packet timestamp. If the specified one is in the past it
> should be ignored, if one is in the distant future it should be capped
> with some reasonable value (in range of seconds). These specific cases
> ("too late" and "distant future") can be optionally reported via
> device xstats to assist applications to detect the time-related
> problems.
I think what to do with packets to be sent in the "past" could be
configurable through an ethdev API in the future (drop or send).
> There is no any packet reordering according timestamps is supposed,
> neither within packet burst, nor between packets, it is an entirely
> application responsibility to generate packets and its timestamps
> in desired order. The timestamps can be put only in the first packet
> in the burst providing the entire burst scheduling.
This constraint makes sense. At first glance, it looks it is imposed by
a PMD or hw limitation, but thinking more about it, I think it is the
correct behavior to have. Packets are ordered within a PMD queue, and
the ability to set the timestamp for one packet to delay the subsequent
ones looks useful.
Should this behavior be documented somewhere? Maybe in the API comment
documenting the dynamic flag?
> PMD reports the ability to synchronize packet sending on timestamp
> with new offload flag:
>
> This is palliative and is going to be replaced with new eth_dev API
> about reporting/managing the supported dynamic flags and its related
> features. This API would break ABI compatibility and can't be introduced
> at the moment, so is postponed to 20.11.
>
> For testing purposes it is proposed to update testpmd "txonly"
> forwarding mode routine. With this update testpmd application generates
> the packets and sets the dynamic timestamps according to specified time
> pattern if it sees the "rte_dynfield_timestamp" is registered.
>
> The new testpmd command is proposed to configure sending pattern:
>
> set tx_times <burst_gap>,<intra_gap>
>
> <intra_gap> - the delay between the packets within the burst
> specified in the device clock units. The number
> of packets in the burst is defined by txburst parameter
>
> <burst_gap> - the delay between the bursts in the device clock units
>
> As the result the bursts of packet will be transmitted with specific
> delays between the packets within the burst and specific delay between
> the bursts. The rte_eth_get_clock is supposed to be engaged to get the
> current device clock value and provide the reference for the timestamps.
>
> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> ---
> lib/librte_ethdev/rte_ethdev.c | 1 +
> lib/librte_ethdev/rte_ethdev.h | 4 ++++
> lib/librte_mbuf/rte_mbuf_dyn.h | 16 ++++++++++++++++
> 3 files changed, 21 insertions(+)
>
> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
> index 8e10a6f..02157d5 100644
> --- a/lib/librte_ethdev/rte_ethdev.c
> +++ b/lib/librte_ethdev/rte_ethdev.c
> @@ -162,6 +162,7 @@ struct rte_eth_xstats_name_off {
> RTE_TX_OFFLOAD_BIT2STR(UDP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(IP_TNL_TSO),
> RTE_TX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
> + RTE_TX_OFFLOAD_BIT2STR(SEND_ON_TIMESTAMP),
> };
>
> #undef RTE_TX_OFFLOAD_BIT2STR
> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
> index a49242b..6f6454c 100644
> --- a/lib/librte_ethdev/rte_ethdev.h
> +++ b/lib/librte_ethdev/rte_ethdev.h
> @@ -1178,6 +1178,10 @@ struct rte_eth_conf {
> /** Device supports outer UDP checksum */
> #define DEV_TX_OFFLOAD_OUTER_UDP_CKSUM 0x00100000
>
> +/** Device supports send on timestamp */
> +#define DEV_TX_OFFLOAD_SEND_ON_TIMESTAMP 0x00200000
> +
> +
> #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
> /**< Device supports Rx queue setup after device started*/
> #define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x00000002
> diff --git a/lib/librte_mbuf/rte_mbuf_dyn.h b/lib/librte_mbuf/rte_mbuf_dyn.h
> index 96c3631..fb5477c 100644
> --- a/lib/librte_mbuf/rte_mbuf_dyn.h
> +++ b/lib/librte_mbuf/rte_mbuf_dyn.h
> @@ -250,4 +250,20 @@ int rte_mbuf_dynflag_lookup(const char *name,
> #define RTE_MBUF_DYNFIELD_METADATA_NAME "rte_flow_dynfield_metadata"
> #define RTE_MBUF_DYNFLAG_METADATA_NAME "rte_flow_dynflag_metadata"
>
> +/*
> + * The timestamp dynamic field provides some timing information, the
> + * units and time references (initial phase) are not explicitly defined
> + * but are maintained always the same for a given port. Some devices allow
> + * to query rte_eth_read_clock() that will return the current device
> + * timestamp. The dynamic timestamp flag tells whether the field contains
> + * actual timestamp value. For the packets being sent this value can be
> + * used by PMD to schedule packet sending.
> + *
> + * After PKT_RX_TIMESTAMP flag and fixed timestamp field deprecation
> + * and obsoleting, these dynamic flag and field will be used to manage
> + * the timestamps on receiving datapath as well.
> + */
> +#define RTE_MBUF_DYNFIELD_TIMESTAMP_NAME "rte_dynfield_timestamp"
> +#define RTE_MBUF_DYNFLAG_TIMESTAMP_NAME "rte_dynflag_timestamp"
> +
I realize that's not the case for rte_flow_dynfield_metadata, but
I think it would be good to have a doxygen-like comment.
Regards,
Olivier
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics
@ 2020-07-07 11:13 4% ` Phil Yang
2020-07-07 14:29 0% ` Jerin Jacob
1 sibling, 1 reply; 200+ results
From: Phil Yang @ 2020-07-07 11:13 UTC (permalink / raw)
To: thomas, erik.g.carrillo, dev
Cc: jerinj, Honnappa.Nagarahalli, drc, Ruifeng.Wang, Dharmik.Thakkar,
nd, david.marchand, mdr, nhorman, dodji
The impl_opaque field is shared between the timer arm and cancel
operations. Meanwhile, the state flag acts as a guard variable to
make sure the update of impl_opaque is synchronized. The original
code uses rte_smp barriers to achieve that. This patch uses C11
atomics with an explicit one-way memory barrier instead of full
barriers rte_smp_w/rmb() to avoid the unnecessary barrier on aarch64.
Since compilers can generate the same instructions for volatile and
non-volatile variables with C11 __atomic built-ins, the volatile
keyword in front of the state enum is retained to avoid an ABI break.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
---
v3:
Fix ABI issue: revert to 'volatile enum rte_event_timer_state type state'.
v2:
1. Removed implementation-specific opaque data cleanup code.
2. Replaced thread fence with atomic ACQURE/RELEASE ordering on state access.
lib/librte_eventdev/rte_event_timer_adapter.c | 55 ++++++++++++++++++---------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.c b/lib/librte_eventdev/rte_event_timer_adapter.c
index d75415c..eb2c93a 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.c
+++ b/lib/librte_eventdev/rte_event_timer_adapter.c
@@ -629,7 +629,8 @@ swtim_callback(struct rte_timer *tim)
sw->expired_timers[sw->n_expired_timers++] = tim;
sw->stats.evtim_exp_count++;
- evtim->state = RTE_EVENT_TIMER_NOT_ARMED;
+ __atomic_store_n(&evtim->state, RTE_EVENT_TIMER_NOT_ARMED,
+ __ATOMIC_RELEASE);
}
if (event_buffer_batch_ready(&sw->buffer)) {
@@ -1020,6 +1021,7 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
int n_lcores;
/* Timer list for this lcore is not in use. */
uint16_t exp_state = 0;
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1060,30 +1062,36 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
}
for (i = 0; i < nb_evtims; i++) {
- /* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_ARMED) {
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_ARMED) {
rte_errno = EALREADY;
break;
- } else if (!(evtims[i]->state == RTE_EVENT_TIMER_NOT_ARMED ||
- evtims[i]->state == RTE_EVENT_TIMER_CANCELED)) {
+ } else if (!(n_state == RTE_EVENT_TIMER_NOT_ARMED ||
+ n_state == RTE_EVENT_TIMER_CANCELED)) {
rte_errno = EINVAL;
break;
}
ret = check_timeout(evtims[i], adapter);
if (unlikely(ret == -1)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOLATE;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOLATE,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
} else if (unlikely(ret == -2)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR_TOOEARLY;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR_TOOEARLY,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
if (unlikely(check_destination_event_queue(evtims[i],
adapter) < 0)) {
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELAXED);
rte_errno = EINVAL;
break;
}
@@ -1099,13 +1107,18 @@ __swtim_arm_burst(const struct rte_event_timer_adapter *adapter,
SINGLE, lcore_id, NULL, evtims[i]);
if (ret < 0) {
/* tim was in RUNNING or CONFIG state */
- evtims[i]->state = RTE_EVENT_TIMER_ERROR;
+ __atomic_store_n(&evtims[i]->state,
+ RTE_EVENT_TIMER_ERROR,
+ __ATOMIC_RELEASE);
break;
}
- rte_smp_wmb();
EVTIM_LOG_DBG("armed an event timer");
- evtims[i]->state = RTE_EVENT_TIMER_ARMED;
+ /* RELEASE ordering guarantees the adapter specific value
+ * changes observed before the update of state.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_ARMED,
+ __ATOMIC_RELEASE);
}
if (i < nb_evtims)
@@ -1132,6 +1145,7 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
struct rte_timer *timp;
uint64_t opaque;
struct swtim *sw = swtim_pmd_priv(adapter);
+ enum rte_event_timer_state n_state;
#ifdef RTE_LIBRTE_EVENTDEV_DEBUG
/* Check that the service is running. */
@@ -1143,16 +1157,18 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
for (i = 0; i < nb_evtims; i++) {
/* Don't modify the event timer state in these cases */
- if (evtims[i]->state == RTE_EVENT_TIMER_CANCELED) {
+ /* ACQUIRE ordering guarantees that the implementation-specific
+ * opaque data is accessed under the correct state.
+ */
+ n_state = __atomic_load_n(&evtims[i]->state, __ATOMIC_ACQUIRE);
+ if (n_state == RTE_EVENT_TIMER_CANCELED) {
rte_errno = EALREADY;
break;
- } else if (evtims[i]->state != RTE_EVENT_TIMER_ARMED) {
+ } else if (n_state != RTE_EVENT_TIMER_ARMED) {
rte_errno = EINVAL;
break;
}
- rte_smp_rmb();
-
opaque = evtims[i]->impl_opaque[0];
timp = (struct rte_timer *)(uintptr_t)opaque;
RTE_ASSERT(timp != NULL);
@@ -1166,9 +1182,12 @@ swtim_cancel_burst(const struct rte_event_timer_adapter *adapter,
rte_mempool_put(sw->tim_pool, (void **)timp);
- evtims[i]->state = RTE_EVENT_TIMER_CANCELED;
-
- rte_smp_wmb();
+ /* The RELEASE ordering here pairs with the ACQUIRE
+ * ordering above to make sure the state update is
+ * observed between threads.
+ */
+ __atomic_store_n(&evtims[i]->state, RTE_EVENT_TIMER_CANCELED,
+ __ATOMIC_RELEASE);
}
return i;
--
2.7.4
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH v2] mbuf: use C11 atomics for refcnt operations
2020-07-03 15:38 3% ` David Marchand
@ 2020-07-07 10:10 3% ` Phil Yang
2020-07-08 5:11 3% ` Phil Yang
` (2 more replies)
1 sibling, 3 replies; 200+ results
From: Phil Yang @ 2020-07-07 10:10 UTC (permalink / raw)
To: david.marchand, dev
Cc: drc, Honnappa.Nagarahalli, olivier.matz, ruifeng.wang, nd
Use C11 atomics with explicit ordering instead of rte_atomic ops which
enforce unnecessary barriers on aarch64.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
v2:
Fix ABI issue: revert the rte_mbuf_ext_shared_info struct refcnt field
to refcnt_atomic.
lib/librte_mbuf/rte_mbuf.c | 1 -
lib/librte_mbuf/rte_mbuf.h | 19 ++++++++++---------
lib/librte_mbuf/rte_mbuf_core.h | 11 +++--------
3 files changed, 13 insertions(+), 18 deletions(-)
diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
index ae91ae2..8a456e5 100644
--- a/lib/librte_mbuf/rte_mbuf.c
+++ b/lib/librte_mbuf/rte_mbuf.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index f8e492e..4a7a98c 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -37,7 +37,6 @@
#include <rte_config.h>
#include <rte_mempool.h>
#include <rte_memory.h>
-#include <rte_atomic.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
#include <rte_byteorder.h>
@@ -365,7 +364,7 @@ rte_pktmbuf_priv_flags(struct rte_mempool *mp)
static inline uint16_t
rte_mbuf_refcnt_read(const struct rte_mbuf *m)
{
- return (uint16_t)(rte_atomic16_read(&m->refcnt_atomic));
+ return __atomic_load_n(&m->refcnt, __ATOMIC_RELAXED);
}
/**
@@ -378,14 +377,15 @@ rte_mbuf_refcnt_read(const struct rte_mbuf *m)
static inline void
rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
{
- rte_atomic16_set(&m->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&m->refcnt, new_value, __ATOMIC_RELAXED);
}
/* internal */
static inline uint16_t
__rte_mbuf_refcnt_update(struct rte_mbuf *m, int16_t value)
{
- return (uint16_t)(rte_atomic16_add_return(&m->refcnt_atomic, value));
+ return (uint16_t)(__atomic_add_fetch((int16_t *)&m->refcnt, value,
+ __ATOMIC_ACQ_REL));
}
/**
@@ -466,7 +466,7 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
static inline uint16_t
rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
{
- return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
+ return __atomic_load_n(&shinfo->refcnt_atomic, __ATOMIC_RELAXED);
}
/**
@@ -481,7 +481,7 @@ static inline void
rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
uint16_t new_value)
{
- rte_atomic16_set(&shinfo->refcnt_atomic, (int16_t)new_value);
+ __atomic_store_n(&shinfo->refcnt_atomic, new_value, __ATOMIC_RELAXED);
}
/**
@@ -505,7 +505,8 @@ rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
return (uint16_t)value;
}
- return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
+ return (uint16_t)(__atomic_add_fetch((int16_t *)&shinfo->refcnt_atomic,
+ value, __ATOMIC_ACQ_REL));
}
/** Mbuf prefetch */
@@ -1304,8 +1305,8 @@ static inline int __rte_pktmbuf_pinned_extbuf_decref(struct rte_mbuf *m)
* Direct usage of add primitive to avoid
* duplication of comparing with one.
*/
- if (likely(rte_atomic16_add_return
- (&shinfo->refcnt_atomic, -1)))
+ if (likely(__atomic_add_fetch((int *)&shinfo->refcnt_atomic, -1,
+ __ATOMIC_ACQ_REL)))
return 1;
/* Reinitialize counter before mbuf freeing. */
diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
index 16600f1..806313a 100644
--- a/lib/librte_mbuf/rte_mbuf_core.h
+++ b/lib/librte_mbuf/rte_mbuf_core.h
@@ -18,7 +18,6 @@
#include <stdint.h>
#include <rte_compat.h>
-#include <generic/rte_atomic.h>
#ifdef __cplusplus
extern "C" {
@@ -495,12 +494,8 @@ struct rte_mbuf {
* or non-atomic) is controlled by the CONFIG_RTE_MBUF_REFCNT_ATOMIC
* config option.
*/
- RTE_STD_C11
- union {
- rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
- /** Non-atomically accessed refcnt */
- uint16_t refcnt;
- };
+ uint16_t refcnt;
+
uint16_t nb_segs; /**< Number of segments. */
/** Input port (16 bits to support more than 256 virtual ports).
@@ -679,7 +674,7 @@ typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
struct rte_mbuf_ext_shared_info {
rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
void *fcb_opaque; /**< Free callback argument */
- rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+ uint16_t refcnt_atomic; /**< Atomically accessed refcnt */
};
/**< Maximum number of nb_segs allowed. */
--
2.7.4
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API
2020-07-07 3:19 3% ` Feifei Wang
@ 2020-07-07 7:40 0% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-07 7:40 UTC (permalink / raw)
To: Feifei Wang, Honnappa Nagarahalli, Konstantin Ananyev, Neil Horman
Cc: dev, nd
On 07/07/2020 04:19, Feifei Wang wrote:
>
>
>> -----Original Message-----
>> From: Kinsella, Ray <mdr@ashroe.eu>
>> Sent: 6 July 2020 14:23
>> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Feifei Wang
>> <Feifei.Wang2@arm.com>; Konstantin Ananyev
>> <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>
>> Cc: dev@dpdk.org; nd <nd@arm.com>
>> Subject: Re: [PATCH 1/3] ring: remove experimental tag for ring reset API
>>
>>
>>
>> On 03/07/2020 19:46, Honnappa Nagarahalli wrote:
>>> <snip>
>>>
>>>>
>>>> On 03/07/2020 11:26, Feifei Wang wrote:
>>>>> Remove the experimental tag for rte_ring_reset API that have been
>>>>> around for 4 releases.
>>>>>
>>>>> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
>>>>> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
>>>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>>>> ---
>>>>> lib/librte_ring/rte_ring.h | 3 ---
>>>>> lib/librte_ring/rte_ring_version.map | 4 +---
>>>>> 2 files changed, 1 insertion(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
>>>>> index f67141482..7181c33b4 100644
>>>>> --- a/lib/librte_ring/rte_ring.h
>>>>> +++ b/lib/librte_ring/rte_ring.h
>>>>> @@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void
>> **obj_p)
>>>>> *
>>>>> * This function flush all the elements in a ring
>>>>> *
>>>>> - * @b EXPERIMENTAL: this API may change without prior notice
>>>>> - *
>>>>> * @warning
>>>>> * Make sure the ring is not in use while calling this function.
>>>>> *
>>>>> * @param r
>>>>> * A pointer to the ring structure.
>>>>> */
>>>>> -__rte_experimental
>>>>> void
>>>>> rte_ring_reset(struct rte_ring *r);
>>>>>
>>>>> diff --git a/lib/librte_ring/rte_ring_version.map
>>>>> b/lib/librte_ring/rte_ring_version.map
>>>>> index e88c143cf..aec6f3820 100644
>>>>> --- a/lib/librte_ring/rte_ring_version.map
>>>>> +++ b/lib/librte_ring/rte_ring_version.map
>>>>> @@ -8,6 +8,7 @@ DPDK_20.0 {
>>>>> rte_ring_init;
>>>>> rte_ring_list_dump;
>>>>> rte_ring_lookup;
>>>>> + rte_ring_reset;
>>>>>
>>>>> local: *;
>>>>> };
>>>>> @@ -15,9 +16,6 @@ DPDK_20.0 {
>>>>> EXPERIMENTAL {
>>>>> global:
>>>>>
>>>>> - # added in 19.08
>>>>> - rte_ring_reset;
>>>>> -
>>>>> # added in 20.02
>>>>> rte_ring_create_elem;
>>>>> rte_ring_get_memsize_elem;
>>>>
>>>> So strictly speaking, rte_ring_reset is part of the DPDK_21 ABI, not
>>>> the v20.0 ABI.
>>> Thanks Ray for clarifying this.
>>>
> Thanks very much for pointing this out.
>>>>
>>>> The way to solve is to add it the DPDK_21 ABI in the map file.
>>>> And then use the VERSION_SYMBOL_EXPERIMENTAL to alias to
>> experimental
>>>> if necessary.
>>> Is using VERSION_SYMBOL_EXPERIMENTAL a must?
>>
>> Purely at the discretion of the contributor and maintainer.
>> If it has been around for a while, applications are using it and changing the
>> symbol will break them.
>>
>> You may choose to provide the alias or not.
> OK, in the new patch version, I will add it to the DPDK_21 ABI, but
> VERSION_SYMBOL_EXPERIMENTAL will not be added, because if it is added in this
> version, it would still need to be removed in the near future.
>
> Thanks very much for your review.
That is 100% fine.
>>
>>> The documentation also seems to be vague. It says " The macro is used
>> when a symbol matures to become part of the stable ABI, to provide an alias
>> to experimental for some time". What does 'some time' mean?
>>
>> "Some time" is a bit vague alright, should be "until the next major ABI
>> version" - I will fix.
>>
>>>
>>>>
>>>> https://doc.dpdk.org/guides/contributing/abi_versioning.html#versioni
>>>> ng-
>>>> macros
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API
2020-07-06 6:23 3% ` Kinsella, Ray
@ 2020-07-07 3:19 3% ` Feifei Wang
2020-07-07 7:40 0% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Feifei Wang @ 2020-07-07 3:19 UTC (permalink / raw)
To: Kinsella, Ray, Honnappa Nagarahalli, Konstantin Ananyev, Neil Horman
Cc: dev, nd, nd
> -----Original Message-----
> From: Kinsella, Ray <mdr@ashroe.eu>
> Sent: 6 July 2020 14:23
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Feifei Wang
> <Feifei.Wang2@arm.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: Re: [PATCH 1/3] ring: remove experimental tag for ring reset API
>
>
>
> On 03/07/2020 19:46, Honnappa Nagarahalli wrote:
> > <snip>
> >
> >>
> >> On 03/07/2020 11:26, Feifei Wang wrote:
> >>> Remove the experimental tag for rte_ring_reset API that have been
> >>> around for 4 releases.
> >>>
> >>> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> >>> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> >>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> >>> ---
> >>> lib/librte_ring/rte_ring.h | 3 ---
> >>> lib/librte_ring/rte_ring_version.map | 4 +---
> >>> 2 files changed, 1 insertion(+), 6 deletions(-)
> >>>
> >>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> >>> index f67141482..7181c33b4 100644
> >>> --- a/lib/librte_ring/rte_ring.h
> >>> +++ b/lib/librte_ring/rte_ring.h
> >>> @@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void
> **obj_p)
> >>> *
> >>> * This function flush all the elements in a ring
> >>> *
> >>> - * @b EXPERIMENTAL: this API may change without prior notice
> >>> - *
> >>> * @warning
> >>> * Make sure the ring is not in use while calling this function.
> >>> *
> >>> * @param r
> >>> * A pointer to the ring structure.
> >>> */
> >>> -__rte_experimental
> >>> void
> >>> rte_ring_reset(struct rte_ring *r);
> >>>
> >>> diff --git a/lib/librte_ring/rte_ring_version.map
> >>> b/lib/librte_ring/rte_ring_version.map
> >>> index e88c143cf..aec6f3820 100644
> >>> --- a/lib/librte_ring/rte_ring_version.map
> >>> +++ b/lib/librte_ring/rte_ring_version.map
> >>> @@ -8,6 +8,7 @@ DPDK_20.0 {
> >>> rte_ring_init;
> >>> rte_ring_list_dump;
> >>> rte_ring_lookup;
> >>> + rte_ring_reset;
> >>>
> >>> local: *;
> >>> };
> >>> @@ -15,9 +16,6 @@ DPDK_20.0 {
> >>> EXPERIMENTAL {
> >>> global:
> >>>
> >>> - # added in 19.08
> >>> - rte_ring_reset;
> >>> -
> >>> # added in 20.02
> >>> rte_ring_create_elem;
> >>> rte_ring_get_memsize_elem;
> >>
> >> So strictly speaking, rte_ring_reset is part of the DPDK_21 ABI, not
> >> the v20.0 ABI.
> > Thanks Ray for clarifying this.
> >
Thanks very much for pointing this.
> >>
> >> The way to solve is to add it the DPDK_21 ABI in the map file.
> >> And then use the VERSION_SYMBOL_EXPERIMENTAL to alias to
> experimental
> >> if necessary.
> > Is using VERSION_SYMBOL_EXPERIMENTAL a must?
>
> Purely at the discretion of the contributor and maintainer.
> If it has been around for a while, applications are using it and changing the
> symbol will break them.
>
> You may choose to provide the alias or not.
Ok, in the new patch version, I will add it into the DPDK_21 ABI, but the
VERSION_SYMBOL_EXPERIMENTAL alias will not be added, because if it were added
in this version, it would still need to be removed in the near future.
Thanks very much for your review.
>
> > The documentation also seems to be vague. It says " The macro is used
> when a symbol matures to become part of the stable ABI, to provide an alias
> to experimental for some time". What does 'some time' mean?
>
> "Some time" is a bit vague alright, should be "until the next major ABI
> version" - I will fix.
>
> >
> >>
> >> https://doc.dpdk.org/guides/contributing/abi_versioning.html#versioni
> >> ng-
> >> macros
^ permalink raw reply [relevance 3%]
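
[Editor's note] Ray's suggestion above — moving rte_ring_reset out of the
EXPERIMENTAL section and into the next major ABI node — would look roughly
like the following sketch of rte_ring_version.map (illustrative only; the
surrounding 19.08/20.02 entries are elided):

```
DPDK_21 {
	global:

	rte_ring_reset;
} DPDK_20.0;
```

If the contributor had also chosen to keep the experimental alias for one
more release, the C file would additionally mark the symbol with the
VERSION_SYMBOL_EXPERIMENTAL macro described in the versioning guide linked
above; as the thread concludes, Feifei opted to skip the alias.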
* Re: [dpdk-dev] [PATCH v7 1/3] eal: disable function versioning on Windows
2020-07-06 12:22 0% ` Bruce Richardson
@ 2020-07-06 23:16 0% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-06 23:16 UTC (permalink / raw)
To: Bruce Richardson
Cc: Fady Bader, dev, tbashar, talshn, yohadt, dmitry.kozliuk,
harini.ramakrishnan, ocardona, pallavi.kadam, ranjit.menon,
olivier.matz, arybchenko, mdr, nhorman
06/07/2020 14:22, Bruce Richardson:
> On Mon, Jul 06, 2020 at 02:32:39PM +0300, Fady Bader wrote:
> > Function versioning implementation is not supported by Windows.
> > Function versioning is disabled on Windows.
> >
> > Signed-off-by: Fady Bader <fady@mellanox.com>
> > ---
> > doc/guides/windows_gsg/intro.rst | 4 ++++
> > lib/meson.build | 6 +++++-
> > 2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
> > index a0285732df..58c6246404 100644
> > --- a/doc/guides/windows_gsg/intro.rst
> > +++ b/doc/guides/windows_gsg/intro.rst
> > @@ -18,3 +18,7 @@ DPDK for Windows is currently a work in progress. Not all DPDK source files
> > compile. Support is being added in pieces so as to limit the overall scope
> > of any individual patch series. The goal is to be able to run any DPDK
> > application natively on Windows.
> > +
> > +The :doc:`../contributing/abi_policy` cannot be respected for Windows.
> > +Minor ABI versions may be incompatible
> > +because function versioning is not supported on Windows.
> > diff --git a/lib/meson.build b/lib/meson.build
> > index c1b9e1633f..dadf151f78 100644
> > --- a/lib/meson.build
> > +++ b/lib/meson.build
> > @@ -107,6 +107,10 @@ foreach l:libraries
> > shared_dep = declare_dependency(include_directories: includes)
> > static_dep = shared_dep
> > else
> > + if is_windows and use_function_versioning
> > + message('@0@: Function versioning is not supported by Windows.'
> > + .format(name))
> > + endif
> >
>
> This is ok here, but I think it might be better just moved to somewhere
> like config/meson.build, so that it is always just printed once for each
> build. I don't see an issue with having it printed even if there is no
> function versioning in the build itself.
Moving such a message to config/meson.build is the same
as moving it to the doc.
I prefer having a message each time library compatibility
is required but not possible.
> With or without the code move above, which is just a suggestion,
>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
OK thanks, I'll merge as is.
^ permalink raw reply [relevance 0%]
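
[Editor's note] Bruce's alternative above amounts to emitting the notice once
at configure time instead of once per library. A hedged sketch of what that
could look like (not the merged code; whether 'use_function_versioning' is
even visible at the top-level scope is an assumption — in the per-library
approach it is set in each library's own build file):

```
# Sketch only: a single configure-time notice instead of one per library.
if is_windows and use_function_versioning
    warning('Function versioning is not supported on Windows; ' +
            'minor ABI versions may be incompatible.')
endif
```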
* Re: [dpdk-dev] [PATCH v7] sched: make RED scaling configurable
@ 2020-07-06 23:09 3% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-06 23:09 UTC (permalink / raw)
To: Alan Dewar, Alan Dewar
Cc: dev, Yigit, Ferruh, Kantecki, Tomasz, Stephen Hemminger, dev,
Dumitrescu, Cristian, jasvinder.singh, david.marchand,
bruce.richardson
08/04/2019 15:29, Dumitrescu, Cristian:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 08/04/2019 10:24, Alan Dewar:
> > > On Fri, Apr 5, 2019 at 4:36 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> > > > On 1/16/2018 4:07 PM, alangordondewar@gmail.com wrote:
> > > > > From: Alan Dewar <alan.dewar@att.com>
> > > > >
> > > > > The RED code stores the weighted moving average in a 32-bit integer as
> > > > > a pseudo fixed-point floating number with 10 fractional bits. Twelve
> > > > > other bits are used to encode the filter weight, leaving just 10 bits
> > > > > for the queue length. This limits the maximum queue length supported
> > > > > by RED queues to 1024 packets.
> > > > >
> > > > > Introduce a new API to allow the RED scaling factor to be configured
> > > > > based upon maximum queue length. If this API is not called, the RED
> > > > > scaling factor remains at its default value.
> > > > >
> > > > > Added some new RED scaling unit-tests to test with RED queue-lengths
> > > > > up to 8192 packets long.
> > > > >
> > > > > Signed-off-by: Alan Dewar <alan.dewar@att.com>
> > > >
> > > > Hi Cristian, Alan,
> > > >
> > > > The v7 of this patch is sitting without any comment for more than a year.
> > > > What is the status of this patch? Is it still valid? What is blocking it?
> > > >
> > > > For reference patch:
> > > > https://patches.dpdk.org/patch/33837/
> > >
> > > We are still using this patch against DPDK 17.11 and 18.11 as part of
> > > the AT&T Vyatta NOS. It is needed to make WRED queues longer than
> > > 1024 packets work correctly. I'm afraid that I have no idea what is
> > > holding it up from being merged.
> >
> > It will be in a release when it will be merged in the git tree
> > dpdk-next-qos, managed by Cristian.
>
> I was hoping to get a review & ACK from Tomasz Kantecki, the author of the WRED code in DPDK, hence the lack of progress on this patch.
It seems nobody was able to provide any feedback after two years,
and it was never merged in the QoS git tree.
The handling of this patch is really a shame.
Alan, please rebase this patch.
If nothing is wrong in CI (including ABI check),
I will merge the next version.
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v6 04/10] eal: introduce thread uninit helper
2020-07-06 20:52 3% ` [dpdk-dev] [PATCH v6 02/10] eal: fix multiple definition of per lcore thread id David Marchand
@ 2020-07-06 20:52 3% ` David Marchand
1 sibling, 0 replies; 200+ results
From: David Marchand @ 2020-07-06 20:52 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, bruce.richardson, mdr, thomas, arybchenko, ktraynor,
ian.stokes, i.maximets, olivier.matz, konstantin.ananyev,
Jerin Jacob, Sunil Kumar Kori, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon
This is a preparation step for dynamically unregistering threads.
Since we explicitly allocate a per thread trace buffer in
__rte_thread_init, add an internal helper to free this buffer.
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v5:
- fixed windows build,
Changes since v4:
- renamed rte_thread_uninit and moved to eal_private.h,
- hid freeing helper,
Changes since v2:
- added missing stub for windows tracing support,
- moved free symbol to exported (experimental) ABI as a counterpart of
the alloc symbol we already had,
Changes since v1:
- rebased on master, removed Windows workaround wrt traces support,
---
lib/librte_eal/common/eal_common_thread.c | 9 +++++
lib/librte_eal/common/eal_common_trace.c | 49 +++++++++++++++++++----
lib/librte_eal/common/eal_private.h | 5 +++
lib/librte_eal/common/eal_trace.h | 1 +
lib/librte_eal/windows/eal.c | 7 +++-
5 files changed, 63 insertions(+), 8 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index fb06f8f802..6d1c87b1c2 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -20,6 +20,7 @@
#include "eal_internal_cfg.h"
#include "eal_private.h"
#include "eal_thread.h"
+#include "eal_trace.h"
RTE_DEFINE_PER_LCORE(unsigned int, _lcore_id) = LCORE_ID_ANY;
RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
@@ -161,6 +162,14 @@ __rte_thread_init(unsigned int lcore_id, rte_cpuset_t *cpuset)
__rte_trace_mem_per_thread_alloc();
}
+void
+__rte_thread_uninit(void)
+{
+ trace_mem_per_thread_free();
+
+ RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
+}
+
struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
diff --git a/lib/librte_eal/common/eal_common_trace.c b/lib/librte_eal/common/eal_common_trace.c
index 875553d7e5..b6da5537fe 100644
--- a/lib/librte_eal/common/eal_common_trace.c
+++ b/lib/librte_eal/common/eal_common_trace.c
@@ -101,7 +101,7 @@ eal_trace_fini(void)
{
if (!rte_trace_is_enabled())
return;
- trace_mem_per_thread_free();
+ trace_mem_free();
trace_metadata_destroy();
eal_trace_args_free();
}
@@ -370,24 +370,59 @@ __rte_trace_mem_per_thread_alloc(void)
rte_spinlock_unlock(&trace->lock);
}
+static void
+trace_mem_per_thread_free_unlocked(struct thread_mem_meta *meta)
+{
+ if (meta->area == TRACE_AREA_HUGEPAGE)
+ eal_free_no_trace(meta->mem);
+ else if (meta->area == TRACE_AREA_HEAP)
+ free(meta->mem);
+}
+
void
trace_mem_per_thread_free(void)
+{
+ struct trace *trace = trace_obj_get();
+ struct __rte_trace_header *header;
+ uint32_t count;
+
+ header = RTE_PER_LCORE(trace_mem);
+ if (header == NULL)
+ return;
+
+ rte_spinlock_lock(&trace->lock);
+ for (count = 0; count < trace->nb_trace_mem_list; count++) {
+ if (trace->lcore_meta[count].mem == header)
+ break;
+ }
+ if (count != trace->nb_trace_mem_list) {
+ struct thread_mem_meta *meta = &trace->lcore_meta[count];
+
+ trace_mem_per_thread_free_unlocked(meta);
+ if (count != trace->nb_trace_mem_list - 1) {
+ memmove(meta, meta + 1,
+ sizeof(*meta) *
+ (trace->nb_trace_mem_list - count - 1));
+ }
+ trace->nb_trace_mem_list--;
+ }
+ rte_spinlock_unlock(&trace->lock);
+}
+
+void
+trace_mem_free(void)
{
struct trace *trace = trace_obj_get();
uint32_t count;
- void *mem;
if (!rte_trace_is_enabled())
return;
rte_spinlock_lock(&trace->lock);
for (count = 0; count < trace->nb_trace_mem_list; count++) {
- mem = trace->lcore_meta[count].mem;
- if (trace->lcore_meta[count].area == TRACE_AREA_HUGEPAGE)
- eal_free_no_trace(mem);
- else if (trace->lcore_meta[count].area == TRACE_AREA_HEAP)
- free(mem);
+ trace_mem_per_thread_free_unlocked(&trace->lcore_meta[count]);
}
+ trace->nb_trace_mem_list = 0;
rte_spinlock_unlock(&trace->lock);
}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 5d8b53882d..a77ac7a963 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -709,4 +709,9 @@ eal_get_application_usage_hook(void);
*/
void __rte_thread_init(unsigned int lcore_id, rte_cpuset_t *cpuset);
+/**
+ * Uninitialize per-lcore info for current thread.
+ */
+void __rte_thread_uninit(void);
+
#endif /* _EAL_PRIVATE_H_ */
diff --git a/lib/librte_eal/common/eal_trace.h b/lib/librte_eal/common/eal_trace.h
index 8f60616156..92c5951c3a 100644
--- a/lib/librte_eal/common/eal_trace.h
+++ b/lib/librte_eal/common/eal_trace.h
@@ -106,6 +106,7 @@ int trace_metadata_create(void);
void trace_metadata_destroy(void);
int trace_mkdir(void);
int trace_epoch_time_save(void);
+void trace_mem_free(void);
void trace_mem_per_thread_free(void);
/* EAL interface */
diff --git a/lib/librte_eal/windows/eal.c b/lib/librte_eal/windows/eal.c
index 9f5d019e64..addac62ae5 100644
--- a/lib/librte_eal/windows/eal.c
+++ b/lib/librte_eal/windows/eal.c
@@ -17,10 +17,10 @@
#include <eal_filesystem.h>
#include <eal_options.h>
#include <eal_private.h>
-#include <rte_trace_point.h>
#include <rte_vfio.h>
#include "eal_hugepages.h"
+#include "eal_trace.h"
#include "eal_windows.h"
#define MEMSIZE_IF_NO_HUGE_PAGE (64ULL * 1024ULL * 1024ULL)
@@ -215,6 +215,11 @@ __rte_trace_mem_per_thread_alloc(void)
{
}
+void
+trace_mem_per_thread_free(void)
+{
+}
+
void
__rte_trace_point_emit_field(size_t sz, const char *field,
const char *type)
--
2.23.0
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v6 02/10] eal: fix multiple definition of per lcore thread id
@ 2020-07-06 20:52 3% ` David Marchand
2020-07-06 20:52 3% ` [dpdk-dev] [PATCH v6 04/10] eal: introduce thread uninit helper David Marchand
1 sibling, 0 replies; 200+ results
From: David Marchand @ 2020-07-06 20:52 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, bruce.richardson, mdr, thomas, arybchenko, ktraynor,
ian.stokes, i.maximets, olivier.matz, konstantin.ananyev,
Neil Horman, Cunming Liang
Because of the inline accessor + static declaration in rte_gettid(),
we end up with multiple symbols for RTE_PER_LCORE(_thread_id).
Each compilation unit will pay a cost when accessing this information
for the first time.
$ nm build/app/dpdk-testpmd | grep per_lcore__thread_id
0000000000000054 d per_lcore__thread_id.5037
0000000000000040 d per_lcore__thread_id.5103
0000000000000048 d per_lcore__thread_id.5259
000000000000004c d per_lcore__thread_id.5259
0000000000000044 d per_lcore__thread_id.5933
0000000000000058 d per_lcore__thread_id.6261
0000000000000050 d per_lcore__thread_id.7378
000000000000005c d per_lcore__thread_id.7496
000000000000000c d per_lcore__thread_id.8016
0000000000000010 d per_lcore__thread_id.8431
Make it global as part of the DPDK_21 stable ABI.
Fixes: ef76436c6834 ("eal: get unique thread id")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
---
lib/librte_eal/common/eal_common_thread.c | 1 +
lib/librte_eal/include/rte_eal.h | 3 ++-
lib/librte_eal/rte_eal_version.map | 7 +++++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 7be80c292e..fd13453fee 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -22,6 +22,7 @@
#include "eal_thread.h"
RTE_DEFINE_PER_LCORE(unsigned int, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
static RTE_DEFINE_PER_LCORE(unsigned int, _socket_id) =
(unsigned int)SOCKET_ID_ANY;
static RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
diff --git a/lib/librte_eal/include/rte_eal.h b/lib/librte_eal/include/rte_eal.h
index 2f9ed298de..2edf8c6556 100644
--- a/lib/librte_eal/include/rte_eal.h
+++ b/lib/librte_eal/include/rte_eal.h
@@ -447,6 +447,8 @@ enum rte_intr_mode rte_eal_vfio_intr_mode(void);
*/
int rte_sys_gettid(void);
+RTE_DECLARE_PER_LCORE(int, _thread_id);
+
/**
* Get system unique thread id.
*
@@ -456,7 +458,6 @@ int rte_sys_gettid(void);
*/
static inline int rte_gettid(void)
{
- static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
if (RTE_PER_LCORE(_thread_id) == -1)
RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
return RTE_PER_LCORE(_thread_id);
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 196eef5afa..0d42d44ce9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,13 @@ DPDK_20.0 {
local: *;
};
+DPDK_21 {
+ global:
+
+ per_lcore__thread_id;
+
+} DPDK_20.0;
+
EXPERIMENTAL {
global:
--
2.23.0
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v3 0/3] Experimental/internal libraries cleanup
2020-07-05 19:55 3% ` Thomas Monjalon
2020-07-06 8:02 3% ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
@ 2020-07-06 16:57 0% ` Medvedkin, Vladimir
1 sibling, 0 replies; 200+ results
From: Medvedkin, Vladimir @ 2020-07-06 16:57 UTC (permalink / raw)
To: Thomas Monjalon, David Marchand
Cc: dev, honnappa.nagarahalli, techboard, Jiayu Hu, Yipeng Wang,
Sameh Gobriel, Nipun Gupta, Hemant Agrawal
On 05/07/2020 20:55, Thomas Monjalon wrote:
> +Cc maintainers of the problematic libraries:
> - librte_fib
> - librte_rib
> - librte_gro
> - librte_member
> - librte_rawdev
>
> 26/06/2020 10:16, David Marchand:
>> Following discussions on the mailing list and the 05/20 TB meeting, here
>> is a series that drops the special versioning for non stable libraries.
>>
>> Two notes:
>>
>> - RIB/FIB library is not referenced in the API doxygen index, is this
>> intentional?
> Vladimir please, could you fix the miss in the doxygen index?
Sure, I'll send a patch.
>
>> - I inspected MAINTAINERS: librte_gro, librte_member and librte_rawdev are
>> announced as experimental while their functions are part of the 20
>> stable ABI (in .map files + no __rte_experimental marking).
>> Their fate must be discussed.
> I would suggest removing EXPERIMENTAL flag for gro, member and rawdev.
> They are probably already considered stable for a lot of users.
> Maintainers, are you OK to follow the ABI compatibility rules
> for these libraries? Do you feel these libraries are mature enough?
>
>
>
--
Regards,
Vladimir
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
2020-07-06 15:32 0% ` Phil Yang
@ 2020-07-06 15:40 0% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-06 15:40 UTC (permalink / raw)
To: Phil Yang
Cc: erik.g.carrillo, dev, jerinj, Honnappa Nagarahalli, drc,
Ruifeng Wang, Dharmik Thakkar, nd, david.marchand, mdr,
Neil Horman, Dodji Seketeli
06/07/2020 17:32, Phil Yang:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 02/07/2020 07:26, Phil Yang:
> > > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > > * - op: RTE_EVENT_OP_NEW
> > > * - event_type: RTE_EVENT_TYPE_TIMER
> > > */
> > > - volatile enum rte_event_timer_state state;
> > > + enum rte_event_timer_state state;
> > > /**< State of the event timer. */
> >
> > Why do you remove the volatile keyword?
> > It is not explained in the commit log.
> Using the C11 atomic operations generates the same instructions for the non-volatile and volatile versions.
> Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs
>
> > This change is triggering a warning in the ABI check:
> > http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> > Moving from volatile to non-volatile is probably not an issue.
> > I expect the code generated for the volatile case to work the same
> > in non-volatile case. Do you confirm?
> They generate the same instructions, so either way will work.
> Do I need to revert it to the volatile version?
Either you revert, or you add explanation in the commit log
+ exception in libabigail.abignore
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
2020-07-06 10:04 4% ` Thomas Monjalon
@ 2020-07-06 15:32 0% ` Phil Yang
2020-07-06 15:40 0% ` Thomas Monjalon
0 siblings, 1 reply; 200+ results
From: Phil Yang @ 2020-07-06 15:32 UTC (permalink / raw)
To: thomas
Cc: erik.g.carrillo, dev, jerinj, Honnappa Nagarahalli, drc,
Ruifeng Wang, Dharmik Thakkar, nd, david.marchand, mdr,
Neil Horman, Dodji Seketeli
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Monday, July 6, 2020 6:04 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: erik.g.carrillo@intel.com; dev@dpdk.org; jerinj@marvell.com; Honnappa
> Nagarahalli <Honnappa.Nagarahalli@arm.com>; drc@linux.vnet.ibm.com;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; Dharmik Thakkar
> <Dharmik.Thakkar@arm.com>; nd <nd@arm.com>;
> david.marchand@redhat.com; mdr@ashroe.eu; Neil Horman
> <nhorman@tuxdriver.com>; Dodji Seketeli <dodji@redhat.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11
> atomics
>
> 02/07/2020 07:26, Phil Yang:
> > The implementation-specific opaque data is shared between arm and
> cancel
> > operations. The state flag acts as a guard variable to make sure the
> > update of opaque data is synchronized. This patch uses c11 atomics with
> > explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> > to synchronize the opaque data between timer arm and cancel threads.
>
> I think we should write C11 (uppercase).
Agreed.
I will change it in the next version.
>
> Please, in your explanations, try to be more specific.
> Naming fields may help to make things clear.
OK. Thanks.
>
> [...]
> > --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> > +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> > @@ -467,7 +467,7 @@ struct rte_event_timer {
> > * - op: RTE_EVENT_OP_NEW
> > * - event_type: RTE_EVENT_TYPE_TIMER
> > */
> > - volatile enum rte_event_timer_state state;
> > + enum rte_event_timer_state state;
> > /**< State of the event timer. */
>
> Why do you remove the volatile keyword?
> It is not explained in the commit log.
Using the C11 atomic operations generates the same instructions for the non-volatile and volatile versions.
Please check the sample code here: https://gcc.godbolt.org/z/8x5rWs
>
> This change is triggering a warning in the ABI check:
> http://mails.dpdk.org/archives/test-report/2020-July/140440.html
> Moving from volatile to non-volatile is probably not an issue.
> I expect the code generated for the volatile case to work the same
> in non-volatile case. Do you confirm?
They generate the same instructions, so either way will work.
Do I need to revert it to the volatile version?
Thanks,
Phil
>
> In any case, we need an explanation and an ABI check exception.
>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v5 04/10] eal: introduce thread uninit helper
2020-07-06 14:15 3% ` [dpdk-dev] [PATCH v5 02/10] eal: fix multiple definition of per lcore thread id David Marchand
@ 2020-07-06 14:16 3% ` David Marchand
1 sibling, 0 replies; 200+ results
From: David Marchand @ 2020-07-06 14:16 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, bruce.richardson, mdr, thomas, arybchenko, ktraynor,
ian.stokes, i.maximets, olivier.matz, konstantin.ananyev,
Jerin Jacob, Sunil Kumar Kori, Harini Ramakrishnan, Omar Cardona,
Pallavi Kadam, Ranjit Menon
This is a preparation step for dynamically unregistering threads.
Since we explicitly allocate a per thread trace buffer in
__rte_thread_init, add an internal helper to free this buffer.
Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changes since v4:
- renamed rte_thread_uninit and moved to eal_private.h,
- hid freeing helper,
Changes since v2:
- added missing stub for windows tracing support,
- moved free symbol to exported (experimental) ABI as a counterpart of
the alloc symbol we already had,
Changes since v1:
- rebased on master, removed Windows workaround wrt traces support,
---
lib/librte_eal/common/eal_common_thread.c | 9 +++++
lib/librte_eal/common/eal_common_trace.c | 49 +++++++++++++++++++----
lib/librte_eal/common/eal_private.h | 5 +++
lib/librte_eal/common/eal_trace.h | 1 +
lib/librte_eal/windows/eal.c | 5 +++
5 files changed, 62 insertions(+), 7 deletions(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index fb06f8f802..6d1c87b1c2 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -20,6 +20,7 @@
#include "eal_internal_cfg.h"
#include "eal_private.h"
#include "eal_thread.h"
+#include "eal_trace.h"
RTE_DEFINE_PER_LCORE(unsigned int, _lcore_id) = LCORE_ID_ANY;
RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
@@ -161,6 +162,14 @@ __rte_thread_init(unsigned int lcore_id, rte_cpuset_t *cpuset)
__rte_trace_mem_per_thread_alloc();
}
+void
+__rte_thread_uninit(void)
+{
+ trace_mem_per_thread_free();
+
+ RTE_PER_LCORE(_lcore_id) = LCORE_ID_ANY;
+}
+
struct rte_thread_ctrl_params {
void *(*start_routine)(void *);
void *arg;
diff --git a/lib/librte_eal/common/eal_common_trace.c b/lib/librte_eal/common/eal_common_trace.c
index 875553d7e5..b6da5537fe 100644
--- a/lib/librte_eal/common/eal_common_trace.c
+++ b/lib/librte_eal/common/eal_common_trace.c
@@ -101,7 +101,7 @@ eal_trace_fini(void)
{
if (!rte_trace_is_enabled())
return;
- trace_mem_per_thread_free();
+ trace_mem_free();
trace_metadata_destroy();
eal_trace_args_free();
}
@@ -370,24 +370,59 @@ __rte_trace_mem_per_thread_alloc(void)
rte_spinlock_unlock(&trace->lock);
}
+static void
+trace_mem_per_thread_free_unlocked(struct thread_mem_meta *meta)
+{
+ if (meta->area == TRACE_AREA_HUGEPAGE)
+ eal_free_no_trace(meta->mem);
+ else if (meta->area == TRACE_AREA_HEAP)
+ free(meta->mem);
+}
+
void
trace_mem_per_thread_free(void)
+{
+ struct trace *trace = trace_obj_get();
+ struct __rte_trace_header *header;
+ uint32_t count;
+
+ header = RTE_PER_LCORE(trace_mem);
+ if (header == NULL)
+ return;
+
+ rte_spinlock_lock(&trace->lock);
+ for (count = 0; count < trace->nb_trace_mem_list; count++) {
+ if (trace->lcore_meta[count].mem == header)
+ break;
+ }
+ if (count != trace->nb_trace_mem_list) {
+ struct thread_mem_meta *meta = &trace->lcore_meta[count];
+
+ trace_mem_per_thread_free_unlocked(meta);
+ if (count != trace->nb_trace_mem_list - 1) {
+ memmove(meta, meta + 1,
+ sizeof(*meta) *
+ (trace->nb_trace_mem_list - count - 1));
+ }
+ trace->nb_trace_mem_list--;
+ }
+ rte_spinlock_unlock(&trace->lock);
+}
+
+void
+trace_mem_free(void)
{
struct trace *trace = trace_obj_get();
uint32_t count;
- void *mem;
if (!rte_trace_is_enabled())
return;
rte_spinlock_lock(&trace->lock);
for (count = 0; count < trace->nb_trace_mem_list; count++) {
- mem = trace->lcore_meta[count].mem;
- if (trace->lcore_meta[count].area == TRACE_AREA_HUGEPAGE)
- eal_free_no_trace(mem);
- else if (trace->lcore_meta[count].area == TRACE_AREA_HEAP)
- free(mem);
+ trace_mem_per_thread_free_unlocked(&trace->lcore_meta[count]);
}
+ trace->nb_trace_mem_list = 0;
rte_spinlock_unlock(&trace->lock);
}
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index 5d8b53882d..a77ac7a963 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -709,4 +709,9 @@ eal_get_application_usage_hook(void);
*/
void __rte_thread_init(unsigned int lcore_id, rte_cpuset_t *cpuset);
+/**
+ * Uninitialize per-lcore info for current thread.
+ */
+void __rte_thread_uninit(void);
+
#endif /* _EAL_PRIVATE_H_ */
diff --git a/lib/librte_eal/common/eal_trace.h b/lib/librte_eal/common/eal_trace.h
index 8f60616156..92c5951c3a 100644
--- a/lib/librte_eal/common/eal_trace.h
+++ b/lib/librte_eal/common/eal_trace.h
@@ -106,6 +106,7 @@ int trace_metadata_create(void);
void trace_metadata_destroy(void);
int trace_mkdir(void);
int trace_epoch_time_save(void);
+void trace_mem_free(void);
void trace_mem_per_thread_free(void);
/* EAL interface */
diff --git a/lib/librte_eal/windows/eal.c b/lib/librte_eal/windows/eal.c
index 9f5d019e64..a11daee68b 100644
--- a/lib/librte_eal/windows/eal.c
+++ b/lib/librte_eal/windows/eal.c
@@ -215,6 +215,11 @@ __rte_trace_mem_per_thread_alloc(void)
{
}
+void
+trace_mem_per_thread_free(void)
+{
+}
+
void
__rte_trace_point_emit_field(size_t sz, const char *field,
const char *type)
--
2.23.0
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v5 02/10] eal: fix multiple definition of per lcore thread id
@ 2020-07-06 14:15 3% ` David Marchand
2020-07-06 14:16 3% ` [dpdk-dev] [PATCH v5 04/10] eal: introduce thread uninit helper David Marchand
1 sibling, 0 replies; 200+ results
From: David Marchand @ 2020-07-06 14:15 UTC (permalink / raw)
To: dev
Cc: jerinjacobk, bruce.richardson, mdr, thomas, arybchenko, ktraynor,
ian.stokes, i.maximets, olivier.matz, konstantin.ananyev,
Neil Horman, Cunming Liang
Because of the inline accessor + static declaration in rte_gettid(),
we end up with multiple symbols for RTE_PER_LCORE(_thread_id).
Each compilation unit will pay a cost when accessing this information
for the first time.
$ nm build/app/dpdk-testpmd | grep per_lcore__thread_id
0000000000000054 d per_lcore__thread_id.5037
0000000000000040 d per_lcore__thread_id.5103
0000000000000048 d per_lcore__thread_id.5259
000000000000004c d per_lcore__thread_id.5259
0000000000000044 d per_lcore__thread_id.5933
0000000000000058 d per_lcore__thread_id.6261
0000000000000050 d per_lcore__thread_id.7378
000000000000005c d per_lcore__thread_id.7496
000000000000000c d per_lcore__thread_id.8016
0000000000000010 d per_lcore__thread_id.8431
Make it global as part of the DPDK_21 stable ABI.
Fixes: ef76436c6834 ("eal: get unique thread id")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
---
lib/librte_eal/common/eal_common_thread.c | 1 +
lib/librte_eal/include/rte_eal.h | 3 ++-
lib/librte_eal/rte_eal_version.map | 7 +++++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
index 7be80c292e..fd13453fee 100644
--- a/lib/librte_eal/common/eal_common_thread.c
+++ b/lib/librte_eal/common/eal_common_thread.c
@@ -22,6 +22,7 @@
#include "eal_thread.h"
RTE_DEFINE_PER_LCORE(unsigned int, _lcore_id) = LCORE_ID_ANY;
+RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
static RTE_DEFINE_PER_LCORE(unsigned int, _socket_id) =
(unsigned int)SOCKET_ID_ANY;
static RTE_DEFINE_PER_LCORE(rte_cpuset_t, _cpuset);
diff --git a/lib/librte_eal/include/rte_eal.h b/lib/librte_eal/include/rte_eal.h
index 2f9ed298de..2edf8c6556 100644
--- a/lib/librte_eal/include/rte_eal.h
+++ b/lib/librte_eal/include/rte_eal.h
@@ -447,6 +447,8 @@ enum rte_intr_mode rte_eal_vfio_intr_mode(void);
*/
int rte_sys_gettid(void);
+RTE_DECLARE_PER_LCORE(int, _thread_id);
+
/**
* Get system unique thread id.
*
@@ -456,7 +458,6 @@ int rte_sys_gettid(void);
*/
static inline int rte_gettid(void)
{
- static RTE_DEFINE_PER_LCORE(int, _thread_id) = -1;
if (RTE_PER_LCORE(_thread_id) == -1)
RTE_PER_LCORE(_thread_id) = rte_sys_gettid();
return RTE_PER_LCORE(_thread_id);
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index 196eef5afa..0d42d44ce9 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -221,6 +221,13 @@ DPDK_20.0 {
local: *;
};
+DPDK_21 {
+ global:
+
+ per_lcore__thread_id;
+
+} DPDK_20.0;
+
EXPERIMENTAL {
global:
--
2.23.0
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v7 1/3] eal: disable function versioning on Windows
2020-07-06 11:32 5% ` [dpdk-dev] [PATCH v7 1/3] eal: disable function versioning " Fady Bader
@ 2020-07-06 12:22 0% ` Bruce Richardson
2020-07-06 23:16 0% ` Thomas Monjalon
0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-06 12:22 UTC (permalink / raw)
To: Fady Bader
Cc: dev, thomas, tbashar, talshn, yohadt, dmitry.kozliuk,
harini.ramakrishnan, ocardona, pallavi.kadam, ranjit.menon,
olivier.matz, arybchenko, mdr, nhorman
On Mon, Jul 06, 2020 at 02:32:39PM +0300, Fady Bader wrote:
> Function versioning implementation is not supported by Windows.
> Function versioning is disabled on Windows.
>
> Signed-off-by: Fady Bader <fady@mellanox.com>
> ---
> doc/guides/windows_gsg/intro.rst | 4 ++++
> lib/meson.build | 6 +++++-
> 2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
> index a0285732df..58c6246404 100644
> --- a/doc/guides/windows_gsg/intro.rst
> +++ b/doc/guides/windows_gsg/intro.rst
> @@ -18,3 +18,7 @@ DPDK for Windows is currently a work in progress. Not all DPDK source files
> compile. Support is being added in pieces so as to limit the overall scope
> of any individual patch series. The goal is to be able to run any DPDK
> application natively on Windows.
> +
> +The :doc:`../contributing/abi_policy` cannot be respected for Windows.
> +Minor ABI versions may be incompatible
> +because function versioning is not supported on Windows.
> diff --git a/lib/meson.build b/lib/meson.build
> index c1b9e1633f..dadf151f78 100644
> --- a/lib/meson.build
> +++ b/lib/meson.build
> @@ -107,6 +107,10 @@ foreach l:libraries
> shared_dep = declare_dependency(include_directories: includes)
> static_dep = shared_dep
> else
> + if is_windows and use_function_versioning
> + message('@0@: Function versioning is not supported by Windows.'
> + .format(name))
> + endif
>
This is ok here, but I think it might be better just moved to somewhere
like config/meson.build, so that it is always just printed once for each
build. I don't see an issue with having it printed even if there is no
function versioning in the build itself.
> if use_function_versioning
> cflags += '-DRTE_USE_FUNCTION_VERSIONING'
> @@ -138,7 +142,7 @@ foreach l:libraries
> include_directories: includes,
> dependencies: static_deps)
>
> - if not use_function_versioning
> + if not use_function_versioning or is_windows
> # use pre-build objects to build shared lib
> sources = []
> objs += static_lib.extract_all_objects(recursive: false)
> --
> 2.16.1.windows.4
>
With or without the code move above, which is just a suggestion,
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] [PATCH v7 1/3] eal: disable function versioning on Windows
@ 2020-07-06 11:32 5% ` Fady Bader
2020-07-06 12:22 0% ` Bruce Richardson
0 siblings, 1 reply; 200+ results
From: Fady Bader @ 2020-07-06 11:32 UTC (permalink / raw)
To: dev
Cc: thomas, tbashar, talshn, yohadt, dmitry.kozliuk,
harini.ramakrishnan, ocardona, pallavi.kadam, ranjit.menon,
olivier.matz, arybchenko, mdr, nhorman
Function versioning implementation is not supported by Windows.
Function versioning is disabled on Windows.
Signed-off-by: Fady Bader <fady@mellanox.com>
---
doc/guides/windows_gsg/intro.rst | 4 ++++
lib/meson.build | 6 +++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/doc/guides/windows_gsg/intro.rst b/doc/guides/windows_gsg/intro.rst
index a0285732df..58c6246404 100644
--- a/doc/guides/windows_gsg/intro.rst
+++ b/doc/guides/windows_gsg/intro.rst
@@ -18,3 +18,7 @@ DPDK for Windows is currently a work in progress. Not all DPDK source files
compile. Support is being added in pieces so as to limit the overall scope
of any individual patch series. The goal is to be able to run any DPDK
application natively on Windows.
+
+The :doc:`../contributing/abi_policy` cannot be respected for Windows.
+Minor ABI versions may be incompatible
+because function versioning is not supported on Windows.
diff --git a/lib/meson.build b/lib/meson.build
index c1b9e1633f..dadf151f78 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -107,6 +107,10 @@ foreach l:libraries
shared_dep = declare_dependency(include_directories: includes)
static_dep = shared_dep
else
+ if is_windows and use_function_versioning
+ message('@0@: Function versioning is not supported by Windows.'
+ .format(name))
+ endif
if use_function_versioning
cflags += '-DRTE_USE_FUNCTION_VERSIONING'
@@ -138,7 +142,7 @@ foreach l:libraries
include_directories: includes,
dependencies: static_deps)
- if not use_function_versioning
+ if not use_function_versioning or is_windows
# use pre-build objects to build shared lib
sources = []
objs += static_lib.extract_all_objects(recursive: false)
--
2.16.1.windows.4
^ permalink raw reply [relevance 5%]
* Re: [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics
@ 2020-07-06 10:04 4% ` Thomas Monjalon
2020-07-06 15:32 0% ` Phil Yang
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-06 10:04 UTC (permalink / raw)
To: Phil Yang
Cc: erik.g.carrillo, dev, jerinj, Honnappa.Nagarahalli, drc,
Ruifeng.Wang, Dharmik.Thakkar, nd, david.marchand, mdr,
Neil Horman, Dodji Seketeli
02/07/2020 07:26, Phil Yang:
> The implementation-specific opaque data is shared between arm and cancel
> operations. The state flag acts as a guard variable to make sure the
> update of opaque data is synchronized. This patch uses c11 atomics with
> explicit one way memory barrier instead of full barriers rte_smp_w/rmb()
> to synchronize the opaque data between timer arm and cancel threads.
I think we should write C11 (uppercase).
Please, in your explanations, try to be more specific.
Naming fields may help to make things clear.
[...]
> --- a/lib/librte_eventdev/rte_event_timer_adapter.h
> +++ b/lib/librte_eventdev/rte_event_timer_adapter.h
> @@ -467,7 +467,7 @@ struct rte_event_timer {
> * - op: RTE_EVENT_OP_NEW
> * - event_type: RTE_EVENT_TYPE_TIMER
> */
> - volatile enum rte_event_timer_state state;
> + enum rte_event_timer_state state;
> /**< State of the event timer. */
Why do you remove the volatile keyword?
It is not explained in the commit log.
This change is triggering a warning in the ABI check:
http://mails.dpdk.org/archives/test-report/2020-July/140440.html
Moving from volatile to non-volatile is probably not an issue.
I expect the code generated for the volatile case to work the same
in non-volatile case. Do you confirm?
In any case, we need an explanation and an ABI check exception.
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [pull-request] next-eventdev 20.08 RC1
@ 2020-07-06 9:57 3% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-06 9:57 UTC (permalink / raw)
To: Jerin Jacob Kollanukkaran; +Cc: dev, phil.yang
05/07/2020 05:41, Jerin Jacob Kollanukkaran:
> http://dpdk.org/git/next/dpdk-next-eventdev
>
> ----------------------------------------------------------------
> Harman Kalra (1):
> event/octeontx: fix memory corruption
>
> Harry van Haaren (1):
> examples/eventdev_pipeline: fix 32-bit coremask logic
>
> Pavan Nikhilesh (3):
> event/octeontx2: fix device reconfigure
> event/octeontx2: fix sub event type violation
> event/octeontx2: improve datapath memory locality
Pulled patches above.
> Phil Yang (4):
> eventdev: fix race condition on timer list counter
> eventdev: use c11 atomics for lcore timer armed flag
> eventdev: remove redundant code
> eventdev: relax smp barriers with c11 atomics
I cannot merge this C11 series because of an ABI breakage:
http://mails.dpdk.org/archives/test-report/2020-July/140440.html
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [dpdk-techboard] [PATCH v3 0/3] Experimental/internal libraries cleanup
2020-07-06 8:02 3% ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
@ 2020-07-06 8:12 0% ` Thomas Monjalon
0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2020-07-06 8:12 UTC (permalink / raw)
To: Bruce Richardson
Cc: David Marchand, dev, honnappa.nagarahalli, techboard, Jiayu Hu,
Yipeng Wang, Sameh Gobriel, Vladimir Medvedkin, Nipun Gupta,
Hemant Agrawal
06/07/2020 10:02, Bruce Richardson:
> On Sun, Jul 05, 2020 at 09:55:41PM +0200, Thomas Monjalon wrote:
> > +Cc maintainers of the problematic libraries:
> > - librte_fib
> > - librte_rib
> > - librte_gro
> > - librte_member
> > - librte_rawdev
> >
> > 26/06/2020 10:16, David Marchand:
> > > Following discussions on the mailing list and the 05/20 TB meeting, here
> > > is a series that drops the special versioning for non stable libraries.
> > >
> > > Two notes:
> > >
> > > - RIB/FIB library is not referenced in the API doxygen index, is this
> > > intentional?
> >
> > Vladimir please, could you fix the miss in the doxygen index?
> >
> > > - I inspected MAINTAINERS: librte_gro, librte_member and librte_rawdev are
> > > announced as experimental while their functions are part of the 20
> > > stable ABI (in .map files + no __rte_experimental marking).
> > > Their fate must be discussed.
> >
> > I would suggest removing EXPERIMENTAL flag for gro, member and rawdev.
> > They are probably already considered stable for a lot of users.
> > Maintainers, are you OK to follow the ABI compatibility rules
> > for these libraries? Do you feel these libraries are mature enough?
> >
>
> I think things being added to the official ABI is good. For these, I wonder
> if waiting till the 20.11 release is the best time to officially mark them
> as stable, rather than doing so now?
They are already not marked as experimental symbols...
I think we should remove confusion in the MAINTAINERS file.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH] mbuf: use c11 atomics for refcnt operations
2020-07-03 15:38 3% ` David Marchand
@ 2020-07-06 8:03 3% ` Phil Yang
0 siblings, 0 replies; 200+ results
From: Phil Yang @ 2020-07-06 8:03 UTC (permalink / raw)
To: David Marchand
Cc: dev, Olivier Matz, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang, nd
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, July 3, 2020 11:39 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: dev <dev@dpdk.org>; Olivier Matz <olivier.matz@6wind.com>; David
> Christensen <drc@linux.vnet.ibm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH] mbuf: use c11 atomics for refcnt operations
>
> On Thu, Jun 11, 2020 at 12:26 PM Phil Yang <phil.yang@arm.com> wrote:
> >
> > Use c11 atomics with explicit ordering instead of rte_atomic ops which
> > enforce unnecessary barriers on aarch64.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
> I did not look at the details, but this patch is refused by the ABI
> check in Travis.
Thanks, David.
The ABI issue is that 'rte_mbuf_ext_shared_info::refcnt_atomic' was renamed to 'rte_mbuf_ext_shared_info::refcnt' in rte_mbuf_core.h.
I made this change just to simplify the name of the variable.
Reverting 'rte_mbuf_ext_shared_info::refcnt' to refcnt_atomic can fix this issue.
I will update it in v2.
Thanks,
Phil
>
>
> --
> David Marchand
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [dpdk-techboard] [PATCH v3 0/3] Experimental/internal libraries cleanup
2020-07-05 19:55 3% ` Thomas Monjalon
@ 2020-07-06 8:02 3% ` Bruce Richardson
2020-07-06 8:12 0% ` Thomas Monjalon
2020-07-06 16:57 0% ` [dpdk-dev] " Medvedkin, Vladimir
1 sibling, 1 reply; 200+ results
From: Bruce Richardson @ 2020-07-06 8:02 UTC (permalink / raw)
To: Thomas Monjalon
Cc: David Marchand, dev, honnappa.nagarahalli, techboard, Jiayu Hu,
Yipeng Wang, Sameh Gobriel, Vladimir Medvedkin, Nipun Gupta,
Hemant Agrawal
On Sun, Jul 05, 2020 at 09:55:41PM +0200, Thomas Monjalon wrote:
> +Cc maintainers of the problematic libraries:
> - librte_fib
> - librte_rib
> - librte_gro
> - librte_member
> - librte_rawdev
>
> 26/06/2020 10:16, David Marchand:
> > Following discussions on the mailing list and the 05/20 TB meeting, here
> > is a series that drops the special versioning for non stable libraries.
> >
> > Two notes:
> >
> > - RIB/FIB library is not referenced in the API doxygen index, is this
> > intentional?
>
> Vladimir please, could you fix the miss in the doxygen index?
>
> > - I inspected MAINTAINERS: librte_gro, librte_member and librte_rawdev are
> > announced as experimental while their functions are part of the 20
> > stable ABI (in .map files + no __rte_experimental marking).
> > Their fate must be discussed.
>
> I would suggest removing EXPERIMENTAL flag for gro, member and rawdev.
> They are probably already considered stable for a lot of users.
> Maintainers, are you OK to follow the ABI compatibility rules
> for these libraries? Do you feel these libraries are mature enough?
>
I think things being added to the official ABI is good. For these, I wonder
if waiting till the 20.11 release is the best time to officially mark them
as stable, rather than doing so now?
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH v6 1/3] eal: disable function versioning on Windows
2020-07-05 20:23 4% ` Thomas Monjalon
@ 2020-07-06 7:02 0% ` Fady Bader
0 siblings, 0 replies; 200+ results
From: Fady Bader @ 2020-07-06 7:02 UTC (permalink / raw)
To: Thomas Monjalon
Cc: dev, Tasnim Bashar, Tal Shnaiderman, Yohad Tor, dmitry.kozliuk,
harini.ramakrishnan, ocardona, pallavi.kadam, ranjit.menon,
olivier.matz, arybchenko, mdr, nhorman
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Sunday, July 5, 2020 11:24 PM
> To: Fady Bader <fady@mellanox.com>
> Cc: dev@dpdk.org; Tasnim Bashar <tbashar@mellanox.com>; Tal Shnaiderman
> <talshn@mellanox.com>; Yohad Tor <yohadt@mellanox.com>;
> dmitry.kozliuk@gmail.com; harini.ramakrishnan@microsoft.com;
> ocardona@microsoft.com; pallavi.kadam@intel.com; ranjit.menon@intel.com;
> olivier.matz@6wind.com; arybchenko@solarflare.com; mdr@ashroe.eu;
> nhorman@tuxdriver.com
> Subject: Re: [dpdk-dev] [PATCH v6 1/3] eal: disable function versioning on
> Windows
>
> 05/07/2020 15:47, Fady Bader:
> > Function versioning implementation is not supported by Windows.
> > Function versioning was disabled on Windows.
>
> was -> is
>
> > Signed-off-by: Fady Bader <fady@mellanox.com>
> > ---
> > lib/librte_eal/include/rte_function_versioning.h | 2 +-
> > lib/meson.build | 5 +++++
> > 2 files changed, 6 insertions(+), 1 deletion(-)
>
> As suggested by Ray, we should add a note in the documentation about the ABI
> compatibility. Because we have no function versioning, we cannot ensure ABI
> compatibility on Windows.
>
> I recommend adding this text in doc/guides/windows_gsg/intro.rst under
> "Limitations":
> "
> The :doc:`../contributing/abi_policy` cannot be respected for Windows.
> Minor ABI versions may be incompatible
> because function versioning is not supported on Windows.
> "
Ok, I'll send a new patch with the changes soon.
>
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API
2020-07-03 18:46 3% ` Honnappa Nagarahalli
@ 2020-07-06 6:23 3% ` Kinsella, Ray
2020-07-07 3:19 3% ` Feifei Wang
0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-06 6:23 UTC (permalink / raw)
To: Honnappa Nagarahalli, Feifei Wang, Konstantin Ananyev, Neil Horman
Cc: dev, nd
On 03/07/2020 19:46, Honnappa Nagarahalli wrote:
> <snip>
>
>>
>> On 03/07/2020 11:26, Feifei Wang wrote:
>>> Remove the experimental tag for rte_ring_reset API that have been
>>> around for 4 releases.
>>>
>>> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
>>> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
>>> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>>> ---
>>> lib/librte_ring/rte_ring.h | 3 ---
>>> lib/librte_ring/rte_ring_version.map | 4 +---
>>> 2 files changed, 1 insertion(+), 6 deletions(-)
>>>
>>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
>>> index f67141482..7181c33b4 100644
>>> --- a/lib/librte_ring/rte_ring.h
>>> +++ b/lib/librte_ring/rte_ring.h
>>> @@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void **obj_p)
>>> *
>>> * This function flush all the elements in a ring
>>> *
>>> - * @b EXPERIMENTAL: this API may change without prior notice
>>> - *
>>> * @warning
>>> * Make sure the ring is not in use while calling this function.
>>> *
>>> * @param r
>>> * A pointer to the ring structure.
>>> */
>>> -__rte_experimental
>>> void
>>> rte_ring_reset(struct rte_ring *r);
>>>
>>> diff --git a/lib/librte_ring/rte_ring_version.map
>>> b/lib/librte_ring/rte_ring_version.map
>>> index e88c143cf..aec6f3820 100644
>>> --- a/lib/librte_ring/rte_ring_version.map
>>> +++ b/lib/librte_ring/rte_ring_version.map
>>> @@ -8,6 +8,7 @@ DPDK_20.0 {
>>> rte_ring_init;
>>> rte_ring_list_dump;
>>> rte_ring_lookup;
>>> + rte_ring_reset;
>>>
>>> local: *;
>>> };
>>> @@ -15,9 +16,6 @@ DPDK_20.0 {
>>> EXPERIMENTAL {
>>> global:
>>>
>>> - # added in 19.08
>>> - rte_ring_reset;
>>> -
>>> # added in 20.02
>>> rte_ring_create_elem;
>>> rte_ring_get_memsize_elem;
>>
>> So strictly speaking, rte_ring_reset is part of the DPDK_21 ABI, not the v20.0
>> ABI.
> Thanks Ray for clarifying this.
>
>>
>> The way to solve is to add it the DPDK_21 ABI in the map file.
>> And then use the VERSION_SYMBOL_EXPERIMENTAL to alias to experimental
>> if necessary.
> Is using VERSION_SYMBOL_EXPERIMENTAL a must?
Purely at the discretion of the contributor and maintainer.
If it has been around for a while, applications are using it and changing the symbol will break them.
You may choose to provide the alias or not.
> The documentation also seems to be vague. It says " The macro is used when a symbol matures to become part of the stable ABI, to provide an alias to experimental for some time". What does 'some time' mean?
"Some time" is a bit vague alright, should be "until the next major ABI version" - I will fix.
>
>>
>> https://doc.dpdk.org/guides/contributing/abi_versioning.html#versioning-
>> macros
^ permalink raw reply [relevance 3%]
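Ray's suggestion — move the matured symbol into a DPDK_21 node rather than the DPDK_20.0 one — would look roughly like this in rte_ring_version.map. This is a sketch of a linker version script, not the exact patch under discussion:

```
DPDK_20.0 {
	global:

	rte_ring_init;
	rte_ring_list_dump;
	rte_ring_lookup;

	local: *;
};

DPDK_21 {
	global:

	rte_ring_reset;

} DPDK_20.0;
```

If the contributor and maintainer choose to keep the old experimental binding alive, the VERSION_SYMBOL_EXPERIMENTAL macro from rte_function_versioning.h can additionally alias rte_ring_reset@EXPERIMENTAL to the stable symbol until the next major ABI version; see the ABI versioning guide linked in the thread for the exact mechanics.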
* Re: [dpdk-dev] [PATCH v6 1/3] eal: disable function versioning on Windows
@ 2020-07-05 20:23 4% ` Thomas Monjalon
2020-07-06 7:02 0% ` Fady Bader
0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2020-07-05 20:23 UTC (permalink / raw)
To: Fady Bader
Cc: dev, tbashar, talshn, yohadt, dmitry.kozliuk,
harini.ramakrishnan, ocardona, pallavi.kadam, ranjit.menon,
olivier.matz, arybchenko, mdr, nhorman
05/07/2020 15:47, Fady Bader:
> Function versioning implementation is not supported by Windows.
> Function versioning was disabled on Windows.
was -> is
> Signed-off-by: Fady Bader <fady@mellanox.com>
> ---
> lib/librte_eal/include/rte_function_versioning.h | 2 +-
> lib/meson.build | 5 +++++
> 2 files changed, 6 insertions(+), 1 deletion(-)
As suggested by Ray, we should add a note in the documentation
about the ABI compatibility. Because we have no function versioning,
we cannot ensure ABI compatibility on Windows.
I recommend adding this text in doc/guides/windows_gsg/intro.rst
under "Limitations":
"
The :doc:`../contributing/abi_policy` cannot be respected for Windows.
Minor ABI versions may be incompatible
because function versioning is not supported on Windows.
"
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v3 0/3] Experimental/internal libraries cleanup
@ 2020-07-05 19:55 3% ` Thomas Monjalon
2020-07-06 8:02 3% ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
2020-07-06 16:57 0% ` [dpdk-dev] " Medvedkin, Vladimir
0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2020-07-05 19:55 UTC (permalink / raw)
To: David Marchand
Cc: dev, honnappa.nagarahalli, techboard, Jiayu Hu, Yipeng Wang,
Sameh Gobriel, Vladimir Medvedkin, Nipun Gupta, Hemant Agrawal
+Cc maintainers of the problematic libraries:
- librte_fib
- librte_rib
- librte_gro
- librte_member
- librte_rawdev
26/06/2020 10:16, David Marchand:
> Following discussions on the mailing list and the 05/20 TB meeting, here
> is a series that drops the special versioning for non stable libraries.
>
> Two notes:
>
> - RIB/FIB library is not referenced in the API doxygen index, is this
> intentional?
Vladimir please, could you fix the miss in the doxygen index?
> - I inspected MAINTAINERS: librte_gro, librte_member and librte_rawdev are
> announced as experimental while their functions are part of the 20
> stable ABI (in .map files + no __rte_experimental marking).
> Their fate must be discussed.
I would suggest removing EXPERIMENTAL flag for gro, member and rawdev.
They are probably already considered stable for a lot of users.
Maintainers, are you OK to follow the ABI compatibility rules
for these libraries? Do you feel these libraries are mature enough?
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item
@ 2020-07-05 13:13 0% ` Andrew Rybchenko
0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2020-07-05 13:13 UTC (permalink / raw)
To: Adrien Mazarguil, Ori Kam
Cc: Dekel Peled, ferruh.yigit, john.mcnamara, marko.kovacevic,
Asaf Penso, Matan Azrad, Eli Britstein, dev, Ivan Malov
On 6/2/20 10:04 PM, Adrien Mazarguil wrote:
> Hi Ori, Andrew, Delek,
>
> (been a while eh?)
>
> On Tue, Jun 02, 2020 at 06:28:41PM +0000, Ori Kam wrote:
>> Hi Andrew,
>>
>> PSB,
> [...]
>>>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
>>>> index b0e4199..3bc8ce1 100644
>>>> --- a/lib/librte_ethdev/rte_flow.h
>>>> +++ b/lib/librte_ethdev/rte_flow.h
>>>> @@ -787,6 +787,8 @@ struct rte_flow_item_ipv4 {
>>>> */
>>>> struct rte_flow_item_ipv6 {
>>>> struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
>>>> + uint32_t is_frag:1; /**< Is IPv6 packet fragmented/non-fragmented. */
>>>> + uint32_t reserved:31; /**< Reserved, must be zero. */
>>>
>>> The solution is simple, but hardly generic and adds an
>>> example for the future extensions. I doubt that it is a
>>> right way to go.
>>>
>> I agree with you that this is not the most generic way possible,
>> but the IPV6 extensions are very unique. So the solution is also unique.
>> In general, I'm always in favor of finding the most generic way, but sometimes
>> it is better to keep things simple, and see how it goes.
>
> Same feeling here, it doesn't look right.
>
>>> May be we should add 256-bit string with one bit for each
>>> IP protocol number and apply it to extension headers only?
>>> If bit A is set in the mask:
>>> - if bit A is set in spec as well, extension header with
>>> IP protocol (1 << A) number must present
>>> - if bit A is clear in spec, extension header with
>>> IP protocol (1 << A) number must absent
>>> If bit is clear in the mask, corresponding extension header
>>> may present and may absent (i.e. don't care).
>>>
>> There are only 12 possible extension headers and currently none of them
>> are supported in rte_flow. So adding a logic to parse the 256 just to get a max of 12
>> possible values is an overkill. Also, if we disregard the case of the extension,
>> the application must select only one next proto. For example, the application
>> can't select udp + tcp. There is the option to add a flag for each of the
>> possible extensions, does it makes more sense to you?
>
> Each of these extension headers has its own structure, we first need the
> ability to match them properly by adding the necessary pattern items.
>
>>> The RFC indirectly touches IPv6 proto (next header) matching
>>> logic.
>>>
>>> If logic used in ETH+VLAN is applied on IPv6 as well, it would
>>> make pattern specification and handling complicated. E.g.:
>>> eth / ipv6 / udp / end
>>> should match UDP over IPv6 without any extension headers only.
>>>
>> The issue with VLAN I agree is different since by definition VLAN is
>> layer 2.5. We can add the same logic also to the VLAN case, maybe it will
>> be easier.
>> In any case, in your example above and according to the RFC we will
>> get all ipv6 udp traffic with and without extensions.
>>
>>> And how to specify UPD over IPv6 regardless extension headers?
>>
>> Please see above the rule will be eth / ipv6 /udp.
>>
>>> eth / ipv6 / ipv6_ext / udp / end
>>> with a convention that ipv6_ext is optional if spec and mask
>>> are NULL (or mask is empty).
>>>
>> I would guess that this flow should match all ipv6 that has one ext and the next
>> proto is udp.
>
> In my opinion RTE_FLOW_ITEM_TYPE_IPV6_EXT is a bit useless on its own. It's
> only for matching packets that contain some kind of extension header, not a
> specific one, more about that below.
>
>>> I'm wondering if any driver treats it this way?
>>>
>> I'm not sure, we can support only the frag ext by default, but if required we can support other
>> ext.
>>
>>> I agree that the problem really comes when we'd like match
>>> IPv6 frags or even worse not fragments.
>>>
>>> Two patterns for fragments:
>>> eth / ipv6 (proto=FRAGMENT) / end
>>> eth / ipv6 / ipv6_ext (next_hdr=FRAGMENT) / end
>>>
>>> Any sensible solution for not-fragments with any other
>>> extension headers?
>>>
>> The one proposed in this mail 😊
>>
>>> INVERT exists, but hardly useful, since it simply says
>>> that patches which do not match pattern without INVERT
>>> matches the pattern with INVERT and
>>> invert / eth / ipv6 (proto=FRAGMENT) / end
>>> will match ARP, IPv4, IPv6 with an extension header before
>>> fragment header and so on.
>>>
>> I agree with you, INVERT in this doesn’t help.
>> We were considering adding some kind of not mask / item per item.
>> something along these lines:
>> user request ipv6 unfragmented udp packets. The flow would look something
>> like this:
>> Eth / ipv6 / Not (Ipv6.proto = frag_proto) / udp
>> But it makes the rules much harder to use, and I don't think that there
>> is any HW that supports NOT, and adding such a feature to all items is overkill.
>>
>>
>>> Bit string suggested above will allow to match:
>>> - UDP over IPv6 with any extension headers:
>>> eth / ipv6 (ext_hdrs mask empty) / udp / end
>>> - UDP over IPv6 without any extension headers:
>>> eth / ipv6 (ext_hdrs mask full, spec empty) / udp / end
>>> - UDP over IPv6 without fragment header:
>>> eth / ipv6 (ext.spec & ~FRAGMENT, ext.mask | FRAGMENT) / udp / end
>>> - UDP over IPv6 with fragment header
>>> eth / ipv6 (ext.spec | FRAGMENT, ext.mask | FRAGMENT) / udp / end
>>>
>>> where FRAGMENT is 1 << IPPROTO_FRAGMENT.
>>>
>> Please see my response regarding this above.
>>
>>> Above I intentionally keep 'proto' unspecified in ipv6
>>> since otherwise it would specify the next header after IPv6
>>> header.
>>>
>>> Extension headers mask should be empty by default.
>
> This is a deliberate design choice/issue with rte_flow: an empty pattern
> matches everything; adding items only narrows the selection. As Andrew said
> there is currently no way to provide a specific item to reject, it can only
> be done globally on a pattern through INVERT that no PMD implements so far.
>
> So we have two requirements here: the ability to specifically match IPv6
> fragment headers and the ability to reject them.
>
I think that one of the key requirements here is the ability to say
that an extension header may be anywhere (or it must be no
extension header anywhere), since specification using a pattern
item suggests specified order of items, but it could be other
extension headers before frag header and after it before UDP protocol
header.
> To match IPv6 fragment headers, we need a dedicated pattern item. The
> generic RTE_FLOW_ITEM_TYPE_IPV6_EXT is useless for that on its own, it must
> be completed with RTE_FLOW_ITEM_TYPE_IPV6_EXT_FRAG and associated object
> to match individual fields if needed (like all the others
> protocols/headers).
>
Yes, I agree, but it is only strictly required if we want to match
on fragment header content or to see it at an exact place in the
next-protocols chain.
> Then to reject a pattern item... My preference goes to a new "NOT" meta item
> affecting the meaning of the item coming immediately after in the pattern
> list. That would be ultra generic, wouldn't break any ABI/API and like
> INVERT, wouldn't even require a new object associated with it.
>
Yes, that's true, but I'm not sure if it is easy to do in HW.
Also, *NOT* scope could be per item field in fact, not whole
item. It sounds like it is getting more and more complicated.
> To match UDPv6 traffic when there is no fragment header, one could then do
> something like:
>
> eth / ipv6 / not / ipv6_ext_frag / udp
>
> PMD support would be trivial to implement (I'm sure!)
>
The problem is an interpretation of the above pattern.
Strictly speaking, only UDP packets with exactly one
non-fragment extension header match the pattern.
What about packets without any extension headers?
Or packet with two (more) extension headers when the first
one is not frag header?
> We may later implement other kinds of "operator" items as Andrew suggested,
> for bit-wise stuff and so on. Let's keep adding features on a needed basis
> though.
>
Thanks,
Andrew.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH v5 1/3] lib/lpm: integrate RCU QSBR
@ 2020-07-04 17:00 3% ` Ruifeng Wang
0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2020-07-04 17:00 UTC (permalink / raw)
To: David Marchand, Vladimir Medvedkin, Bruce Richardson
Cc: John McNamara, Marko Kovacevic, Ray Kinsella, Neil Horman, dev,
Ananyev, Konstantin, Honnappa Nagarahalli, nd, nd
Hi David,
Sorry, I missed tracking of this thread.
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Monday, June 29, 2020 7:56 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>; Vladimir Medvedkin
> <vladimir.medvedkin@intel.com>; Bruce Richardson
> <bruce.richardson@intel.com>
> Cc: John McNamara <john.mcnamara@intel.com>; Marko Kovacevic
> <marko.kovacevic@intel.com>; Ray Kinsella <mdr@ashroe.eu>; Neil Horman
> <nhorman@tuxdriver.com>; dev <dev@dpdk.org>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v5 1/3] lib/lpm: integrate RCU QSBR
>
> On Mon, Jun 29, 2020 at 10:03 AM Ruifeng Wang <ruifeng.wang@arm.com>
> wrote:
> >
> > Currently, the tbl8 group is freed even though the readers might be
> > using the tbl8 group entries. The freed tbl8 group can be reallocated
> > quickly. This results in incorrect lookup results.
> >
> > RCU QSBR process is integrated for safe tbl8 group reclaim.
> > Refer to RCU documentation to understand various aspects of
> > integrating RCU library into other libraries.
> >
> > Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > ---
> > doc/guides/prog_guide/lpm_lib.rst | 32 +++++++
> > lib/librte_lpm/Makefile | 2 +-
> > lib/librte_lpm/meson.build | 1 +
> > lib/librte_lpm/rte_lpm.c | 129 ++++++++++++++++++++++++++---
> > lib/librte_lpm/rte_lpm.h | 59 +++++++++++++
> > lib/librte_lpm/rte_lpm_version.map | 6 ++
> > 6 files changed, 216 insertions(+), 13 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/lpm_lib.rst
> > b/doc/guides/prog_guide/lpm_lib.rst
> > index 1609a57d0..7cc99044a 100644
> > --- a/doc/guides/prog_guide/lpm_lib.rst
> > +++ b/doc/guides/prog_guide/lpm_lib.rst
> > @@ -145,6 +145,38 @@ depending on whether we need to move to the
> next table or not.
> > Prefix expansion is one of the keys of this algorithm, since it
> > improves the speed dramatically by adding redundancy.
> >
> > +Deletion
> > +~~~~~~~~
> > +
> > +When deleting a rule, a replacement rule is searched for. Replacement
> > +rule is an existing rule that has the longest prefix match with the rule to be
> deleted, but has smaller depth.
> > +
> > +If a replacement rule is found, target tbl24 and tbl8 entries are
> > +updated to have the same depth and next hop value with the
> replacement rule.
> > +
> > +If no replacement rule can be found, target tbl24 and tbl8 entries will be
> cleared.
> > +
> > +Prefix expansion is performed if the rule's depth is not exactly 24 bits or
> 32 bits.
> > +
> > +After deleting a rule, a group of tbl8s that belongs to the same tbl24 entry
> are freed in following cases:
> > +
> > +* All tbl8s in the group are empty .
> > +
> > +* All tbl8s in the group have the same values and with depth no greater
> than 24.
> > +
> > +Free of tbl8s have different behaviors:
> > +
> > +* If RCU is not used, tbl8s are cleared and reclaimed immediately.
> > +
> > +* If RCU is used, tbl8s are reclaimed when readers are in quiescent state.
> > +
> > +When the LPM is not using RCU, tbl8 group can be freed immediately
> > +even though the readers might be using the tbl8 group entries. This might
> result in incorrect lookup results.
> > +
> > +RCU QSBR process is integrated for safe tbl8 group reclaimation.
> > +Application has certain responsibilities while using this feature.
> > +Please refer to resource reclaimation framework of :ref:`RCU library
> <RCU_Library>` for more details.
> > +
>
> Would the lpm6 library benefit from the same?
> Asking as I do not see much code shared between lpm and lpm6.
>
I didn't look into lpm6. It may need a separate RCU integration since, as you mentioned, there is no shared code between lpm and lpm6.
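For context on the API being reviewed, attaching RCU to an LPM table with this series looks roughly like the following. This is a sketch based on the prototypes in the patch; names and the exact rte_lpm_rcu_qsbr_add() signature may differ in the final version:

```c
/* Sketch: enable RCU-based tbl8 reclamation for an existing LPM table.
 * num_readers is the number of reader threads reporting quiescent state.
 */
struct rte_rcu_qsbr *qsv;
size_t sz = rte_rcu_qsbr_get_memsize(num_readers);

qsv = rte_zmalloc("lpm_rcu", sz, RTE_CACHE_LINE_SIZE);
rte_rcu_qsbr_init(qsv, num_readers);

struct rte_lpm_rcu_config rcu_cfg = {
	.v = qsv,
	.mode = RTE_LPM_QSBR_MODE_DQ,	/* defer-queue reclamation */
};
if (rte_lpm_rcu_qsbr_add(lpm, &rcu_cfg) != 0)
	rte_panic("cannot attach RCU to LPM\n");
```

Each reader thread then has to register with the QSBR variable and report quiescent state periodically, per the RCU library's resource reclamation framework referenced in the doc patch above.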
> [...]
>
> > diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c index
> > 38ab512a4..41e9c49b8 100644
> > --- a/lib/librte_lpm/rte_lpm.c
> > +++ b/lib/librte_lpm/rte_lpm.c
> > @@ -1,5 +1,6 @@
> > /* SPDX-License-Identifier: BSD-3-Clause
> > * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2020 Arm Limited
> > */
> >
> > #include <string.h>
> > @@ -245,13 +246,84 @@ rte_lpm_free(struct rte_lpm *lpm)
> > TAILQ_REMOVE(lpm_list, te, next);
> >
> > rte_mcfg_tailq_write_unlock();
> > -
> > +#ifdef ALLOW_EXPERIMENTAL_API
> > + if (lpm->dq)
> > + rte_rcu_qsbr_dq_delete(lpm->dq); #endif
>
> All DPDK code under lib/ is compiled with the ALLOW_EXPERIMENTAL_API
> flag set.
> There is no need to protect against this flag in rte_lpm.c.
>
OK, I see. So DPDK libraries will always be compiled with ALLOW_EXPERIMENTAL_API; it is the application's
choice whether to use experimental APIs.
I will update the next version to remove the ALLOW_EXPERIMENTAL_API flag from rte_lpm.c and keep only the one in rte_lpm.h.
> [...]
>
> > diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h index
> > b9d49ac87..7889f21b3 100644
> > --- a/lib/librte_lpm/rte_lpm.h
> > +++ b/lib/librte_lpm/rte_lpm.h
>
> > @@ -130,6 +143,28 @@ struct rte_lpm {
> > __rte_cache_aligned; /**< LPM tbl24 table. */
> > struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
> > struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
> > +#ifdef ALLOW_EXPERIMENTAL_API
> > + /* RCU config. */
> > + struct rte_rcu_qsbr *v; /* RCU QSBR variable. */
> > + enum rte_lpm_qsbr_mode rcu_mode;/* Blocking, defer queue. */
> > + struct rte_rcu_qsbr_dq *dq; /* RCU QSBR defer queue. */
> > +#endif
> > +};
>
> This is more a comment/question for the lpm maintainers.
>
> Afaics, the rte_lpm structure is exported/public because of lookup which is
> inlined.
> But most of the structure can be hidden and stored in a private structure that
> would embed the exposed rte_lpm.
> The slowpath functions would only have to translate from publicly exposed
> to internal representation (via container_of).
>
> This patch could do this and be the first step to hide the unneeded exposure
> of other fields (later/in 20.11 ?).
>
Hiding most of the structure is reasonable.
Since it will break the ABI, I can do that in 20.11.
> Thoughts?
>
>
> --
> David Marchand
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API
2020-07-03 16:16 4% ` Kinsella, Ray
@ 2020-07-03 18:46 3% ` Honnappa Nagarahalli
2020-07-06 6:23 3% ` Kinsella, Ray
0 siblings, 1 reply; 200+ results
From: Honnappa Nagarahalli @ 2020-07-03 18:46 UTC (permalink / raw)
To: Kinsella, Ray, Feifei Wang, Konstantin Ananyev, Neil Horman
Cc: dev, nd, Honnappa Nagarahalli, nd
<snip>
>
> On 03/07/2020 11:26, Feifei Wang wrote:
> > Remove the experimental tag for rte_ring_reset API that have been
> > around for 4 releases.
> >
> > Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> > ---
> > lib/librte_ring/rte_ring.h | 3 ---
> > lib/librte_ring/rte_ring_version.map | 4 +---
> > 2 files changed, 1 insertion(+), 6 deletions(-)
> >
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index f67141482..7181c33b4 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void **obj_p)
> > *
> > * This function flush all the elements in a ring
> > *
> > - * @b EXPERIMENTAL: this API may change without prior notice
> > - *
> > * @warning
> > * Make sure the ring is not in use while calling this function.
> > *
> > * @param r
> > * A pointer to the ring structure.
> > */
> > -__rte_experimental
> > void
> > rte_ring_reset(struct rte_ring *r);
> >
> > diff --git a/lib/librte_ring/rte_ring_version.map
> > b/lib/librte_ring/rte_ring_version.map
> > index e88c143cf..aec6f3820 100644
> > --- a/lib/librte_ring/rte_ring_version.map
> > +++ b/lib/librte_ring/rte_ring_version.map
> > @@ -8,6 +8,7 @@ DPDK_20.0 {
> > rte_ring_init;
> > rte_ring_list_dump;
> > rte_ring_lookup;
> > + rte_ring_reset;
> >
> > local: *;
> > };
> > @@ -15,9 +16,6 @@ DPDK_20.0 {
> > EXPERIMENTAL {
> > global:
> >
> > - # added in 19.08
> > - rte_ring_reset;
> > -
> > # added in 20.02
> > rte_ring_create_elem;
> > rte_ring_get_memsize_elem;
>
> So strictly speaking, rte_ring_reset is part of the DPDK_21 ABI, not the v20.0
> ABI.
Thanks Ray for clarifying this.
>
> The way to solve it is to add it to the DPDK_21 ABI node in the map file,
> and then use VERSION_SYMBOL_EXPERIMENTAL to alias to experimental
> if necessary.
Is using VERSION_SYMBOL_EXPERIMENTAL a must? The documentation also seems vague: it says "The macro is used when a symbol matures to become part of the stable ABI, to provide an alias to experimental for some time", but what does 'some time' mean?
>
> https://doc.dpdk.org/guides/contributing/abi_versioning.html#versioning-
> macros
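For illustration, the aliasing Ray describes would look roughly like this. This is a sketch based on the ABI versioning guide linked above; the exact map-file layout and macro arguments may differ in a real patch:

```c
/* rte_ring_version.map sketch: the symbol moves to the DPDK_21 node,
 * while an EXPERIMENTAL entry is kept for existing experimental users.
 *
 *   DPDK_21 {
 *           global:
 *           rte_ring_reset;
 *   } DPDK_20.0;
 *
 *   EXPERIMENTAL {
 *           global:
 *           rte_ring_reset;
 *   };
 */

/* In rte_ring.c, the macro aliases the stable symbol back to its
 * old experimental version (suffix here is illustrative):
 */
VERSION_SYMBOL_EXPERIMENTAL(rte_ring_reset, _e);
```

The alias lets binaries built against the experimental symbol keep resolving it, while newly built applications bind to the stable DPDK_21 version.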
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH] doc: add sample for ABI checks in contribution guide
@ 2020-07-03 17:15 4% Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-03 17:15 UTC (permalink / raw)
To: John McNamara, Marko Kovacevic; +Cc: dev, Ferruh Yigit
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
---
doc/guides/contributing/patches.rst | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/doc/guides/contributing/patches.rst b/doc/guides/contributing/patches.rst
index 25d97b85b..39ec64ec8 100644
--- a/doc/guides/contributing/patches.rst
+++ b/doc/guides/contributing/patches.rst
@@ -550,6 +550,10 @@ results in a subfolder of the current working directory.
The environment variable ``DPDK_ABI_REF_DIR`` can be set so that the results go
to a different location.
+Sample::
+
+ DPDK_ABI_REF_VERSION=v19.11 DPDK_ABI_REF_DIR=/tmp ./devtools/test-meson-builds.sh
+
Sending Patches
---------------
--
2.25.4
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH 2/3] ring: remove experimental tag for ring element APIs
@ 2020-07-03 16:17 3% ` Kinsella, Ray
0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2020-07-03 16:17 UTC (permalink / raw)
To: Feifei Wang, Honnappa Nagarahalli, Konstantin Ananyev, Neil Horman
Cc: dev, nd
On 03/07/2020 11:26, Feifei Wang wrote:
> Remove the experimental tag for rte_ring_xxx_elem APIs that have been
> around for 2 releases.
>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> lib/librte_ring/rte_ring.h | 5 +----
> lib/librte_ring/rte_ring_elem.h | 8 --------
> lib/librte_ring/rte_ring_version.map | 9 ++-------
> 3 files changed, 3 insertions(+), 19 deletions(-)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 7181c33b4..35f3f8c42 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -40,6 +40,7 @@ extern "C" {
> #endif
>
> #include <rte_ring_core.h>
> +#include <rte_ring_elem.h>
>
> /**
> * Calculate the memory size needed for a ring
> @@ -401,10 +402,6 @@ rte_ring_sp_enqueue_bulk(struct rte_ring *r, void * const *obj_table,
> RTE_RING_SYNC_ST, free_space);
> }
>
> -#ifdef ALLOW_EXPERIMENTAL_API
> -#include <rte_ring_elem.h>
> -#endif
> -
> /**
> * Enqueue several objects on a ring.
> *
> diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> index 9e5192ae6..69dc51746 100644
> --- a/lib/librte_ring/rte_ring_elem.h
> +++ b/lib/librte_ring/rte_ring_elem.h
> @@ -23,9 +23,6 @@ extern "C" {
> #include <rte_ring_core.h>
>
> /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice
> - *
> * Calculate the memory size needed for a ring with given element size
> *
> * This function returns the number of bytes needed for a ring, given
> @@ -43,13 +40,9 @@ extern "C" {
> * - -EINVAL - esize is not a multiple of 4 or count provided is not a
> * power of 2.
> */
> -__rte_experimental
> ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
>
> /**
> - * @warning
> - * @b EXPERIMENTAL: this API may change without prior notice
> - *
> * Create a new ring named *name* that stores elements with given size.
> *
> * This function uses ``memzone_reserve()`` to allocate memory. Then it
> @@ -109,7 +102,6 @@ ssize_t rte_ring_get_memsize_elem(unsigned int esize, unsigned int count);
> * - EEXIST - a memzone with the same name already exists
> * - ENOMEM - no appropriate memory area found in which to create memzone
> */
> -__rte_experimental
> struct rte_ring *rte_ring_create_elem(const char *name, unsigned int esize,
> unsigned int count, int socket_id, unsigned int flags);
>
> diff --git a/lib/librte_ring/rte_ring_version.map b/lib/librte_ring/rte_ring_version.map
> index aec6f3820..3030e8edb 100644
> --- a/lib/librte_ring/rte_ring_version.map
> +++ b/lib/librte_ring/rte_ring_version.map
> @@ -2,9 +2,11 @@ DPDK_20.0 {
> global:
>
> rte_ring_create;
> + rte_ring_create_elem;
> rte_ring_dump;
> rte_ring_free;
> rte_ring_get_memsize;
> + rte_ring_get_memsize_elem;
> rte_ring_init;
> rte_ring_list_dump;
> rte_ring_lookup;
> @@ -13,10 +15,3 @@ DPDK_20.0 {
> local: *;
> };
>
> -EXPERIMENTAL {
> - global:
> -
> - # added in 20.02
> - rte_ring_create_elem;
> - rte_ring_get_memsize_elem;
> -};
>
Same as the last comment.
rte_ring_get_memsize_elem and rte_ring_create_elem are part of the DPDK_21 ABI.
Ray K
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API
@ 2020-07-03 16:16 4% ` Kinsella, Ray
2020-07-03 18:46 3% ` Honnappa Nagarahalli
0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-03 16:16 UTC (permalink / raw)
To: Feifei Wang, Honnappa Nagarahalli, Konstantin Ananyev, Neil Horman
Cc: dev, nd
On 03/07/2020 11:26, Feifei Wang wrote:
> Remove the experimental tag for rte_ring_reset API that have been around
> for 4 releases.
>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> lib/librte_ring/rte_ring.h | 3 ---
> lib/librte_ring/rte_ring_version.map | 4 +---
> 2 files changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index f67141482..7181c33b4 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -663,15 +663,12 @@ rte_ring_dequeue(struct rte_ring *r, void **obj_p)
> *
> * This function flush all the elements in a ring
> *
> - * @b EXPERIMENTAL: this API may change without prior notice
> - *
> * @warning
> * Make sure the ring is not in use while calling this function.
> *
> * @param r
> * A pointer to the ring structure.
> */
> -__rte_experimental
> void
> rte_ring_reset(struct rte_ring *r);
>
> diff --git a/lib/librte_ring/rte_ring_version.map b/lib/librte_ring/rte_ring_version.map
> index e88c143cf..aec6f3820 100644
> --- a/lib/librte_ring/rte_ring_version.map
> +++ b/lib/librte_ring/rte_ring_version.map
> @@ -8,6 +8,7 @@ DPDK_20.0 {
> rte_ring_init;
> rte_ring_list_dump;
> rte_ring_lookup;
> + rte_ring_reset;
>
> local: *;
> };
> @@ -15,9 +16,6 @@ DPDK_20.0 {
> EXPERIMENTAL {
> global:
>
> - # added in 19.08
> - rte_ring_reset;
> -
> # added in 20.02
> rte_ring_create_elem;
> rte_ring_get_memsize_elem;
So strictly speaking, rte_ring_reset is part of the DPDK_21 ABI, not the v20.0 ABI.
The way to solve it is to add it to the DPDK_21 ABI node in the map file,
and then use VERSION_SYMBOL_EXPERIMENTAL to alias to experimental if necessary.
https://doc.dpdk.org/guides/contributing/abi_versioning.html#versioning-macros
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH] mbuf: use c11 atomics for refcnt operations
@ 2020-07-03 15:38 3% ` David Marchand
2020-07-06 8:03 3% ` Phil Yang
2020-07-07 10:10 3% ` [dpdk-dev] [PATCH v2] mbuf: use C11 " Phil Yang
1 sibling, 1 reply; 200+ results
From: David Marchand @ 2020-07-03 15:38 UTC (permalink / raw)
To: Phil Yang
Cc: dev, Olivier Matz, David Christensen, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
nd
On Thu, Jun 11, 2020 at 12:26 PM Phil Yang <phil.yang@arm.com> wrote:
>
> Use c11 atomics with explicit ordering instead of rte_atomic ops which
> enforce unnecessary barriers on aarch64.
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
I did not look at the details, but this patch is refused by the ABI
check in Travis.
--
David Marchand
^ permalink raw reply [relevance 3%]
* [dpdk-dev] [PATCH v17 0/2] support for VFIO-PCI VF token interface
@ 2020-07-03 14:57 4% ` Haiyue Wang
0 siblings, 0 replies; 200+ results
From: Haiyue Wang @ 2020-07-03 14:57 UTC (permalink / raw)
To: dev, anatoly.burakov, thomas, jerinj, david.marchand, arybchenko
Cc: Haiyue Wang
v17: Rebase for new EAL config API, update the commit message and doc.
v16: Rebase the patch for 20.08 release note.
v15: Add the missed EXPERIMENTAL warning for API doxgen.
v14: Rebase the patch for 20.08 release note.
v13: Rename the EAL get VF token function, and leave the freebsd type as empty.
v12: support to vfio devices with VF token and no token.
v11: Use the eal parameter to pass the VF token, then not every PCI
device needs to be specified with this token. Also no ABI issue
now.
v10: Use the __rte_internal to mark the internal API changing.
v9: Rewrite the document.
v8: Update the document.
v7: Add the Fixes tag in uuid, the release note and help
document.
v6: Drop the Fixes tag in uuid, since the file has been
moved to another place, not suitable to apply on stable.
And this is not a bug, just some kind of enhancement.
v5: 1. Add the VF token parse error handling.
2. Split into two patches for different logic module.
3. Add more comments into the code for explaining the design.
4. Drop the ABI change workaround, this patch set focuses on code review.
v4: 1. Ignore rte_vfio_setup_device ABI check since it is
for Linux driver use.
v3: Fix the Travis build failed:
(1). rte_uuid.h:97:55: error: unknown type name ‘size_t’
(2). rte_uuid.h:58:2: error: implicit declaration of function ‘memcpy’
v2: Fix the FreeBSD build error.
v1: Update the commit message.
RFC v2:
Based on Vamsi's RFC v1, and Alex's patch for Qemu
[https://lore.kernel.org/lkml/20200204161737.34696b91@w520.home/]:
Use the devarg to pass-down the VF token.
RFC v1: https://patchwork.dpdk.org/patch/66281/ by Vamsi.
Haiyue Wang (2):
eal: add uuid dependent header files explicitly
eal: support for VFIO-PCI VF token
doc/guides/linux_gsg/linux_drivers.rst | 35 ++++++++++++++++++-
doc/guides/linux_gsg/linux_eal_parameters.rst | 4 +++
doc/guides/rel_notes/release_20_08.rst | 6 ++++
lib/librte_eal/common/eal_common_options.c | 3 ++
lib/librte_eal/common/eal_internal_cfg.h | 2 ++
lib/librte_eal/common/eal_options.h | 2 ++
lib/librte_eal/freebsd/eal.c | 5 +++
lib/librte_eal/include/rte_eal.h | 14 ++++++++
lib/librte_eal/include/rte_uuid.h | 2 ++
lib/librte_eal/linux/eal.c | 33 +++++++++++++++++
lib/librte_eal/linux/eal_vfio.c | 19 ++++++++++
lib/librte_eal/rte_eal_version.map | 3 ++
12 files changed, 127 insertions(+), 1 deletion(-)
--
2.27.0
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [PATCH v2] devtools: remove useless files from ABI reference
@ 2020-07-03 9:08 4% ` David Marchand
0 siblings, 0 replies; 200+ results
From: David Marchand @ 2020-07-03 9:08 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev, Bruce Richardson
On Thu, May 28, 2020 at 3:16 PM David Marchand
<david.marchand@redhat.com> wrote:
> On Sun, May 24, 2020 at 7:43 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > When building an ABI reference with meson, some static libraries
> > are built and linked in apps. They are useless and take a lot of space.
> > Those binaries, and other useless files (examples and doc files)
> > in the share/ directory, are removed after being installed.
> >
> > In order to save time when building the ABI reference,
> > the examples (which are not installed anyway) are not compiled.
> >
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: David Marchand <david.marchand@redhat.com>
Applied, thanks.
--
David Marchand
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [20.11, PATCH] bbdev: remove experimental tag from API
2020-07-02 18:02 3% ` Chautru, Nicolas
@ 2020-07-02 18:09 4% ` Akhil Goyal
0 siblings, 0 replies; 200+ results
From: Akhil Goyal @ 2020-07-02 18:09 UTC (permalink / raw)
To: Chautru, Nicolas, David Marchand; +Cc: dev, Thomas Monjalon
>
> > From: Akhil Goyal <akhil.goyal@nxp.com>
> > > > Hello Nicolas,
> > > >
> > > > On Sat, Jun 27, 2020 at 1:14 AM Nicolas Chautru
> > > > <nicolas.chautru@intel.com> wrote:
> > > > >
> > > > > Planning to move bbdev API to stable from 20.11 (ABI version 21)
> > > > > and remove experimental tag.
> > > > > Sending now to advertise and get any feedback.
> > > > > Some manual rebase will be required later on notably as the actual
> > > > > release note which is not there yet.
> > > >
> > > > Cool that we want to stabilize this API.
> > > > My concern is that we have drivers from a single vendor.
> > > > I would hate to see a new vendor unable to submit a driver (or
> > > > having to wait until the next ABI breakage window) because of the
> > > > current API/ABI.
> > > >
> > > >
> > >
> > > +1 from my side. I am not sure how much it is acceptable for all the
> > > vendors/customers.
> > > It is not reviewed by most of the vendors who may support in future.
> > > It is not good to remove experimental tag as we have a long 1 year
> > > cycle to break the API/ABI.
> > >
> > Moving the patch as deferred in patchworks.
>
> That is fine and all good discussion.
> We know of another vendor who plan to release a bbdev driver but probably
> after 20.11.
> There is one extra capability they will need exposed, we will aim to have the API
> is updated prior to that.
> Assuming the API get updated between now and 20.11, is there still room to
> remove experimental tag in 20.11 or the expectation is to wait regardless for a
> full stable cycle and only intercept ABI v22 in 21.11?
>
I think waiting for ABI v22 in 21.11 would be good for moving this to stable, so that if there are changes
in the ABI when a new vendor PMD comes up, they can be incorporated.
And as the world is evolving towards 5G, there may be multiple vendors and the ABI may change.
^ permalink raw reply [relevance 4%]
* Re: [dpdk-dev] [20.11, PATCH] bbdev: remove experimental tag from API
2020-07-02 17:54 0% ` Akhil Goyal
@ 2020-07-02 18:02 3% ` Chautru, Nicolas
2020-07-02 18:09 4% ` Akhil Goyal
0 siblings, 1 reply; 200+ results
From: Chautru, Nicolas @ 2020-07-02 18:02 UTC (permalink / raw)
To: Akhil Goyal, David Marchand; +Cc: dev, Thomas Monjalon
> From: Akhil Goyal <akhil.goyal@nxp.com>
> > > Hello Nicolas,
> > >
> > > On Sat, Jun 27, 2020 at 1:14 AM Nicolas Chautru
> > > <nicolas.chautru@intel.com> wrote:
> > > >
> > > > Planning to move bbdev API to stable from 20.11 (ABI version 21)
> > > > and remove experimental tag.
> > > > Sending now to advertise and get any feedback.
> > > > Some manual rebase will be required later on notably as the actual
> > > > release note which is not there yet.
> > >
> > > Cool that we want to stabilize this API.
> > > My concern is that we have drivers from a single vendor.
> > > I would hate to see a new vendor unable to submit a driver (or
> > > having to wait until the next ABI breakage window) because of the
> > > current API/ABI.
> > >
> > >
> >
> > +1 from my side. I am not sure how much it is acceptable for all the
> > vendors/customers.
> > It is not reviewed by most of the vendors who may support in future.
> > It is not good to remove experimental tag as we have a long 1 year
> > cycle to break the API/ABI.
> >
> Moving the patch as deferred in patchworks.
That is fine, and all good discussion.
We know of another vendor who plans to release a bbdev driver, but probably after 20.11.
There is one extra capability they will need exposed; we will aim to have the API updated prior to that.
Assuming the API gets updated between now and 20.11, is there still room to remove the experimental tag in 20.11, or is the expectation to wait regardless for a full stable cycle and only intercept ABI v22 in 21.11?
Thanks
Nic
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [20.11, PATCH] bbdev: remove experimental tag from API
@ 2020-07-02 17:54 0% ` Akhil Goyal
2020-07-02 18:02 3% ` Chautru, Nicolas
0 siblings, 1 reply; 200+ results
From: Akhil Goyal @ 2020-07-02 17:54 UTC (permalink / raw)
To: David Marchand, Nicolas Chautru; +Cc: dev, Thomas Monjalon
>
> >
> > Hello Nicolas,
> >
> > On Sat, Jun 27, 2020 at 1:14 AM Nicolas Chautru
> > <nicolas.chautru@intel.com> wrote:
> > >
> > > Planning to move bbdev API to stable from 20.11 (ABI version 21)
> > > and remove experimental tag.
> > > Sending now to advertise and get any feedback.
> > > Some manual rebase will be required later on notably as the
> > > actual release note which is not there yet.
> >
> > Cool that we want to stabilize this API.
> > My concern is that we have drivers from a single vendor.
> > I would hate to see a new vendor unable to submit a driver (or having
> > to wait until the next ABI breakage window) because of the current
> > API/ABI.
> >
> >
>
> +1 from my side. I am not sure how much it is acceptable for all the
> vendors/customers.
> It is not reviewed by most of the vendors who may support in future.
> It is not good to remove experimental tag as we have a long 1 year cycle to
> break the API/ABI.
>
Moving the patch as deferred in patchworks.
^ permalink raw reply [relevance 0%]
* Re: [dpdk-dev] [PATCH 01/27] eventdev: dlb upstream prerequisites
2020-07-02 15:21 0% ` Kinsella, Ray
@ 2020-07-02 16:35 3% ` McDaniel, Timothy
0 siblings, 0 replies; 200+ results
From: McDaniel, Timothy @ 2020-07-02 16:35 UTC (permalink / raw)
To: Kinsella, Ray, Jerin Jacob
Cc: Neil Horman, Jerin Jacob, Mattias Rönnblom, dpdk-dev, Eads,
Gage, Van Haaren, Harry
>-----Original Message-----
>From: Kinsella, Ray <mdr@ashroe.eu>
>Sent: Thursday, July 2, 2020 10:21 AM
>To: Jerin Jacob <jerinjacobk@gmail.com>
>Cc: McDaniel, Timothy <timothy.mcdaniel@intel.com>; Neil Horman
><nhorman@tuxdriver.com>; Jerin Jacob <jerinj@marvell.com>; Mattias
>Rönnblom <mattias.ronnblom@ericsson.com>; dpdk-dev <dev@dpdk.org>; Eads,
>Gage <gage.eads@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>
>Subject: Re: [dpdk-dev] [PATCH 01/27] eventdev: dlb upstream prerequisites
>
>
>
>On 30/06/2020 13:14, Jerin Jacob wrote:
>> On Tue, Jun 30, 2020 at 5:06 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>>
>>>
>>>
>>> On 30/06/2020 12:30, Jerin Jacob wrote:
>>>> On Tue, Jun 30, 2020 at 4:52 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 27/06/2020 08:44, Jerin Jacob wrote:
>>>>>>> +
>>>>>>> +/** Event port configuration structure */
>>>>>>> +struct rte_event_port_conf_v20 {
>>>>>>> + int32_t new_event_threshold;
>>>>>>> + /**< A backpressure threshold for new event enqueues on this port.
>>>>>>> + * Use for *closed system* event dev where event capacity is
>limited,
>>>>>>> + * and cannot exceed the capacity of the event dev.
>>>>>>> + * Configuring ports with different thresholds can make higher
>priority
>>>>>>> + * traffic less likely to be backpressured.
>>>>>>> + * For example, a port used to inject NIC Rx packets into the event
>dev
>>>>>>> + * can have a lower threshold so as not to overwhelm the device,
>>>>>>> + * while ports used for worker pools can have a higher threshold.
>>>>>>> + * This value cannot exceed the *nb_events_limit*
>>>>>>> + * which was previously supplied to rte_event_dev_configure().
>>>>>>> + * This should be set to '-1' for *open system*.
>>>>>>> + */
>>>>>>> + uint16_t dequeue_depth;
>>>>>>> + /**< Configure number of bulk dequeues for this event port.
>>>>>>> + * This value cannot exceed the *nb_event_port_dequeue_depth*
>>>>>>> + * which previously supplied to rte_event_dev_configure().
>>>>>>> + * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE
>capable.
>>>>>>> + */
>>>>>>> + uint16_t enqueue_depth;
>>>>>>> + /**< Configure number of bulk enqueues for this event port.
>>>>>>> + * This value cannot exceed the *nb_event_port_enqueue_depth*
>>>>>>> + * which previously supplied to rte_event_dev_configure().
>>>>>>> + * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE
>capable.
>>>>>>> + */
>>>>>>> uint8_t disable_implicit_release;
>>>>>>> /**< Configure the port not to release outstanding events in
>>>>>>> * rte_event_dev_dequeue_burst(). If true, all events received
>through
>>>>>>> @@ -733,6 +911,14 @@ struct rte_event_port_conf {
>>>>>>> rte_event_port_default_conf_get(uint8_t dev_id, uint8_t port_id,
>>>>>>> struct rte_event_port_conf *port_conf);
>>>>>>>
>>>>>>> +int
>>>>>>> +rte_event_port_default_conf_get_v20(uint8_t dev_id, uint8_t port_id,
>>>>>>> + struct rte_event_port_conf_v20 *port_conf);
>>>>>>> +
>>>>>>> +int
>>>>>>> +rte_event_port_default_conf_get_v21(uint8_t dev_id, uint8_t port_id,
>>>>>>> + struct rte_event_port_conf *port_conf);
>>>>>>
>>>>>> Hi Timothy,
>>>>>>
>>>>>> + ABI Maintainers (Ray, Neil)
>>>>>>
>>>>>> # As per my understanding, the structures can not be versioned, only
>>>>>> function can be versioned.
>>>>>> i.e we can not make any change to " struct rte_event_port_conf"
>>>>>
>>>>> So the answer is (as always): depends
>>>>>
>>>>> If the structure is being use in inline functions is when you run into trouble
>>>>> - as knowledge of the structure is embedded in the linked application.
>>>>>
>>>>> However if the structure is _strictly_ being used as a non-inlined function
>parameter,
>>>>> It can be safe to version in this way.
>>>>
>>>> But based on the optimization applied when building the consumer code
>>>> matters. Right?
>>>> i.e compiler can "inline" it, based on the optimization even the
>>>> source code explicitly mentions it.
>>>
>>> Well a compiler will typically only inline within the confines of a given object
>file, or
>>> binary, if LTO is enabled.
>>
>>>
>>> If a function symbol is exported from library however, it won't be inlined in a
>linked application.
>>
>> Yes, With respect to that function.
>> But the application can use struct rte_event_port_conf in their code
>> and it can be part of other structures.
>> Right?
>
>Tim, it looks like you might be inadvertently breaking other symbols also.
>For example ...
>
>int
>rte_event_crypto_adapter_create(uint8_t id, uint8_t dev_id,
> struct rte_event_port_conf *port_config,
> enum rte_event_crypto_adapter_mode mode);
>
>int
>rte_event_port_setup(uint8_t dev_id, uint8_t port_id,
> const struct rte_event_port_conf *port_conf);
>
>These and others symbols are also using rte_event_port_conf and would need to
>be updated to use the v20 struct,
>if we aren't to break them .
>
Yes, we just discovered that after successfully running the ABI checker. I will address those in the v3
patch set. Thanks.
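For readers following along, the mechanics being discussed rely on DPDK's function versioning macros: the old-layout function stays bound to the previous ABI node while the new one becomes the default. A rough sketch based on rte_function_versioning.h (and, as Ray points out, every other exported symbol taking the struct needs the same treatment):

```c
#include <rte_function_versioning.h>

/* Old-layout implementation, resolved by binaries linked against v20 */
int
rte_event_port_default_conf_get_v20(uint8_t dev_id, uint8_t port_id,
		struct rte_event_port_conf_v20 *port_conf);
VERSION_SYMBOL(rte_event_port_default_conf_get, _v20, 20.0);

/* New-layout implementation, the default for newly built applications */
int
rte_event_port_default_conf_get_v21(uint8_t dev_id, uint8_t port_id,
		struct rte_event_port_conf *port_conf);
BIND_DEFAULT_SYMBOL(rte_event_port_default_conf_get, _v21, 21);
MAP_STATIC_SYMBOL(rte_event_port_default_conf_get,
		rte_event_port_default_conf_get_v21);
```

Note that, as discussed earlier in the thread, this only works because the struct is passed through non-inlined functions; it does not protect applications that embed struct rte_event_port_conf directly in their own data structures.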
>>
>>
>>> The compiler doesn't have enough information to inline it.
>>> All the compiler will know about it is it's offset in memory, and it's signature.
>>>
>>>>
>>>>
>>>>>
>>>>> So just to be clear, it is still the function that is actually being versioned
>here.
>>>>>
>>>>>>
>>>>>> # We have a similar case with ethdev and it deferred to next release v20.11
>>>>>> http://patches.dpdk.org/patch/69113/
>>>>>
>>>>> Yes - I spent a why looking at this one, but I am struggling to recall,
>>>>> why when I looked it we didn't suggest function versioning as a potential
>solution in this case.
>>>>>
>>>>> Looking back at it now, looks like it would have been ok.
>>>>
>>>> Ok.
>>>>
>>>>>
>>>>>>
>>>>>> Regarding the API changes:
>>>>>> # The slow path changes general looks good to me. I will review the
>>>>>> next level in the coming days
>>>>>> # The following fast path changes bothers to me. Could you share more
>>>>>> details on below change?
>>>>>>
>>>>>> diff --git a/app/test-eventdev/test_order_atq.c
>>>>>> b/app/test-eventdev/test_order_atq.c
>>>>>> index 3366cfc..8246b96 100644
>>>>>> --- a/app/test-eventdev/test_order_atq.c
>>>>>> +++ b/app/test-eventdev/test_order_atq.c
>>>>>> @@ -34,6 +34,8 @@
>>>>>> continue;
>>>>>> }
>>>>>>
>>>>>> + ev.flow_id = ev.mbuf->udata64;
>>>>>> +
>>>>>> # Since RC1 is near, I am not sure how to accommodate the API changes
>>>>>> now and sort out ABI stuffs.
>>>>>> # Other concern is eventdev spec get bloated with versioning files
>>>>>> just for ONE release as 20.11 will be OK to change the ABI.
>>>>>> # While we discuss the API change, Please send deprecation notice for
>>>>>> ABI change for 20.11,
>>>>>> so that there is no ambiguity of this patch for the 20.11 release.
>>>>>>
^ permalink raw reply [relevance 3%]
* Re: [dpdk-dev] [PATCH 01/27] eventdev: dlb upstream prerequisites
@ 2020-07-02 15:21 0% ` Kinsella, Ray
2020-07-02 16:35 3% ` McDaniel, Timothy
0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2020-07-02 15:21 UTC (permalink / raw)
To: Jerin Jacob
Cc: Tim McDaniel, Neil Horman, Jerin Jacob, Mattias Rönnblom,
dpdk-dev, Gage Eads, Van Haaren, Harry
On 30/06/2020 13:14, Jerin Jacob wrote:
> On Tue, Jun 30, 2020 at 5:06 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>
>>
>>
>> On 30/06/2020 12:30, Jerin Jacob wrote:
>>> On Tue, Jun 30, 2020 at 4:52 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>>>
>>>>
>>>>
>>>> On 27/06/2020 08:44, Jerin Jacob wrote:
>>>>>> +
>>>>>> +/** Event port configuration structure */
>>>>>> +struct rte_event_port_conf_v20 {
>>>>>> + int32_t new_event_threshold;
>>>>>> + /**< A backpressure threshold for new event enqueues on this port.
>>>>>> + * Use for *closed system* event dev where event capacity is limited,
>>>>>> + * and cannot exceed the capacity of the event dev.
>>>>>> + * Configuring ports with different thresholds can make higher priority
>>>>>> + * traffic less likely to be backpressured.
>>>>>> + * For example, a port used to inject NIC Rx packets into the event dev
>>>>>> + * can have a lower threshold so as not to overwhelm the device,
>>>>>> + * while ports used for worker pools can have a higher threshold.
>>>>>> + * This value cannot exceed the *nb_events_limit*
>>>>>> + * which was previously supplied to rte_event_dev_configure().
>>>>>> + * This should be set to '-1' for *open system*.
>>>>>> + */
>>>>>> + uint16_t dequeue_depth;
>>>>>> + /**< Configure number of bulk dequeues for this event port.
>>>>>> + * This value cannot exceed the *nb_event_port_dequeue_depth*
>>>>>> + * which previously supplied to rte_event_dev_configure().
>>>>>> + * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE capable.
>>>>>> + */
>>>>>> + uint16_t enqueue_depth;
>>>>>> + /**< Configure number of bulk enqueues for this event port.
>>>>>> + * This value cannot exceed the *nb_event_port_enqueue_depth*
>>>>>> + * which previously supplied to rte_event_dev_configure().
>>>>>> + * Ignored when device is not RTE_EVENT_DEV_CAP_BURST_MODE capable.
>>>>>> + */
>>>>>> uint8_t disable_implicit_release;
>>>>>> /**< Configure the port not to release outstanding events in
>>>>>> * rte_event_dev_dequeue_burst(). If true, all events received through
>>>>>> @@ -733,6 +911,14 @@ struct rte_event_port_conf {
>>>>>> rte_event_port_default_conf_get(uint8_t dev_id, uint8_t port_id,
>>>>>> struct rte_event_port_conf *port_conf);
>>>>>>
>>>>>> +int
>>>>>> +rte_event_port_default_conf_get_v20(uint8_t dev_id, uint8_t port_id,
>>>>>> + struct rte_event_port_conf_v20 *port_conf);
>>>>>> +
>>>>>> +int
>>>>>> +rte_event_port_default_conf_get_v21(uint8_t dev_id, uint8_t port_id,
>>>>>> + struct rte_event_port_conf *port_conf);
>>>>>
>>>>> Hi Timothy,
>>>>>
>>>>> + ABI Maintainers (Ray, Neil)
>>>>>
>>>>> # As per my understanding, the structures can not be versioned, only
>>>>> function can be versioned.
>>>>> i.e we can not make any change to " struct rte_event_port_conf"
>>>>
>>>> So the answer is (as always): depends
>>>>
>>>> If the structure is being used in inline functions, that is when you run into trouble
>>>> - as knowledge of the structure is embedded in the linked application.
>>>>
>>>> However, if the structure is _strictly_ being used as a non-inlined function parameter,
>>>> it can be safe to version in this way.
>>>
>>> But the optimization applied when building the consumer code
>>> matters. Right?
>>> i.e. the compiler can "inline" it, based on the optimization, even if the
>>> source code explicitly mentions it.
>>
>> Well a compiler will typically only inline within the confines of a given object file, or
>> binary, if LTO is enabled.
>
>>
>> If a function symbol is exported from library however, it won't be inlined in a linked application.
>
> Yes, With respect to that function.
> But the application can use struct rte_event_port_conf in their code
> and it can be part of other structures.
> Right?
Tim, it looks like you might be inadvertently breaking other symbols also.
For example ...
int
rte_event_crypto_adapter_create(uint8_t id, uint8_t dev_id,
struct rte_event_port_conf *port_config,
enum rte_event_crypto_adapter_mode mode);
int
rte_event_port_setup(uint8_t dev_id, uint8_t port_id,
const struct rte_event_port_conf *port_conf);
These and other symbols also use rte_event_port_conf and would need to be updated to use the v20 struct,
if we aren't to break them.
>
>
>> The compiler doesn't have enough information to inline it.
>> All the compiler will know about it is its offset in memory and its signature.
>>
>>>
>>>
>>>>
>>>> So just to be clear, it is still the function that is actually being versioned here.
>>>>
>>>>>
>>>>> # We have a similar case with ethdev and it deferred to next release v20.11
>>>>> http://patches.dpdk.org/patch/69113/
>>>>
>>>> Yes - I spent a while looking at this one, but I am struggling to recall
>>>> why, when I looked at it, we didn't suggest function versioning as a potential solution in this case.
>>>>
>>>> Looking back at it now, looks like it would have been ok.
>>>
>>> Ok.
>>>
>>>>
>>>>>
>>>>> Regarding the API changes:
>>>>> # The slow path changes general looks good to me. I will review the
>>>>> next level in the coming days
>>>>> # The following fast path changes bothers to me. Could you share more
>>>>> details on below change?
>>>>>
>>>>> diff --git a/app/test-eventdev/test_order_atq.c
>>>>> b/app/test-eventdev/test_order_atq.c
>>>>> index 3366cfc..8246b96 100644
>>>>> --- a/app/test-eventdev/test_order_atq.c
>>>>> +++ b/app/test-eventdev/test_order_atq.c
>>>>> @@ -34,6 +34,8 @@
>>>>> continue;
>>>>> }
>>>>>
>>>>> + ev.flow_id = ev.mbuf->udata64;
>>>>> +
>>>>> # Since RC1 is near, I am not sure how to accommodate the API changes
>>>>> now and sort out ABI stuffs.
>>>>> # Other concern is eventdev spec get bloated with versioning files
>>>>> just for ONE release as 20.11 will be OK to change the ABI.
>>>>> # While we discuss the API change, Please send deprecation notice for
>>>>> ABI change for 20.11,
>>>>> so that there is no ambiguity of this patch for the 20.11 release.
>>>>>
^ permalink raw reply [relevance 0%]
* [dpdk-dev] DPDK Release Status Meeting 2/07/2020
@ 2020-07-02 14:58 4% Ferruh Yigit
0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2020-07-02 14:58 UTC (permalink / raw)
To: dev; +Cc: Thomas Monjalon
Minutes 2 July 2020
-------------------
Agenda:
* Release Dates
* Highlights
* Subtrees
* LTS
Participants:
* Arm
* Debian/Microsoft
* Intel
* Marvell
* Nvidia
* NXP
* Red Hat
Release Dates
-------------
* v20.08 dates:
* -rc1: Wednesday, 8 July 2020
* -rc2: Monday, 20 July 2020
* Release: Tuesday, 4 August 2020
* v20.11 proposal dates, please comment:
* Proposal/V1: Wednesday, 2 September 2020
* -rc1: Wednesday, 30 September 2020
* -rc2: Friday, 16 October 2020
* Release: Friday, 6 November 2020
Highlights
----------
* We are close to -rc1 but there are still lots of patches in the backlog
waiting for review.
*Please help with code reviews*; missing code reviews may lead to some
features missing the release.
Please check "call for reviews" email for the list of patches to review:
https://mails.dpdk.org/archives/announce/2020-June/000329.html
* Please subscribe to patchwork to be able to update the status of your patches;
not updating them adds overhead for the maintainers.
* We are observing an issue at Intel where patchwork registration and
lost-password emails are not being received.
* If anyone else outside Intel is having the same problem, please reach
out to help analyze the problem.
* Within Intel, please reach out to Ferruh if there are patches whose status
needs updating in patchwork and you don't have access.
Subtrees
--------
* main
* Started to merge ring and vfio patches
* Would like to close following
* non-EAL threads as lcore from David
* rte_log registration usage improvement from Jerin
* if-proxy
* Stephen reviewed the patch
* regex
* Waiting for PMD implementations. How many PMDs are required for merge?
* A HW and two SW PMDs were planned
* It is worrying that ethdev doesn't have enough review
* Jerin reviewed some rte_flow ones
* next-net
* Pulled from vendor sub-trees
* Some big base update patches from Intel and bnxt merged
* Need to get ethdev patches before -rc1, requires more review
* next-crypto
* Reviewed half of the backlog
* Will be good for -rc1
* cryptodev patches has been reviewed
* next-eventdev
* Almost ready for -rc1
* The new version of the Intel DLB PMD still has ABI breakage
* Postponed to the next release because of the ABI break
* No controversial issues otherwise
* next-virtio
* Maxime did a pull request for the majority of the patches
* Maxime sent a status for the remaining ones
* 2 patches for the async datapath, which look good
* 2 patches for vhost-user protocol features
* These have a dependency on QEMU
* Adrian from Red Hat will take over the patches
* Performance optimization (loops vectorization)
* Waiting for new version
* Not critical for this release, may be postponed if needed
* Chenbo is managing the virtio patches during Maxime's absence
* next-net-intel
* Qi is actively merging patches
* Some base code updates already merged
* DCF datapath merged
* next-net-mlx
* Some patches already merged
* Expecting more but not many
* next-net-mrvl
* A few patches merged
* Two more patches for -rc1
* Changes were requested for the qede patches; they can be merged when ready
LTS
---
* v18.11.9-rc2 is out, please test
* https://mails.dpdk.org/archives/dev/2020-June/171690.html
* OvS testing reported an issue
* A workaround may exist for it
* Nvidia reported an error
* This is not a regression in the 18.11.9 release
* The release is planned for the end of this week or early next week
DPDK Release Status Meetings
============================
The DPDK Release Status Meeting is intended for DPDK Committers to discuss the
status of the master tree and sub-trees, and for project managers to track
progress or milestone dates.
The meeting occurs every Thursday at 8:30 UTC on https://meet.jit.si/DPDK
If you wish to attend just send an email to
"John McNamara <john.mcnamara@intel.com>" for the invite.
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH v3 8/9] devtools: support python3 only
@ 2020-07-02 10:37 4% ` Louise Kilheeney
0 siblings, 0 replies; 200+ results
From: Louise Kilheeney @ 2020-07-02 10:37 UTC (permalink / raw)
To: dev
Cc: robin.jarry, anatoly.burakov, bruce.richardson, Louise Kilheeney,
Neil Horman, Ray Kinsella
Changed the script to explicitly use python3 only, to avoid
maintaining python 2 support.
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Ray Kinsella <mdr@ashroe.eu>
Signed-off-by: Louise Kilheeney <louise.kilheeney@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
---
devtools/update_version_map_abi.py | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/devtools/update_version_map_abi.py b/devtools/update_version_map_abi.py
index e2104e61e..830e6c58c 100755
--- a/devtools/update_version_map_abi.py
+++ b/devtools/update_version_map_abi.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
# SPDX-License-Identifier: BSD-3-Clause
# Copyright(c) 2019 Intel Corporation
@@ -9,7 +9,6 @@
from the devtools/update-abi.sh utility.
"""
-from __future__ import print_function
import argparse
import sys
import re
--
2.17.1
^ permalink raw reply [relevance 4%]
* [dpdk-dev] [PATCH (v20.11) 2/2] eventdev: reserve space in timer structs for extension
@ 2020-07-02 6:19 4% ` pbhagavatula
0 siblings, 0 replies; 200+ results
From: pbhagavatula @ 2020-07-02 6:19 UTC (permalink / raw)
To: jerinj, Erik Gabriel Carrillo; +Cc: dev, Pavan Nikhilesh
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
The structs rte_event_timer_adapter and rte_event_timer_adapter_data are
meant to be used internally only, but there is a chance that
increasing their size would break the ABI for some applications.
In order to allow smooth addition of features without breaking
ABI compatibility, reserve some space.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
lib/librte_eventdev/rte_event_timer_adapter.h | 5 +++++
lib/librte_eventdev/rte_event_timer_adapter_pmd.h | 5 +++++
2 files changed, 10 insertions(+)
diff --git a/lib/librte_eventdev/rte_event_timer_adapter.h b/lib/librte_eventdev/rte_event_timer_adapter.h
index f83d85f4d..ce57a990a 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter.h
+++ b/lib/librte_eventdev/rte_event_timer_adapter.h
@@ -529,6 +529,11 @@ struct rte_event_timer_adapter {
RTE_STD_C11
uint8_t allocated : 1;
/**< Flag to indicate that this adapter has been allocated */
+
+ uint64_t reserved_64s[4];
+ /**< Reserved for future fields */
+ void *reserved_ptrs[4];
+ /**< Reserved for future fields */
} __rte_cache_aligned;
#define ADAPTER_VALID_OR_ERR_RET(adapter, retval) do { \
diff --git a/lib/librte_eventdev/rte_event_timer_adapter_pmd.h b/lib/librte_eventdev/rte_event_timer_adapter_pmd.h
index cf3509dc6..0a6682833 100644
--- a/lib/librte_eventdev/rte_event_timer_adapter_pmd.h
+++ b/lib/librte_eventdev/rte_event_timer_adapter_pmd.h
@@ -105,6 +105,11 @@ struct rte_event_timer_adapter_data {
RTE_STD_C11
uint8_t started : 1;
/**< Flag to indicate adapter started. */
+
+ uint64_t reserved_64s[4];
+ /**< Reserved for future fields */
+ void *reserved_ptrs[4];
+ /**< Reserved for future fields */
} __rte_cache_aligned;
#ifdef __cplusplus
--
2.17.1
^ permalink raw reply [relevance 4%]
2018-01-15 16:16 [dpdk-dev] [PATCH v6] sched: make RED scaling configurable alangordondewar
2019-04-08 8:53 ` [dpdk-dev] [PATCH v7] " Thomas Monjalon
2019-04-08 13:29 ` Dumitrescu, Cristian
2020-07-06 23:09 3% ` Thomas Monjalon
2019-09-06 9:45 [dpdk-dev] [PATCH v2 0/6] RCU integration with LPM library Ruifeng Wang
2020-06-29 8:02 ` [dpdk-dev] [PATCH v5 0/3] " Ruifeng Wang
2020-06-29 8:02 ` [dpdk-dev] [PATCH v5 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-06-29 11:56 ` David Marchand
2020-07-04 17:00 3% ` Ruifeng Wang
2020-07-07 14:40 3% ` [dpdk-dev] [PATCH v6 0/3] RCU integration with LPM library Ruifeng Wang
2020-07-07 15:15 3% ` [dpdk-dev] [PATCH v7 " Ruifeng Wang
2020-07-07 15:15 ` [dpdk-dev] [PATCH v7 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-08 14:30 2% ` David Marchand
2020-07-08 15:34 5% ` Ruifeng Wang
2020-07-09 8:02 4% ` [dpdk-dev] [PATCH v8 0/3] RCU integration with LPM library Ruifeng Wang
2020-07-09 8:02 2% ` [dpdk-dev] [PATCH v8 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-09 15:42 4% ` [dpdk-dev] [PATCH v9 0/3] RCU integration with LPM library Ruifeng Wang
2020-07-09 15:42 2% ` [dpdk-dev] [PATCH v9 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-10 2:22 4% ` [dpdk-dev] [PATCH v10 0/3] RCU integration with LPM library Ruifeng Wang
2020-07-10 2:22 2% ` [dpdk-dev] [PATCH v10 1/3] lib/lpm: integrate RCU QSBR Ruifeng Wang
2020-07-10 2:29 0% ` Ruifeng Wang
2020-02-20 13:18 [dpdk-dev] [PATCH] lib/cmdline_rdline: increase command line buf size Wisam Jaddo
2020-02-20 14:53 ` [dpdk-dev] [PATCH v3] cmdline: increase maximum line length Wisam Jaddo
2020-02-22 15:28 ` David Marchand
2020-07-31 12:55 0% ` Olivier Matz
2020-07-31 13:00 0% ` David Marchand
2020-07-31 15:46 0% ` Stephen Hemminger
2020-03-05 4:33 [dpdk-dev] [RFC v1 1/1] vfio: set vf token and gain vf device access vattunuru
2020-07-03 14:57 4% ` [dpdk-dev] [PATCH v17 0/2] support for VFIO-PCI VF token interface Haiyue Wang
2020-04-21 2:04 [dpdk-dev] [PATCH] devtools: remove useless files from ABI reference Thomas Monjalon
2020-05-24 17:43 ` [dpdk-dev] [PATCH v2] " Thomas Monjalon
2020-05-28 13:16 ` David Marchand
2020-07-03 9:08 4% ` David Marchand
2020-05-21 13:20 [dpdk-dev] [PATCH 20.08] mempool/ring: add support for new ring sync modes Konstantin Ananyev
2020-06-29 16:10 ` [dpdk-dev] [PATCH v2] " Konstantin Ananyev
2020-07-09 16:18 ` Olivier Matz
2020-07-09 17:55 ` Ananyev, Konstantin
2020-07-10 12:52 ` Olivier Matz
2020-07-10 15:15 ` Ananyev, Konstantin
2020-07-10 15:20 ` Ananyev, Konstantin
2020-07-13 13:30 ` Olivier Matz
2020-07-13 14:46 ` Ananyev, Konstantin
2020-07-13 15:00 3% ` Olivier Matz
2020-07-13 16:29 0% ` Ananyev, Konstantin
2020-05-22 6:58 [dpdk-dev] [PATCH 0/3] Experimental/internal libraries cleanup David Marchand
2020-06-26 8:16 ` [dpdk-dev] [PATCH v3 " David Marchand
2020-07-05 19:55 3% ` Thomas Monjalon
2020-07-06 8:02 3% ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
2020-07-06 8:12 0% ` Thomas Monjalon
2020-07-06 16:57 0% ` [dpdk-dev] " Medvedkin, Vladimir
2020-05-22 13:23 [dpdk-dev] [PATCH 20.08 0/9] adding support for python 3 only Louise Kilheeney
2020-07-02 10:37 ` [dpdk-dev] [PATCH v3 0/9] adding " Louise Kilheeney
2020-07-02 10:37 4% ` [dpdk-dev] [PATCH v3 8/9] devtools: support python3 only Louise Kilheeney
2020-05-31 14:43 [dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item Dekel Peled
2020-06-02 14:32 ` Andrew Rybchenko
2020-06-02 18:28 ` Ori Kam
2020-06-02 19:04 ` Adrien Mazarguil
2020-07-05 13:13 0% ` Andrew Rybchenko
2020-08-03 17:01 3% ` [dpdk-dev] [RFC v2] ethdev: add extensions attributes " Dekel Peled
2020-08-03 17:11 3% ` [dpdk-dev] [RFC v3] " Dekel Peled
2020-06-04 21:02 [dpdk-dev] [RFC] doc: change to diverse and inclusive language Stephen Hemminger
2020-07-30 0:57 ` [dpdk-dev] [PATCH v2 20.08 0/6] inclusive language fixes and deprecation notices Stephen Hemminger
2020-07-30 0:57 ` [dpdk-dev] [PATCH v2 20.08 1/6] doc: announce deprecation of master lcore Stephen Hemminger
2020-07-30 8:42 3% ` Bruce Richardson
2020-07-30 0:58 ` [dpdk-dev] [PATCH v2 20.08 4/6] doc: announce deprecation blacklist/whitelist Stephen Hemminger
2020-07-30 8:45 3% ` Bruce Richardson
2020-07-30 15:10 0% ` Stephen Hemminger
2020-06-07 17:01 [dpdk-dev] [PATCH 0/9] Rename blacklist/whitelist to blocklist/allowlist Stephen Hemminger
2020-07-15 23:02 ` [dpdk-dev] [PATCH v5 0/9] rename blacklist/whitelist to exclude/include Stephen Hemminger
2020-07-15 23:02 1% ` [dpdk-dev] [PATCH v5 9/9] doc: replace references to blacklist/whitelist Stephen Hemminger
2020-06-10 6:38 [dpdk-dev] [RFC] mbuf: accurate packet Tx scheduling Viacheslav Ovsiienko
2020-07-01 15:36 ` [dpdk-dev] [PATCH 1/2] mbuf: introduce " Viacheslav Ovsiienko
2020-07-07 11:50 0% ` Olivier Matz
2020-07-07 12:46 0% ` Slava Ovsiienko
2020-07-07 12:59 2% ` [dpdk-dev] [PATCH v2 " Viacheslav Ovsiienko
2020-07-07 13:08 2% ` [dpdk-dev] [PATCH v3 " Viacheslav Ovsiienko
2020-07-07 14:32 0% ` Olivier Matz
2020-07-07 14:57 2% ` [dpdk-dev] [PATCH v4 " Viacheslav Ovsiienko
2020-07-07 15:23 0% ` Olivier Matz
2020-07-08 14:16 0% ` [dpdk-dev] [PATCH v4 1/2] mbuf: introduce accurate packet Txscheduling Morten Brørup
2020-07-08 14:54 0% ` Slava Ovsiienko
2020-07-08 15:27 0% ` Morten Brørup
2020-07-08 15:51 0% ` Slava Ovsiienko
2020-07-08 15:47 2% ` [dpdk-dev] [PATCH v5 1/2] mbuf: introduce accurate packet Tx scheduling Viacheslav Ovsiienko
2020-07-08 16:05 0% ` Slava Ovsiienko
2020-07-09 12:36 2% ` [dpdk-dev] [PATCH v6 " Viacheslav Ovsiienko
2020-07-09 23:47 0% ` Ferruh Yigit
2020-07-10 12:32 0% ` Slava Ovsiienko
2020-07-10 12:39 2% ` [dpdk-dev] [PATCH v7 " Viacheslav Ovsiienko
2020-07-10 15:46 0% ` Slava Ovsiienko
2020-07-10 22:07 0% ` Ferruh Yigit
2020-06-10 14:44 [dpdk-dev] [PATCH 0/7] Register external threads as lcore David Marchand
2020-07-06 14:15 ` [dpdk-dev] [PATCH v5 00/10] Register non-EAL " David Marchand
2020-07-06 14:15 3% ` [dpdk-dev] [PATCH v5 02/10] eal: fix multiple definition of per lcore thread id David Marchand
2020-07-06 14:16 3% ` [dpdk-dev] [PATCH v5 04/10] eal: introduce thread uninit helper David Marchand
2020-07-06 20:52 ` [dpdk-dev] [PATCH v6 00/10] Register non-EAL threads as lcore David Marchand
2020-07-06 20:52 3% ` [dpdk-dev] [PATCH v6 02/10] eal: fix multiple definition of per lcore thread id David Marchand
2020-07-06 20:52 3% ` [dpdk-dev] [PATCH v6 04/10] eal: introduce thread uninit helper David Marchand
2020-06-10 17:17 [dpdk-dev] [RFC PATCH 1/6] eal: introduce macros for getting value for bit Parav Pandit
2020-07-24 14:38 ` [dpdk-dev] [PATCH v10 00/10] Improve mlx5 PMD driver framework for multiple classes Parav Pandit
2020-07-24 14:38 ` [dpdk-dev] [PATCH v10 01/10] eal: introduce macro for bit definition Parav Pandit
2020-07-24 18:31 ` Honnappa Nagarahalli
2020-07-27 8:21 ` Morten Brørup
2020-07-28 2:18 ` Honnappa Nagarahalli
2020-07-28 8:24 3% ` Morten Brørup
2020-07-28 9:29 0% ` Gaëtan Rivet
2020-07-28 11:11 0% ` Morten Brørup
2020-07-28 15:39 0% ` Honnappa Nagarahalli
2020-06-11 10:24 [dpdk-dev] [PATCH 1/2] eal: remove redundant code Phil Yang
2020-06-11 10:24 ` [dpdk-dev] [PATCH 2/2] eal: use c11 atomics for interrupt status Phil Yang
2020-07-08 12:29 3% ` David Marchand
2020-07-08 13:43 0% ` Aaron Conole
2020-07-08 15:04 0% ` Kinsella, Ray
2020-07-09 5:21 0% ` Phil Yang
2020-07-09 6:46 3% ` [dpdk-dev] [PATCH v2] eal: use c11 atomic built-ins " Phil Yang
2020-07-09 8:02 0% ` Stefan Puiu
2020-07-09 8:34 3% ` [dpdk-dev] [PATCH v3] " Phil Yang
2020-07-09 10:30 0% ` David Marchand
2020-07-10 7:18 3% ` Dodji Seketeli
2020-06-11 10:26 [dpdk-dev] [PATCH] mbuf: use c11 atomics for refcnt operations Phil Yang
2020-07-03 15:38 3% ` David Marchand
2020-07-06 8:03 3% ` Phil Yang
2020-07-07 10:10 3% ` [dpdk-dev] [PATCH v2] mbuf: use C11 " Phil Yang
2020-07-08 5:11 3% ` Phil Yang
2020-07-08 11:44 0% ` Olivier Matz
2020-07-09 10:00 3% ` Phil Yang
2020-07-09 10:10 4% ` [dpdk-dev] [PATCH v3] mbuf: use C11 atomic built-ins " Phil Yang
2020-07-09 11:03 3% ` Olivier Matz
2020-07-09 13:00 3% ` Phil Yang
2020-07-09 13:31 0% ` Honnappa Nagarahalli
2020-07-09 14:10 0% ` Phil Yang
2020-07-09 15:58 4% ` [dpdk-dev] [PATCH v4 1/2] " Phil Yang
2020-07-15 12:29 0% ` David Marchand
2020-07-15 12:49 0% ` Aaron Conole
2020-07-15 16:29 0% ` Phil Yang
2020-07-16 4:16 0% ` Phil Yang
2020-07-16 11:30 4% ` David Marchand
2020-07-17 4:36 4% ` [dpdk-dev] [PATCH v5 1/2] mbuf: use C11 atomic builtins " Phil Yang
2020-06-12 11:19 [dpdk-dev] [PATCH 1/3] eventdev: fix race condition on timer list counter Phil Yang
2020-07-02 5:26 ` [dpdk-dev] [PATCH v2 1/4] " Phil Yang
2020-07-02 5:26 ` [dpdk-dev] [PATCH v2 4/4] eventdev: relax smp barriers with c11 atomics Phil Yang
2020-07-06 10:04 4% ` Thomas Monjalon
2020-07-06 15:32 0% ` Phil Yang
2020-07-06 15:40 0% ` Thomas Monjalon
2020-07-07 11:13 ` [dpdk-dev] [PATCH v3 1/4] eventdev: fix race condition on timer list counter Phil Yang
2020-07-07 11:13 4% ` [dpdk-dev] [PATCH v3 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
2020-07-07 14:29 0% ` Jerin Jacob
2020-07-07 15:56 0% ` Phil Yang
2020-07-07 15:54 ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Phil Yang
2020-07-07 15:54 4% ` [dpdk-dev] [PATCH v4 4/4] eventdev: relax smp barriers with C11 atomics Phil Yang
2020-07-08 13:30 4% ` [dpdk-dev] [PATCH v4 1/4] eventdev: fix race condition on timer list counter Jerin Jacob
2020-07-08 15:01 0% ` Thomas Monjalon
2020-06-13 0:00 [dpdk-dev] [PATCH v3 00/10] rename blacklist/whitelist to block/allow Stephen Hemminger
2020-07-10 15:06 3% ` David Marchand
2020-07-14 4:43 0% ` Stephen Hemminger
2020-07-14 5:39 ` [dpdk-dev] [PATCH v4 00/11] rename blacklist/whitelist to exclude/include Stephen Hemminger
2020-07-14 5:39 4% ` [dpdk-dev] [PATCH v4 09/11] doc: add note about blacklist/whitelist changes Stephen Hemminger
2020-06-20 21:05 [dpdk-dev] [PATCH 0/7] cmdline: support Windows Dmitry Kozlyuk
2020-06-20 21:05 ` [dpdk-dev] [PATCH 6/7] " Dmitry Kozlyuk
2020-06-28 14:20 ` Fady Bader
2020-06-29 6:23 ` Ranjit Menon
2020-06-29 7:42 ` Dmitry Kozlyuk
2020-06-29 8:12 ` Tal Shnaiderman
2020-06-29 23:56 ` Dmitry Kozlyuk
2020-07-08 1:09 0% ` Dmitry Kozlyuk
2020-06-23 13:49 [dpdk-dev] [PATCH] doc: mark internal symbols in ethdev Ferruh Yigit
2020-06-26 8:49 ` Kinsella, Ray
2020-07-10 14:20 0% ` Thomas Monjalon
2020-07-10 16:17 0% ` Ferruh Yigit
2020-06-26 23:14 [dpdk-dev] [20.11, PATCH] bbdev: remove experimental tag from API Nicolas Chautru
2020-06-30 7:30 ` David Marchand
2020-06-30 7:35 ` Akhil Goyal
2020-07-02 17:54 0% ` Akhil Goyal
2020-07-02 18:02 3% ` Chautru, Nicolas
2020-07-02 18:09 4% ` Akhil Goyal
2020-06-27 4:37 [dpdk-dev] [PATCH 00/27] event/dlb Intel DLB PMD Tim McDaniel
2020-06-27 4:37 ` [dpdk-dev] [PATCH 01/27] eventdev: dlb upstream prerequisites Tim McDaniel
2020-06-27 7:44 ` Jerin Jacob
2020-06-30 11:22 ` Kinsella, Ray
2020-06-30 11:30 ` Jerin Jacob
2020-06-30 11:36 ` Kinsella, Ray
2020-06-30 12:14 ` Jerin Jacob
2020-07-02 15:21 0% ` Kinsella, Ray
2020-07-02 16:35 3% ` McDaniel, Timothy
2020-07-02 6:19 [dpdk-dev] [PATCH (v20.11) 1/2] eventdev: reserve space in config structs for extension pbhagavatula
2020-07-02 6:19 4% ` [dpdk-dev] [PATCH (v20.11) 2/2] eventdev: reserve space in timer " pbhagavatula
2020-07-02 14:58 4% [dpdk-dev] DPDK Release Status Meeting 2/07/2020 Ferruh Yigit
2020-07-02 22:13 [dpdk-dev] [PATCH] doc: announce changes to eventdev public data structures McDaniel, Timothy
2020-07-30 16:33 ` McDaniel, Timothy
2020-07-30 18:48 3% ` Jerin Jacob
2020-07-31 18:51 5% ` McDaniel, Timothy
2020-07-31 19:03 0% ` Jerin Jacob
2020-07-31 19:31 5% ` McDaniel, Timothy
2020-08-03 6:09 4% ` Jerin Jacob
2020-08-03 17:55 13% ` [dpdk-dev] [PATCH] doc: eventdev ABI change to support DLB PMD McDaniel, Timothy
2020-08-04 7:38 4% ` Jerin Jacob
2020-08-04 13:46 4% ` Van Haaren, Harry
2020-07-03 10:26 [dpdk-dev] [PATCH 0/3] ring clean up Feifei Wang
2020-07-03 10:26 ` [dpdk-dev] [PATCH 1/3] ring: remove experimental tag for ring reset API Feifei Wang
2020-07-03 16:16 4% ` Kinsella, Ray
2020-07-03 18:46 3% ` Honnappa Nagarahalli
2020-07-06 6:23 3% ` Kinsella, Ray
2020-07-07 3:19 3% ` Feifei Wang
2020-07-07 7:40 0% ` Kinsella, Ray
2020-07-03 10:26 ` [dpdk-dev] [PATCH 2/3] ring: remove experimental tag for ring element APIs Feifei Wang
2020-07-03 16:17 3% ` Kinsella, Ray
2020-07-03 17:15 4% [dpdk-dev] [PATCH] doc: add sample for ABI checks in contribution guide Ferruh Yigit
2020-07-05 3:41 [dpdk-dev] [pull-request] next-eventdev 20.08 RC1 Jerin Jacob Kollanukkaran
2020-07-06 9:57 3% ` Thomas Monjalon
2020-07-05 11:46 [dpdk-dev] [PATCH v5 0/3] build mempool on Windows Fady Bader
2020-07-05 13:47 ` [dpdk-dev] [PATCH v6 " Fady Bader
2020-07-05 13:47 ` [dpdk-dev] [PATCH v6 1/3] eal: disable function versioning " Fady Bader
2020-07-05 20:23 4% ` Thomas Monjalon
2020-07-06 7:02 0% ` Fady Bader
2020-07-06 11:32 ` [dpdk-dev] [PATCH v7 0/3] build mempool " Fady Bader
2020-07-06 11:32 5% ` [dpdk-dev] [PATCH v7 1/3] eal: disable function versioning " Fady Bader
2020-07-06 12:22 0% ` Bruce Richardson
2020-07-06 23:16 0% ` Thomas Monjalon
2020-07-07 9:03 [dpdk-dev] [PATCH] lib/librte_timer:fix corruption with reset Sarosh Arif
2020-07-10 6:59 ` [dpdk-dev] [PATCH v3] " Sarosh Arif
2020-07-10 15:19 3% ` Stephen Hemminger
2020-07-28 19:04 3% ` Carrillo, Erik G
2020-07-07 14:45 8% [dpdk-dev] [PATCH v1 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 14:45 24% ` [dpdk-dev] [PATCH v1 1/2] doc: reword abi policy for windows Ray Kinsella
2020-07-07 15:23 7% ` Thomas Monjalon
2020-07-07 16:33 4% ` Kinsella, Ray
2020-07-07 14:45 12% ` [dpdk-dev] [PATCH v1 2/2] doc: clarify alias to experimental period Ray Kinsella
2020-07-07 15:26 0% ` Thomas Monjalon
2020-07-07 16:35 3% ` Kinsella, Ray
2020-07-07 16:36 0% ` Thomas Monjalon
2020-07-07 16:37 0% ` Kinsella, Ray
2020-07-07 16:55 0% ` Honnappa Nagarahalli
2020-07-07 17:00 0% ` Thomas Monjalon
2020-07-07 17:01 0% ` Kinsella, Ray
2020-07-07 16:57 0% ` Thomas Monjalon
2020-07-07 17:01 4% ` Kinsella, Ray
2020-07-07 17:08 0% ` Thomas Monjalon
2020-07-07 17:50 8% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Ray Kinsella
2020-07-07 17:51 24% ` [dpdk-dev] [PATCH v2 1/2] doc: reword abi policy for windows Ray Kinsella
2020-07-07 17:51 12% ` [dpdk-dev] [PATCH v2 2/2] doc: clarify alias to experimental period Ray Kinsella
2020-07-07 18:44 0% ` Honnappa Nagarahalli
2020-07-08 10:32 7% ` [dpdk-dev] [PATCH v2 0/2] doc: minor abi policy fixes Thomas Monjalon
2020-07-08 12:02 4% ` Kinsella, Ray
2020-07-07 15:36 [dpdk-dev] [PATCH v4 0/2] rte_flow: introduce eCPRI item for rte_flow Bing Zhao
2020-07-10 8:45 ` [dpdk-dev] [PATCH v5 " Bing Zhao
2020-07-10 8:45 ` [dpdk-dev] [PATCH v5 1/2] rte_flow: add eCPRI key fields to flow API Bing Zhao
2020-07-10 14:31 ` Olivier Matz
2020-07-11 4:25 ` Bing Zhao
2020-07-12 13:17 3% ` Olivier Matz
2020-07-12 14:28 0% ` Bing Zhao
2020-07-12 14:43 0% ` Olivier Matz
2020-07-08 10:22 25% [dpdk-dev] [PATCH] devtools: give some hints for ABI errors David Marchand
2020-07-08 13:09 7% ` Kinsella, Ray
2020-07-08 13:15 4% ` David Marchand
2020-07-08 13:22 4% ` Kinsella, Ray
2020-07-08 13:45 7% ` Aaron Conole
2020-07-08 14:01 4% ` Kinsella, Ray
2020-07-09 15:52 4% ` Dodji Seketeli
2020-07-10 7:37 4% ` Kinsella, Ray
2020-07-10 10:58 4% ` Neil Horman
2020-07-15 12:15 25% ` [dpdk-dev] [PATCH v2] " David Marchand
2020-07-15 12:48 4% ` Aaron Conole
2020-07-16 7:29 4% ` David Marchand
[not found] <20200703102651.8918-1>
2020-07-09 6:12 ` [dpdk-dev] [PATCH v2 0/3] ring clean up Feifei Wang
2020-07-09 6:12 3% ` [dpdk-dev] [PATCH v2 1/3] ring: remove experimental tag for ring reset API Feifei Wang
2020-07-09 6:12 3% ` [dpdk-dev] [PATCH v2 2/3] ring: remove experimental tag for ring element APIs Feifei Wang
2020-07-09 6:53 4% [dpdk-dev] [PATCH] devtools: fix ninja break under default DESTDIR path Phil Yang
2020-07-09 15:20 4% [dpdk-dev] [PATCH 20.11 0/5] Enhance rawdev APIs Bruce Richardson
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 1/5] rawdev: add private data length parameter to info fn Bruce Richardson
2020-07-12 14:13 0% ` Xu, Rosen
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 3/5] rawdev: add private data length parameter to config fn Bruce Richardson
2020-07-12 14:13 0% ` Xu, Rosen
2020-07-09 15:20 3% ` [dpdk-dev] [PATCH 20.11 4/5] rawdev: add private data length parameter to queue fns Bruce Richardson
2020-07-13 9:57 3% [dpdk-dev] The mbuf API needs some cleaning up Morten Brørup
2020-07-31 15:24 0% ` Olivier Matz
2020-08-03 8:42 0% ` Morten Brørup
2020-07-13 12:31 [dpdk-dev] [PATCH 0/2] Deprecation notice updates Bruce Richardson
2020-07-13 12:31 5% ` [dpdk-dev] [PATCH 2/2] doc: add deprecation notice for change of rawdev APIs Bruce Richardson
2020-07-13 12:48 5% ` Hemant Agrawal
2020-07-20 11:35 0% ` Ananyev, Konstantin
2020-07-23 1:55 5% ` Xu, Rosen
2020-07-23 7:38 5% ` Hemant Agrawal
2020-07-14 11:32 [dpdk-dev] [PATCH] net/dpaa: announce extended definition of port_id in API 'rte_pmd_dpaa_set_tx_loopback' Sachin Saxena (OSS)
2020-07-23 9:23 ` Yang, Zhiyong
2020-07-23 14:34 4% ` Ferruh Yigit
2020-07-15 18:27 3% [dpdk-dev] [RFC PATCH 0/2] Enable dyynamic configuration of subport bandwidth profile Savinay Dharmappa
2020-07-16 8:14 0% ` Singh, Jasvinder
2020-07-16 5:19 [dpdk-dev] [PATCH] lpm: fix unchecked return value Ruifeng Wang
2020-07-16 15:49 ` [dpdk-dev] [PATCH v2] " Ruifeng Wang
2020-07-17 17:12 ` Medvedkin, Vladimir
2020-07-18 9:22 4% ` Ruifeng Wang
2020-07-21 16:23 0% ` Medvedkin, Vladimir
2020-07-21 17:10 3% ` Bruce Richardson
2020-07-21 17:33 0% ` David Marchand
2020-07-22 1:01 3% [dpdk-dev] [dpdk-announce] release candidate 20.08-rc2 Thomas Monjalon
2020-07-29 14:46 [dpdk-dev] [PATCH] [RFC] doc: announce removal of crypto list end enumerators Arek Kusztal
2020-07-29 15:18 3% ` Bruce Richardson
2020-07-30 10:37 3% [dpdk-dev] DPDK Release Status Meeting 30/07/2020 Ferruh Yigit
[not found] <1593232671-5690-0-git-send-email-timothy.mcdaniel@intel.com>
2020-07-30 19:49 ` [dpdk-dev] [PATCH 00/27] Add Intel DLM PMD to 20.11 McDaniel, Timothy
2020-07-30 19:49 1% ` [dpdk-dev] [PATCH 03/27] event/dlb: add shared code version 10.7.9 McDaniel, Timothy
2020-07-30 19:49 1% ` [dpdk-dev] [PATCH 08/27] event/dlb: add definitions shared with LKM or shared code McDaniel, Timothy
2020-08-02 10:51 [dpdk-dev] [PATCH] doc: announce changes to eventdev public structures pbhagavatula
2020-08-03 7:29 ` [dpdk-dev] [PATCH v2] doc: add reserve fields " pbhagavatula
2020-08-04 10:41 4% ` Bruce Richardson
2020-08-04 11:37 0% ` Jerin Jacob
2020-08-03 19:51 9% [dpdk-dev] [PATCH] doc: announce change in IPv6 item struct Dekel Peled
2020-08-04 13:17 0% ` Dekel Peled