DPDK patches and discussions
 help / color / mirror / Atom feed
* [RFC 0/4] ethdev: rework config restore
@ 2024-09-18  9:21 Dariusz Sosnowski
  2024-09-18  9:21 ` [RFC 1/4] " Dariusz Sosnowski
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-09-18  9:21 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko; +Cc: dev

Hi all,

We have been working on optimizing the latency of calls to rte_eth_dev_start(),
on ports spawned by mlx5 PMD. Most of the work requires changes in the implementation of
.dev_start() PMD callback, but I also wanted to start a discussion regarding
configuration restore.

rte_eth_dev_start() does a few things on top of calling .dev_start() callback:

- Before calling it:
    - eth_dev_mac_restore() - if device supports RTE_ETH_DEV_NOLIVE_MAC_ADDR;
- After calling it:
    - eth_dev_mac_restore() - if device does not support RTE_ETH_DEV_NOLIVE_MAC_ADDR;
    - restore promiscuous config
    - restore all multicast config

eth_dev_mac_restore() iterates over all known MAC addresses -
stored in rte_eth_dev_data.mac_addrs array - and calls
.mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.

Promiscuous config restore checks if promiscuous mode is enabled or not,
and calls .promiscuous_enable() or .promiscuous_disable() callback.

All multicast config restore checks if all multicast mode is enabled or not,
and calls .allmulticast_enable() or .allmulticast_disable() callback.

Callbacks are called directly in all of these cases, to bypass the checks
for applying the same configuration, which exist in relevant APIs.
Checks are bypassed to force drivers to reapply the configuration.

Let's consider what happens in the following sequence of API calls.

1. rte_eth_dev_configure()
2. rte_eth_tx_queue_setup()
3. rte_eth_rx_queue_setup()
4. rte_eth_promiscuous_enable()
    - Call dev->dev_ops->promiscuous_enable()
    - Stores promiscuous state in dev->data->promiscuous
5. rte_eth_allmulticast_enable()
    - Call dev->dev_ops->allmulticast_enable()
    - Stores allmulticast state in dev->data->allmulticast
6. rte_eth_dev_start()
    - Call dev->dev_ops->dev_start()
    - Call dev->dev_ops->mac_addr_set() - apply default MAC address
    - Call dev->dev_ops->promiscuous_enable()
    - Call dev->dev_ops->allmulticast_enable()

Even though all configuration is available in dev->data after step 5,
library forces reapplying this configuration in step 6.

In mlx5 PMD case all relevant callbacks require communication with the kernel driver,
to configure the device (mlx5 PMD must create/destroy new kernel flow rules and/or
change netdev config).

mlx5 PMD handles applying all configuration in .dev_start(), so the following forced callbacks
force additional communication with the kernel. The same configuration is applied multiple times.

As an optimization, mlx5 PMD could check if a given configuration was applied,
but this would duplicate the functionality of the library
(for example rte_eth_promiscuous_enable() does not call the driver
if dev->data->promiscuous is set).

Question: Since all of the configuration is available before .dev_start() callback is called,
why ethdev library does not expect .dev_start() to take this configuration into account?
In other words, why library has to reapply the configuration?

I could not find any particular reason why configuration restore exists
as part of the process (it was in the initial DPDK commit).

The patches included in this RFC, propose a mechanism which would help with managing which drivers
rely on forceful configuration restore.
Drivers could advertise if forceful configuration restore is needed through
`RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is set, then the driver in question
requires ethdev to forcefully restore configuration.

This way, if we would conclude that it makes sense for .dev_start() to handle all
starting configuration aspects, we could track which drivers still rely on configuration restore.

Dariusz Sosnowski (4):
  ethdev: rework config restore
  ethdev: omit promiscuous config restore if not required
  ethdev: omit all multicast config restore if not required
  ethdev: omit MAC address restore if not required

 lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
 lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
 2 files changed, 52 insertions(+), 5 deletions(-)

--
2.39.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC 1/4] ethdev: rework config restore
  2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
@ 2024-09-18  9:21 ` Dariusz Sosnowski
  2024-09-18  9:21 ` [RFC 2/4] ethdev: omit promiscuous config restore if not required Dariusz Sosnowski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-09-18  9:21 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko; +Cc: dev

Extract promiscuous and all multicast configuration restore
to separate functions.
This change will allow easier integration of disabling
these procedures for supporting PMDs in follow up commits.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/ethdev/rte_ethdev.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index f1c658f49e..362a1883f0 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1648,14 +1648,10 @@ eth_dev_mac_restore(struct rte_eth_dev *dev,
 }
 
 static int
-eth_dev_config_restore(struct rte_eth_dev *dev,
-		struct rte_eth_dev_info *dev_info, uint16_t port_id)
+eth_dev_promiscuous_restore(struct rte_eth_dev *dev, uint16_t port_id)
 {
 	int ret;
 
-	if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
-		eth_dev_mac_restore(dev, dev_info);
-
 	/* replay promiscuous configuration */
 	/*
 	 * use callbacks directly since we don't need port_id check and
@@ -1683,6 +1679,14 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 		}
 	}
 
+	return 0;
+}
+
+static int
+eth_dev_allmulticast_restore(struct rte_eth_dev *dev, uint16_t port_id)
+{
+	int ret;
+
 	/* replay all multicast configuration */
 	/*
 	 * use callbacks directly since we don't need port_id check and
@@ -1713,6 +1717,26 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+eth_dev_config_restore(struct rte_eth_dev *dev,
+		struct rte_eth_dev_info *dev_info, uint16_t port_id)
+{
+	int ret;
+
+	if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
+		eth_dev_mac_restore(dev, dev_info);
+
+	ret = eth_dev_promiscuous_restore(dev, port_id);
+	if (ret != 0)
+		return ret;
+
+	ret = eth_dev_allmulticast_restore(dev, port_id);
+	if (ret != 0)
+		return ret;
+
+	return 0;
+}
+
 int
 rte_eth_dev_start(uint16_t port_id)
 {
-- 
2.39.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC 2/4] ethdev: omit promiscuous config restore if not required
  2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
  2024-09-18  9:21 ` [RFC 1/4] " Dariusz Sosnowski
@ 2024-09-18  9:21 ` Dariusz Sosnowski
  2024-09-18  9:22 ` [RFC 3/4] ethdev: omit all multicast " Dariusz Sosnowski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-09-18  9:21 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko; +Cc: dev

This patch adds a new device flag - RTE_ETH_DEV_PROMISC_FORCE_RESTORE.
If device driver sets this flag, then it requires that ethdev library
forcefully reapplies promiscuous mode configuration,
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/ethdev/rte_ethdev.c | 8 +++++---
 lib/ethdev/rte_ethdev.h | 6 ++++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 362a1883f0..ff08abd566 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1726,9 +1726,11 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 	if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
 		eth_dev_mac_restore(dev, dev_info);
 
-	ret = eth_dev_promiscuous_restore(dev, port_id);
-	if (ret != 0)
-		return ret;
+	if (*dev_info->dev_flags & RTE_ETH_DEV_PROMISC_FORCE_RESTORE) {
+		ret = eth_dev_promiscuous_restore(dev, port_id);
+		if (ret != 0)
+			return ret;
+	}
 
 	ret = eth_dev_allmulticast_restore(dev, port_id);
 	if (ret != 0)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 548fada1c7..0fc23fb924 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2120,6 +2120,12 @@ struct rte_eth_dev_owner {
  * PMDs filling the queue xstats themselves should not set this flag
  */
 #define RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS RTE_BIT32(6)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies promiscuous mode configuration,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_PROMISC_FORCE_RESTORE RTE_BIT32(7)
 /**@}*/
 
 /**
-- 
2.39.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC 3/4] ethdev: omit all multicast config restore if not required
  2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
  2024-09-18  9:21 ` [RFC 1/4] " Dariusz Sosnowski
  2024-09-18  9:21 ` [RFC 2/4] ethdev: omit promiscuous config restore if not required Dariusz Sosnowski
@ 2024-09-18  9:22 ` Dariusz Sosnowski
  2024-09-18  9:22 ` [RFC 4/4] ethdev: omit MAC address " Dariusz Sosnowski
  2024-09-29 23:31 ` [RFC 0/4] ethdev: rework config restore Ferruh Yigit
  4 siblings, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-09-18  9:22 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko; +Cc: dev

This patch adds a new device flag - RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE.
If device driver sets this flag, then it requires that ethdev library
forcefully reapplies allmulticast configration,
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/ethdev/rte_ethdev.c | 8 +++++---
 lib/ethdev/rte_ethdev.h | 6 ++++++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index ff08abd566..a08922a78a 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1732,9 +1732,11 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 			return ret;
 	}
 
-	ret = eth_dev_allmulticast_restore(dev, port_id);
-	if (ret != 0)
-		return ret;
+	if (*dev_info->dev_flags & RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE) {
+		ret = eth_dev_allmulticast_restore(dev, port_id);
+		if (ret != 0)
+			return ret;
+	}
 
 	return 0;
 }
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 0fc23fb924..73405dd17d 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2126,6 +2126,12 @@ struct rte_eth_dev_owner {
  * after driver's dev_start() callback is called.
  */
 #define RTE_ETH_DEV_PROMISC_FORCE_RESTORE RTE_BIT32(7)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies allmulticast configuration,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE RTE_BIT32(8)
 /**@}*/
 
 /**
-- 
2.39.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC 4/4] ethdev: omit MAC address restore if not required
  2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
                   ` (2 preceding siblings ...)
  2024-09-18  9:22 ` [RFC 3/4] ethdev: omit all multicast " Dariusz Sosnowski
@ 2024-09-18  9:22 ` Dariusz Sosnowski
  2024-09-29 23:31 ` [RFC 0/4] ethdev: rework config restore Ferruh Yigit
  4 siblings, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-09-18  9:22 UTC (permalink / raw)
  To: Thomas Monjalon, Ferruh Yigit, Andrew Rybchenko; +Cc: dev

This patch adds a new device flag - RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE.
If device driver sets this flag, then it requires that ethdev library
forcefully reapplies configured MAC addresses,
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not the set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
---
 lib/ethdev/rte_ethdev.c | 3 ++-
 lib/ethdev/rte_ethdev.h | 6 ++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index a08922a78a..e4bb40cad8 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1723,7 +1723,8 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 {
 	int ret;
 
-	if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
+	if ((*dev_info->dev_flags & RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE) &&
+	    !(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
 		eth_dev_mac_restore(dev, dev_info);
 
 	if (*dev_info->dev_flags & RTE_ETH_DEV_PROMISC_FORCE_RESTORE) {
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 73405dd17d..deab07fbc0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2132,6 +2132,12 @@ struct rte_eth_dev_owner {
  * after driver's dev_start() callback is called.
  */
 #define RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE RTE_BIT32(8)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies active MAC addresses,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE RTE_BIT32(9)
 /**@}*/
 
 /**
-- 
2.39.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
                   ` (3 preceding siblings ...)
  2024-09-18  9:22 ` [RFC 4/4] ethdev: omit MAC address " Dariusz Sosnowski
@ 2024-09-29 23:31 ` Ferruh Yigit
  2024-10-04 19:13   ` Dariusz Sosnowski
  4 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2024-09-29 23:31 UTC (permalink / raw)
  To: Dariusz Sosnowski, Thomas Monjalon, Andrew Rybchenko
  Cc: dev, Bruce Richardson, Konstantin Ananyev

On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote:
> Hi all,
> 
> We have been working on optimizing the latency of calls to rte_eth_dev_start(),
> on ports spawned by mlx5 PMD. Most of the work requires changes in the implementation of
> .dev_start() PMD callback, but I also wanted to start a discussion regarding
> configuration restore.
> 
> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> 
> - Before calling it:
>     - eth_dev_mac_restore() - if device supports RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> - After calling it:
>     - eth_dev_mac_restore() - if device does not support RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>     - restore promiscuous config
>     - restore all multicast config
> 
> eth_dev_mac_restore() iterates over all known MAC addresses -
> stored in rte_eth_dev_data.mac_addrs array - and calls
> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> 
> Promiscuous config restore checks if promiscuous mode is enabled or not,
> and calls .promiscuous_enable() or .promiscuous_disable() callback.
> 
> All multicast config restore checks if all multicast mode is enabled or not,
> and calls .allmulticast_enable() or .allmulticast_disable() callback.
> 
> Callbacks are called directly in all of these cases, to bypass the checks
> for applying the same configuration, which exist in relevant APIs.
> Checks are bypassed to force drivers to reapply the configuration.
> 
> Let's consider what happens in the following sequence of API calls.
> 
> 1. rte_eth_dev_configure()
> 2. rte_eth_tx_queue_setup()
> 3. rte_eth_rx_queue_setup()
> 4. rte_eth_promiscuous_enable()
>     - Call dev->dev_ops->promiscuous_enable()
>     - Stores promiscuous state in dev->data->promiscuous
> 5. rte_eth_allmulticast_enable()
>     - Call dev->dev_ops->allmulticast_enable()
>     - Stores allmulticast state in dev->data->allmulticast
> 6. rte_eth_dev_start()
>     - Call dev->dev_ops->dev_start()
>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>     - Call dev->dev_ops->promiscuous_enable()
>     - Call dev->dev_ops->allmulticast_enable()
> 
> Even though all configuration is available in dev->data after step 5,
> library forces reapplying this configuration in step 6.
> 
> In mlx5 PMD case all relevant callbacks require communication with the kernel driver,
> to configure the device (mlx5 PMD must create/destroy new kernel flow rules and/or
> change netdev config).
> 
> mlx5 PMD handles applying all configuration in .dev_start(), so the following forced callbacks
> force additional communication with the kernel. The same configuration is applied multiple times.
> 
> As an optimization, mlx5 PMD could check if a given configuration was applied,
> but this would duplicate the functionality of the library
> (for example rte_eth_promiscuous_enable() does not call the driver
> if dev->data->promiscuous is set).
> 
> Question: Since all of the configuration is available before .dev_start() callback is called,
> why ethdev library does not expect .dev_start() to take this configuration into account?
> In other words, why library has to reapply the configuration?
> 
> I could not find any particular reason why configuration restore exists
> as part of the process (it was in the initial DPDK commit).
> 

My assumption is .dev_stop() cause these values reset in some devices,
so .dev_start() restores them back.
@Bruce or @Konstantin may remember the history.

But I agree this is device specific behavior, and can be managed by what
device requires.


> The patches included in this RFC, propose a mechanism which would help with managing which drivers
> rely on forceful configuration restore.
> Drivers could advertise if forceful configuration restore is needed through
> `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is set, then the driver in question
> requires ethdev to forcefully restore configuration.
> 

OK to use flag for it, but not sure about using 'dev_info->dev_flags'
(RTE_ETH_DEV_*) for this, as this flag is shared with user and this is
all dpdk internal.

What about to have a dedicated flag for it? We can have a dedicated set
of flag values for restore.
Also perhaps we have go into details what needs to be restored after
'stop' and what needs to be restored after 'reset' and use similar
mechanism etc...

> This way, if we would conclude that it makes sense for .dev_start() to handle all
> starting configuration aspects, we could track which drivers still rely on configuration restore.
> 
> Dariusz Sosnowski (4):
>   ethdev: rework config restore
>   ethdev: omit promiscuous config restore if not required
>   ethdev: omit all multicast config restore if not required
>   ethdev: omit MAC address restore if not required
> 
>  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>  2 files changed, 52 insertions(+), 5 deletions(-)
> 
> --
> 2.39.5
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-09-29 23:31 ` [RFC 0/4] ethdev: rework config restore Ferruh Yigit
@ 2024-10-04 19:13   ` Dariusz Sosnowski
  2024-10-07  9:27     ` Konstantin Ananyev
  0 siblings, 1 reply; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-04 19:13 UTC (permalink / raw)
  To: Ferruh Yigit, NBU-Contact-Thomas Monjalon (EXTERNAL), Andrew Rybchenko
  Cc: dev, Bruce Richardson, Konstantin Ananyev

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Monday, September 30, 2024 01:31
> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas
> Monjalon (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>; Konstantin
> Ananyev <konstantin.ananyev@huawei.com>
> Subject: Re: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote:
> > Hi all,
> >
> > We have been working on optimizing the latency of calls to
> > rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
> > requires changes in the implementation of
> > .dev_start() PMD callback, but I also wanted to start a discussion
> > regarding configuration restore.
> >
> > rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> >
> > - Before calling it:
> >     - eth_dev_mac_restore() - if device supports
> > RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> > - After calling it:
> >     - eth_dev_mac_restore() - if device does not support
> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >     - restore promiscuous config
> >     - restore all multicast config
> >
> > eth_dev_mac_restore() iterates over all known MAC addresses - stored
> > in rte_eth_dev_data.mac_addrs array - and calls
> > .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> >
> > Promiscuous config restore checks if promiscuous mode is enabled or
> > not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
> >
> > All multicast config restore checks if all multicast mode is enabled
> > or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
> >
> > Callbacks are called directly in all of these cases, to bypass the
> > checks for applying the same configuration, which exist in relevant APIs.
> > Checks are bypassed to force drivers to reapply the configuration.
> >
> > Let's consider what happens in the following sequence of API calls.
> >
> > 1. rte_eth_dev_configure()
> > 2. rte_eth_tx_queue_setup()
> > 3. rte_eth_rx_queue_setup()
> > 4. rte_eth_promiscuous_enable()
> >     - Call dev->dev_ops->promiscuous_enable()
> >     - Stores promiscuous state in dev->data->promiscuous 5.
> > rte_eth_allmulticast_enable()
> >     - Call dev->dev_ops->allmulticast_enable()
> >     - Stores allmulticast state in dev->data->allmulticast 6.
> > rte_eth_dev_start()
> >     - Call dev->dev_ops->dev_start()
> >     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >     - Call dev->dev_ops->promiscuous_enable()
> >     - Call dev->dev_ops->allmulticast_enable()
> >
> > Even though all configuration is available in dev->data after step 5,
> > library forces reapplying this configuration in step 6.
> >
> > In mlx5 PMD case all relevant callbacks require communication with the
> > kernel driver, to configure the device (mlx5 PMD must create/destroy
> > new kernel flow rules and/or change netdev config).
> >
> > mlx5 PMD handles applying all configuration in .dev_start(), so the
> > following forced callbacks force additional communication with the kernel. The
> same configuration is applied multiple times.
> >
> > As an optimization, mlx5 PMD could check if a given configuration was
> > applied, but this would duplicate the functionality of the library
> > (for example rte_eth_promiscuous_enable() does not call the driver if
> > dev->data->promiscuous is set).
> >
> > Question: Since all of the configuration is available before
> > .dev_start() callback is called, why ethdev library does not expect .dev_start() to
> take this configuration into account?
> > In other words, why library has to reapply the configuration?
> >
> > I could not find any particular reason why configuration restore
> > exists as part of the process (it was in the initial DPDK commit).
> >
> 
> My assumption is .dev_stop() cause these values reset in some devices, so
> .dev_start() restores them back.
> @Bruce or @Konstantin may remember the history.
> 
> But I agree this is device specific behavior, and can be managed by what device
> requires.
> 
> 
> > The patches included in this RFC, propose a mechanism which would help
> > with managing which drivers rely on forceful configuration restore.
> > Drivers could advertise if forceful configuration restore is needed
> > through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
> > set, then the driver in question requires ethdev to forcefully restore
> configuration.
> >
> 
> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
> internal.
> 
> What about to have a dedicated flag for it? We can have a dedicated set of flag
> values for restore.

Agreed. What do you think about the following?

#define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
#define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
#define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)

struct rte_eth_dev_data {
	/* snip */

	uint32_t dev_flags;

	/**
	 * Internal device capabilities, used only by ethdev library.
	 * Certain functionalities provided by the library might enabled/disabled,
	 * based on driver exposing certain capabilities.
	 */
	uint32_t internal_flags;

	/* snip */
};

> Also perhaps we have go into details what needs to be restored after 'stop' and
> what needs to be restored after 'reset' and use similar mechanism etc...

I think we should look into that.
Any 'codification' of semantics between drivers and ethdev library is good in my opinion.

At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
But that's on library side only.

> > This way, if we would conclude that it makes sense for .dev_start() to
> > handle all starting configuration aspects, we could track which drivers still rely
> on configuration restore.
> >
> > Dariusz Sosnowski (4):
> >   ethdev: rework config restore
> >   ethdev: omit promiscuous config restore if not required
> >   ethdev: omit all multicast config restore if not required
> >   ethdev: omit MAC address restore if not required
> >
> >  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
> >  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >  2 files changed, 52 insertions(+), 5 deletions(-)
> >
> > --
> > 2.39.5
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-04 19:13   ` Dariusz Sosnowski
@ 2024-10-07  9:27     ` Konstantin Ananyev
  2024-10-07 22:56       ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ananyev @ 2024-10-07  9:27 UTC (permalink / raw)
  To: Dariusz Sosnowski, Ferruh Yigit,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



> > External email: Use caution opening links or attachments
> >
> >
> > On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote:
> > > Hi all,
> > >
> > > We have been working on optimizing the latency of calls to
> > > rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
> > > requires changes in the implementation of
> > > .dev_start() PMD callback, but I also wanted to start a discussion
> > > regarding configuration restore.
> > >
> > > rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> > >
> > > - Before calling it:
> > >     - eth_dev_mac_restore() - if device supports
> > > RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> > > - After calling it:
> > >     - eth_dev_mac_restore() - if device does not support
> > RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> > >     - restore promiscuous config
> > >     - restore all multicast config
> > >
> > > eth_dev_mac_restore() iterates over all known MAC addresses - stored
> > > in rte_eth_dev_data.mac_addrs array - and calls
> > > .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> > >
> > > Promiscuous config restore checks if promiscuous mode is enabled or
> > > not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
> > >
> > > All multicast config restore checks if all multicast mode is enabled
> > > or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
> > >
> > > Callbacks are called directly in all of these cases, to bypass the
> > > checks for applying the same configuration, which exist in relevant APIs.
> > > Checks are bypassed to force drivers to reapply the configuration.
> > >
> > > Let's consider what happens in the following sequence of API calls.
> > >
> > > 1. rte_eth_dev_configure()
> > > 2. rte_eth_tx_queue_setup()
> > > 3. rte_eth_rx_queue_setup()
> > > 4. rte_eth_promiscuous_enable()
> > >     - Call dev->dev_ops->promiscuous_enable()
> > >     - Stores promiscuous state in dev->data->promiscuous 5.
> > > rte_eth_allmulticast_enable()
> > >     - Call dev->dev_ops->allmulticast_enable()
> > >     - Stores allmulticast state in dev->data->allmulticast 6.
> > > rte_eth_dev_start()
> > >     - Call dev->dev_ops->dev_start()
> > >     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> > >     - Call dev->dev_ops->promiscuous_enable()
> > >     - Call dev->dev_ops->allmulticast_enable()
> > >
> > > Even though all configuration is available in dev->data after step 5,
> > > library forces reapplying this configuration in step 6.
> > >
> > > In mlx5 PMD case all relevant callbacks require communication with the
> > > kernel driver, to configure the device (mlx5 PMD must create/destroy
> > > new kernel flow rules and/or change netdev config).
> > >
> > > mlx5 PMD handles applying all configuration in .dev_start(), so the
> > > following forced callbacks force additional communication with the kernel. The
> > same configuration is applied multiple times.
> > >
> > > As an optimization, mlx5 PMD could check if a given configuration was
> > > applied, but this would duplicate the functionality of the library
> > > (for example rte_eth_promiscuous_enable() does not call the driver if
> > > dev->data->promiscuous is set).
> > >
> > > Question: Since all of the configuration is available before
> > > .dev_start() callback is called, why ethdev library does not expect .dev_start() to
> > take this configuration into account?
> > > In other words, why library has to reapply the configuration?
> > >
> > > I could not find any particular reason why configuration restore
> > > exists as part of the process (it was in the initial DPDK commit).
> > >
> >
> > My assumption is .dev_stop() cause these values reset in some devices, so
> > .dev_start() restores them back.
> > @Bruce or @Konstantin may remember the history.

Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() and 
even dev_start() to make sure that HW is in a clean (known) state.

> >
> > But I agree this is device specific behavior, and can be managed by what device
> > requires.

Probably yes.

> >
> > > The patches included in this RFC, propose a mechanism which would help
> > > with managing which drivers rely on forceful configuration restore.
> > > Drivers could advertise if forceful configuration restore is needed
> > > through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
> > > set, then the driver in question requires ethdev to forcefully restore
> > configuration.
> > >
> >
> > OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> > (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
> > internal.
> >
> > What about to have a dedicated flag for it? We can have a dedicated set of flag
> > values for restore.
> 
> Agreed. What do you think about the following?

Instead of exposing that, can we probably make it transparent to the user
and probably ethdev layer too?
Might be we can move this restoration code into the new ethdev helper function,(ethdevd_user_config_restore()  or so)
that PMD can invoke during its dev_start() if needed?
 
> 
> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)
> 
> struct rte_eth_dev_data {
> 	/* snip */
> 
> 	uint32_t dev_flags;
> 
> 	/**
> 	 * Internal device capabilities, used only by ethdev library.
> 	 * Certain functionalities provided by the library might enabled/disabled,
> 	 * based on driver exposing certain capabilities.
> 	 */
> 	uint32_t internal_flags;
> 
> 	/* snip */
> };
> 
> > Also perhaps we have go into details what needs to be restored after 'stop' and
> > what needs to be restored after 'reset' and use similar mechanism etc...
> 
> I think we should look into that.
> Any 'codification' of semantics between drivers and ethdev library is good in my opinion.
> 
> At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
> But that's on library side only.
> 
> > > This way, if we would conclude that it makes sense for .dev_start() to
> > > handle all starting configuration aspects, we could track which drivers still rely
> > on configuration restore.
> > >
> > > Dariusz Sosnowski (4):
> > >   ethdev: rework config restore
> > >   ethdev: omit promiscuous config restore if not required
> > >   ethdev: omit all multicast config restore if not required
> > >   ethdev: omit MAC address restore if not required
> > >
> > >  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
> > >  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> > >  2 files changed, 52 insertions(+), 5 deletions(-)
> > >
> > > --
> > > 2.39.5
> > >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-07  9:27     ` Konstantin Ananyev
@ 2024-10-07 22:56       ` Ferruh Yigit
  2024-10-08 17:21         ` Konstantin Ananyev
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-07 22:56 UTC (permalink / raw)
  To: Konstantin Ananyev, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/7/2024 10:27 AM, Konstantin Ananyev wrote:
> 
> 
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 9/18/2024 10:21 AM, Dariusz Sosnowski wrote:
>>>> Hi all,
>>>>
>>>> We have been working on optimizing the latency of calls to
>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
>>>> requires changes in the implementation of
>>>> .dev_start() PMD callback, but I also wanted to start a discussion
>>>> regarding configuration restore.
>>>>
>>>> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
>>>>
>>>> - Before calling it:
>>>>     - eth_dev_mac_restore() - if device supports
>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>> - After calling it:
>>>>     - eth_dev_mac_restore() - if device does not support
>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>     - restore promiscuous config
>>>>     - restore all multicast config
>>>>
>>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored
>>>> in rte_eth_dev_data.mac_addrs array - and calls
>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
>>>>
>>>> Promiscuous config restore checks if promiscuous mode is enabled or
>>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
>>>>
>>>> All multicast config restore checks if all multicast mode is enabled
>>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
>>>>
>>>> Callbacks are called directly in all of these cases, to bypass the
>>>> checks for applying the same configuration, which exist in relevant APIs.
>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>
>>>> Let's consider what happens in the following sequence of API calls.
>>>>
>>>> 1. rte_eth_dev_configure()
>>>> 2. rte_eth_tx_queue_setup()
>>>> 3. rte_eth_rx_queue_setup()
>>>> 4. rte_eth_promiscuous_enable()
>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>> rte_eth_allmulticast_enable()
>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>> rte_eth_dev_start()
>>>>     - Call dev->dev_ops->dev_start()
>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>
>>>> Even though all configuration is available in dev->data after step 5,
>>>> library forces reapplying this configuration in step 6.
>>>>
>>>> In mlx5 PMD case all relevant callbacks require communication with the
>>>> kernel driver, to configure the device (mlx5 PMD must create/destroy
>>>> new kernel flow rules and/or change netdev config).
>>>>
>>>> mlx5 PMD handles applying all configuration in .dev_start(), so the
>>>> following forced callbacks force additional communication with the kernel. The
>>> same configuration is applied multiple times.
>>>>
>>>> As an optimization, mlx5 PMD could check if a given configuration was
>>>> applied, but this would duplicate the functionality of the library
>>>> (for example rte_eth_promiscuous_enable() does not call the driver if
>>>> dev->data->promiscuous is set).
>>>>
>>>> Question: Since all of the configuration is available before
>>>> .dev_start() callback is called, why ethdev library does not expect .dev_start() to
>>> take this configuration into account?
>>>> In other words, why library has to reapply the configuration?
>>>>
>>>> I could not find any particular reason why configuration restore
>>>> exists as part of the process (it was in the initial DPDK commit).
>>>>
>>>
>>> My assumption is .dev_stop() cause these values reset in some devices, so
>>> .dev_start() restores them back.
>>> @Bruce or @Konstantin may remember the history.
> 
> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() and 
> even dev_start() to make sure that HW is in a clean (known) state.
> 
>>>
>>> But I agree this is device specific behavior, and can be managed by what device
>>> requires.
> 
> Probably yes.
> 
>>>
>>>> The patches included in this RFC, propose a mechanism which would help
>>>> with managing which drivers rely on forceful configuration restore.
>>>> Drivers could advertise if forceful configuration restore is needed
>>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
>>>> set, then the driver in question requires ethdev to forcefully restore
>>> configuration.
>>>>
>>>
>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
>>> internal.
>>>
>>> What about to have a dedicated flag for it? We can have a dedicated set of flag
>>> values for restore.
>>
>> Agreed. What do you think about the following?
> 
> Instead of exposing that, can we probably make it transparent to the user
> and probably ethdev layer too?
>

+1 to make it transparent to user, but not sure if we can make it
transparent to ethdev layer.

Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing
and open to interpretation what to use it for and by time become source
of defect.
Instead what do you think to have a separate, dedicated data struct for it?


> Might be we can move this restoration code into the new ethdev helper function,(ethdevd_user_config_restore()  or so)
> that PMD can invoke during its dev_start() if needed?
>  
>>
>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)
>>
>> struct rte_eth_dev_data {
>> 	/* snip */
>>
>> 	uint32_t dev_flags;
>>
>> 	/**
>> 	 * Internal device capabilities, used only by ethdev library.
>> 	 * Certain functionalities provided by the library might enabled/disabled,
>> 	 * based on driver exposing certain capabilities.
>> 	 */
>> 	uint32_t internal_flags;
>>
>> 	/* snip */
>> };
>>
>>> Also perhaps we have go into details what needs to be restored after 'stop' and
>>> what needs to be restored after 'reset' and use similar mechanism etc...
>>
>> I think we should look into that.
>> Any 'codification' of semantics between drivers and ethdev library is good in my opinion.
>>
>> At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
>> But that's on library side only.
>>
>>>> This way, if we would conclude that it makes sense for .dev_start() to
>>>> handle all starting configuration aspects, we could track which drivers still rely
>>> on configuration restore.
>>>>
>>>> Dariusz Sosnowski (4):
>>>>   ethdev: rework config restore
>>>>   ethdev: omit promiscuous config restore if not required
>>>>   ethdev: omit all multicast config restore if not required
>>>>   ethdev: omit MAC address restore if not required
>>>>
>>>>  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>
>>>> --
>>>> 2.39.5
>>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-07 22:56       ` Ferruh Yigit
@ 2024-10-08 17:21         ` Konstantin Ananyev
  2024-10-09  1:07           ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ananyev @ 2024-10-08 17:21 UTC (permalink / raw)
  To: Ferruh Yigit, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



> >>>> We have been working on optimizing the latency of calls to
> >>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
> >>>> requires changes in the implementation of
> >>>> .dev_start() PMD callback, but I also wanted to start a discussion
> >>>> regarding configuration restore.
> >>>>
> >>>> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> >>>>
> >>>> - Before calling it:
> >>>>     - eth_dev_mac_restore() - if device supports
> >>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>> - After calling it:
> >>>>     - eth_dev_mac_restore() - if device does not support
> >>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>     - restore promiscuous config
> >>>>     - restore all multicast config
> >>>>
> >>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored
> >>>> in rte_eth_dev_data.mac_addrs array - and calls
> >>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> >>>>
> >>>> Promiscuous config restore checks if promiscuous mode is enabled or
> >>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
> >>>>
> >>>> All multicast config restore checks if all multicast mode is enabled
> >>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
> >>>>
> >>>> Callbacks are called directly in all of these cases, to bypass the
> >>>> checks for applying the same configuration, which exist in relevant APIs.
> >>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>
> >>>> Let's consider what happens in the following sequence of API calls.
> >>>>
> >>>> 1. rte_eth_dev_configure()
> >>>> 2. rte_eth_tx_queue_setup()
> >>>> 3. rte_eth_rx_queue_setup()
> >>>> 4. rte_eth_promiscuous_enable()
> >>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>> rte_eth_allmulticast_enable()
> >>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>> rte_eth_dev_start()
> >>>>     - Call dev->dev_ops->dev_start()
> >>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>
> >>>> Even though all configuration is available in dev->data after step 5,
> >>>> library forces reapplying this configuration in step 6.
> >>>>
> >>>> In mlx5 PMD case all relevant callbacks require communication with the
> >>>> kernel driver, to configure the device (mlx5 PMD must create/destroy
> >>>> new kernel flow rules and/or change netdev config).
> >>>>
> >>>> mlx5 PMD handles applying all configuration in .dev_start(), so the
> >>>> following forced callbacks force additional communication with the kernel. The
> >>> same configuration is applied multiple times.
> >>>>
> >>>> As an optimization, mlx5 PMD could check if a given configuration was
> >>>> applied, but this would duplicate the functionality of the library
> >>>> (for example rte_eth_promiscuous_enable() does not call the driver if
> >>>> dev->data->promiscuous is set).
> >>>>
> >>>> Question: Since all of the configuration is available before
> >>>> .dev_start() callback is called, why ethdev library does not expect .dev_start() to
> >>> take this configuration into account?
> >>>> In other words, why library has to reapply the configuration?
> >>>>
> >>>> I could not find any particular reason why configuration restore
> >>>> exists as part of the process (it was in the initial DPDK commit).
> >>>>
> >>>
> >>> My assumption is .dev_stop() cause these values reset in some devices, so
> >>> .dev_start() restores them back.
> >>> @Bruce or @Konstantin may remember the history.
> >
> > Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() and
> > even dev_start() to make sure that HW is in a clean (known) state.
> >
> >>>
> >>> But I agree this is device specific behavior, and can be managed by what device
> >>> requires.
> >
> > Probably yes.
> >
> >>>
> >>>> The patches included in this RFC, propose a mechanism which would help
> >>>> with managing which drivers rely on forceful configuration restore.
> >>>> Drivers could advertise if forceful configuration restore is needed
> >>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
> >>>> set, then the driver in question requires ethdev to forcefully restore
> >>> configuration.
> >>>>
> >>>
> >>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
> >>> internal.
> >>>
> >>> What about to have a dedicated flag for it? We can have a dedicated set of flag
> >>> values for restore.
> >>
> >> Agreed. What do you think about the following?
> >
> > Instead of exposing that, can we probably make it transparent to the user
> > and probably ethdev layer too?
> >
> 
> +1 to make it transparent to user, but not sure if we can make it
> transparent to ethdev layer.

Just to be clear:
Let say, using example from above: 

 rte_eth_dev_start()
     - Call dev->dev_ops->dev_start()
     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
     - Call dev->dev_ops->promiscuous_enable()
     - Call dev->dev_ops->allmulticast_enable()

We probably can introduce ethdev internal function (still visible to PMDs)
that would do last 3 steps:
ethdev_replay_user_conf(...)
     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
     - Call dev->dev_ops->promiscuous_enable()
     - Call dev->dev_ops->allmulticast_enable() 

And let PMD itself to decide does it needs to call it at dev_start() or not.
So it will become:
rte_eth_dev_start()
     - Call dev->dev_ops->dev_start()
	-Call ethdev_replay_user_conf(.)
     		- Call dev->dev_ops->mac_addr_set() - apply default MAC address
     		- Call dev->dev_ops->promiscuous_enable()
     		-Call dev->dev_ops->allmulticast_enable()

For PMDs that do need to restore user provided config
And 
rte_eth_dev_start()
     - Call dev->dev_ops->dev_start()

For those who do not.

> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing
> and open to interpretation what to use it for and by time become source
> of defect.

Yes, same thoughts.

> Instead what do you think to have a separate, dedicated data struct for it?

Hmm... not sure I understood you here...

> 
> > Might be we can move this restoration code into the new ethdev helper function,(ethdevd_user_config_restore()  or so)
> > that PMD can invoke during its dev_start() if needed?
> >
> >>
> >> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
> >> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
> >> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)
> >>
> >> struct rte_eth_dev_data {
> >> 	/* snip */
> >>
> >> 	uint32_t dev_flags;
> >>
> >> 	/**
> >> 	 * Internal device capabilities, used only by ethdev library.
> >> 	 * Certain functionalities provided by the library might enabled/disabled,
> >> 	 * based on driver exposing certain capabilities.
> >> 	 */
> >> 	uint32_t internal_flags;
> >>
> >> 	/* snip */
> >> };
> >>
> >>> Also perhaps we have go into details what needs to be restored after 'stop' and
> >>> what needs to be restored after 'reset' and use similar mechanism etc...
> >>
> >> I think we should look into that.
> >> Any 'codification' of semantics between drivers and ethdev library is good in my opinion.
> >>
> >> At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
> >> But that's on library side only.
> >>
> >>>> This way, if we would conclude that it makes sense for .dev_start() to
> >>>> handle all starting configuration aspects, we could track which drivers still rely
> >>> on configuration restore.
> >>>>
> >>>> Dariusz Sosnowski (4):
> >>>>   ethdev: rework config restore
> >>>>   ethdev: omit promiscuous config restore if not required
> >>>>   ethdev: omit all multicast config restore if not required
> >>>>   ethdev: omit MAC address restore if not required
> >>>>
> >>>>  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
> >>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>
> >>>> --
> >>>> 2.39.5
> >>>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-08 17:21         ` Konstantin Ananyev
@ 2024-10-09  1:07           ` Ferruh Yigit
  2024-10-09 10:54             ` Konstantin Ananyev
  2024-10-09 16:18             ` Dariusz Sosnowski
  0 siblings, 2 replies; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-09  1:07 UTC (permalink / raw)
  To: Konstantin Ananyev, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> 
> 
>>>>>> We have been working on optimizing the latency of calls to
>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
>>>>>> requires changes in the implementation of
>>>>>> .dev_start() PMD callback, but I also wanted to start a discussion
>>>>>> regarding configuration restore.
>>>>>>
>>>>>> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
>>>>>>
>>>>>> - Before calling it:
>>>>>>     - eth_dev_mac_restore() - if device supports
>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>> - After calling it:
>>>>>>     - eth_dev_mac_restore() - if device does not support
>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>     - restore promiscuous config
>>>>>>     - restore all multicast config
>>>>>>
>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored
>>>>>> in rte_eth_dev_data.mac_addrs array - and calls
>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
>>>>>>
>>>>>> Promiscuous config restore checks if promiscuous mode is enabled or
>>>>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
>>>>>>
>>>>>> All multicast config restore checks if all multicast mode is enabled
>>>>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
>>>>>>
>>>>>> Callbacks are called directly in all of these cases, to bypass the
>>>>>> checks for applying the same configuration, which exist in relevant APIs.
>>>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>>>
>>>>>> Let's consider what happens in the following sequence of API calls.
>>>>>>
>>>>>> 1. rte_eth_dev_configure()
>>>>>> 2. rte_eth_tx_queue_setup()
>>>>>> 3. rte_eth_rx_queue_setup()
>>>>>> 4. rte_eth_promiscuous_enable()
>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>>>> rte_eth_allmulticast_enable()
>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>>>> rte_eth_dev_start()
>>>>>>     - Call dev->dev_ops->dev_start()
>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>
>>>>>> Even though all configuration is available in dev->data after step 5,
>>>>>> library forces reapplying this configuration in step 6.
>>>>>>
>>>>>> In mlx5 PMD case all relevant callbacks require communication with the
>>>>>> kernel driver, to configure the device (mlx5 PMD must create/destroy
>>>>>> new kernel flow rules and/or change netdev config).
>>>>>>
>>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so the
>>>>>> following forced callbacks force additional communication with the kernel. The
>>>>> same configuration is applied multiple times.
>>>>>>
>>>>>> As an optimization, mlx5 PMD could check if a given configuration was
>>>>>> applied, but this would duplicate the functionality of the library
>>>>>> (for example rte_eth_promiscuous_enable() does not call the driver if
>>>>>> dev->data->promiscuous is set).
>>>>>>
>>>>>> Question: Since all of the configuration is available before
>>>>>> .dev_start() callback is called, why ethdev library does not expect .dev_start() to
>>>>> take this configuration into account?
>>>>>> In other words, why library has to reapply the configuration?
>>>>>>
>>>>>> I could not find any particular reason why configuration restore
>>>>>> exists as part of the process (it was in the initial DPDK commit).
>>>>>>
>>>>>
>>>>> My assumption is .dev_stop() cause these values reset in some devices, so
>>>>> .dev_start() restores them back.
>>>>> @Bruce or @Konstantin may remember the history.
>>>
>>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() and
>>> even dev_start() to make sure that HW is in a clean (known) state.
>>>
>>>>>
>>>>> But I agree this is device specific behavior, and can be managed by what device
>>>>> requires.
>>>
>>> Probably yes.
>>>
>>>>>
>>>>>> The patches included in this RFC, propose a mechanism which would help
>>>>>> with managing which drivers rely on forceful configuration restore.
>>>>>> Drivers could advertise if forceful configuration restore is needed
>>>>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
>>>>>> set, then the driver in question requires ethdev to forcefully restore
>>>>> configuration.
>>>>>>
>>>>>
>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
>>>>> internal.
>>>>>
>>>>> What about to have a dedicated flag for it? We can have a dedicated set of flag
>>>>> values for restore.
>>>>
>>>> Agreed. What do you think about the following?
>>>
>>> Instead of exposing that, can we probably make it transparent to the user
>>> and probably ethdev layer too?
>>>
>>
>> +1 to make it transparent to user, but not sure if we can make it
>> transparent to ethdev layer.
> 
> Just to be clear:
> Let say, using example from above: 
> 
>  rte_eth_dev_start()
>      - Call dev->dev_ops->dev_start()
>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>      - Call dev->dev_ops->promiscuous_enable()
>      - Call dev->dev_ops->allmulticast_enable()
> 
> We probably can introduce ethdev internal function (still visible to PMDs)
> that would do last 3 steps:
> ethdev_replay_user_conf(...)
>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>      - Call dev->dev_ops->promiscuous_enable()
>      - Call dev->dev_ops->allmulticast_enable() 
> 
> And let PMD itself to decide does it needs to call it at dev_start() or not.
> So it will become:
> rte_eth_dev_start()
>      - Call dev->dev_ops->dev_start()
> 	-Call ethdev_replay_user_conf(.)
>      		- Call dev->dev_ops->mac_addr_set() - apply default MAC address
>      		- Call dev->dev_ops->promiscuous_enable()
>      		-Call dev->dev_ops->allmulticast_enable()
> 
> For PMDs that do need to restore user provided config
> And 
> rte_eth_dev_start()
>      - Call dev->dev_ops->dev_start()
> 
> For those who do not.
> 

OK, got it what you mean.
Pushing restore functionality to PMDs works, but this may be doing
redundant work on each PMD.

Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
restore and common code in ethdev does the work.
My below dedicated data struct comment is to have this flag in a new
struct, overall like following:

rte_eth_dev_start()
   - Call dev->dev_ops->dev_start()
   - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
   - if (flags & MAC) dev->dev_ops->mac_addr_set()
   - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
   - ...

So PMDs only will provide what to restore with an internal API and
common ethdev layer will restore it.
If no restore required PMD may not implement .get_restore_flags() at all.

Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided to
internal API to get what to restore in different states...

>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing
>> and open to interpretation what to use it for and by time become source
>> of defect.
> 
> Yes, same thoughts.
> 
>> Instead what do you think to have a separate, dedicated data struct for it?
> 
> Hmm... not sure I understood you here...
> 
>>
>>> Might be we can move this restoration code into the new ethdev helper function,(ethdevd_user_config_restore()  or so)
>>> that PMD can invoke during its dev_start() if needed?
>>>
>>>>
>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)
>>>>
>>>> struct rte_eth_dev_data {
>>>> 	/* snip */
>>>>
>>>> 	uint32_t dev_flags;
>>>>
>>>> 	/**
>>>> 	 * Internal device capabilities, used only by ethdev library.
>>>> 	 * Certain functionalities provided by the library might enabled/disabled,
>>>> 	 * based on driver exposing certain capabilities.
>>>> 	 */
>>>> 	uint32_t internal_flags;
>>>>
>>>> 	/* snip */
>>>> };
>>>>
>>>>> Also perhaps we have go into details what needs to be restored after 'stop' and
>>>>> what needs to be restored after 'reset' and use similar mechanism etc...
>>>>
>>>> I think we should look into that.
>>>> Any 'codification' of semantics between drivers and ethdev library is good in my opinion.
>>>>
>>>> At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
>>>> But that's on library side only.
>>>>
>>>>>> This way, if we would conclude that it makes sense for .dev_start() to
>>>>>> handle all starting configuration aspects, we could track which drivers still rely
>>>>> on configuration restore.
>>>>>>
>>>>>> Dariusz Sosnowski (4):
>>>>>>   ethdev: rework config restore
>>>>>>   ethdev: omit promiscuous config restore if not required
>>>>>>   ethdev: omit all multicast config restore if not required
>>>>>>   ethdev: omit MAC address restore if not required
>>>>>>
>>>>>>  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> --
>>>>>> 2.39.5
>>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-09  1:07           ` Ferruh Yigit
@ 2024-10-09 10:54             ` Konstantin Ananyev
  2024-10-09 16:18             ` Dariusz Sosnowski
  1 sibling, 0 replies; 23+ messages in thread
From: Konstantin Ananyev @ 2024-10-09 10:54 UTC (permalink / raw)
  To: Ferruh Yigit, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



> >>>>>> We have been working on optimizing the latency of calls to
> >>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
> >>>>>> requires changes in the implementation of
> >>>>>> .dev_start() PMD callback, but I also wanted to start a discussion
> >>>>>> regarding configuration restore.
> >>>>>>
> >>>>>> rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> >>>>>>
> >>>>>> - Before calling it:
> >>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>> - After calling it:
> >>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>     - restore promiscuous config
> >>>>>>     - restore all multicast config
> >>>>>>
> >>>>>> eth_dev_mac_restore() iterates over all known MAC addresses - stored
> >>>>>> in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> >>>>>>
> >>>>>> Promiscuous config restore checks if promiscuous mode is enabled or
> >>>>>> not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
> >>>>>>
> >>>>>> All multicast config restore checks if all multicast mode is enabled
> >>>>>> or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
> >>>>>>
> >>>>>> Callbacks are called directly in all of these cases, to bypass the
> >>>>>> checks for applying the same configuration, which exist in relevant APIs.
> >>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>
> >>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>
> >>>>>> 1. rte_eth_dev_configure()
> >>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>> rte_eth_allmulticast_enable()
> >>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>> rte_eth_dev_start()
> >>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>
> >>>>>> Even though all configuration is available in dev->data after step 5,
> >>>>>> library forces reapplying this configuration in step 6.
> >>>>>>
> >>>>>> In mlx5 PMD case all relevant callbacks require communication with the
> >>>>>> kernel driver, to configure the device (mlx5 PMD must create/destroy
> >>>>>> new kernel flow rules and/or change netdev config).
> >>>>>>
> >>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so the
> >>>>>> following forced callbacks force additional communication with the kernel. The
> >>>>> same configuration is applied multiple times.
> >>>>>>
> >>>>>> As an optimization, mlx5 PMD could check if a given configuration was
> >>>>>> applied, but this would duplicate the functionality of the library
> >>>>>> (for example rte_eth_promiscuous_enable() does not call the driver if
> >>>>>> dev->data->promiscuous is set).
> >>>>>>
> >>>>>> Question: Since all of the configuration is available before
> >>>>>> .dev_start() callback is called, why ethdev library does not expect .dev_start() to
> >>>>> take this configuration into account?
> >>>>>> In other words, why library has to reapply the configuration?
> >>>>>>
> >>>>>> I could not find any particular reason why configuration restore
> >>>>>> exists as part of the process (it was in the initial DPDK commit).
> >>>>>>
> >>>>>
> >>>>> My assumption is .dev_stop() cause these values reset in some devices, so
> >>>>> .dev_start() restores them back.
> >>>>> @Bruce or @Konstantin may remember the history.
> >>>
> >>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad dec_stop() and
> >>> even dev_start() to make sure that HW is in a clean (known) state.
> >>>
> >>>>>
> >>>>> But I agree this is device specific behavior, and can be managed by what device
> >>>>> requires.
> >>>
> >>> Probably yes.
> >>>
> >>>>>
> >>>>>> The patches included in this RFC, propose a mechanism which would help
> >>>>>> with managing which drivers rely on forceful configuration restore.
> >>>>>> Drivers could advertise if forceful configuration restore is needed
> >>>>>> through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
> >>>>>> set, then the driver in question requires ethdev to forcefully restore
> >>>>> configuration.
> >>>>>>
> >>>>>
> >>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is all dpdk
> >>>>> internal.
> >>>>>
> >>>>> What about to have a dedicated flag for it? We can have a dedicated set of flag
> >>>>> values for restore.
> >>>>
> >>>> Agreed. What do you think about the following?
> >>>
> >>> Instead of exposing that, can we probably make it transparent to the user
> >>> and probably ethdev layer too?
> >>>
> >>
> >> +1 to make it transparent to user, but not sure if we can make it
> >> transparent to ethdev layer.
> >
> > Just to be clear:
> > Let say, using example from above:
> >
> >  rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> >      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >      - Call dev->dev_ops->promiscuous_enable()
> >      - Call dev->dev_ops->allmulticast_enable()
> >
> > We probably can introduce ethdev internal function (still visible to PMDs)
> > that would do last 3 steps:
> > ethdev_replay_user_conf(...)
> >      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >      - Call dev->dev_ops->promiscuous_enable()
> >      - Call dev->dev_ops->allmulticast_enable()
> >
> > And let PMD itself to decide does it needs to call it at dev_start() or not.
> > So it will become:
> > rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> > 	-Call ethdev_replay_user_conf(.)
> >      		- Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >      		- Call dev->dev_ops->promiscuous_enable()
> >      		-Call dev->dev_ops->allmulticast_enable()
> >
> > For PMDs that do need to restore user provided config
> > And
> > rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> >
> > For those who do not.
> >
> 
> OK, got it what you mean.
> Pushing restore functionality to PMDs works, but this may be doing
> redundant work on each PMD.

Yes, every PMD will have to be updated with that way.
 
> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
> restore and common code in ethdev does the work.
> My below dedicated data struct comment is to have this flag in a new
> struct, overall like following:
> 
> rte_eth_dev_start()
>    - Call dev->dev_ops->dev_start()
>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>    - ...
> 
> So PMDs only will provide what to restore with an internal API and
> common ethdev layer will restore it.
> If no restore required PMD may not implement .get_restore_flags() at all.

Makes sense to me...
Probably less disruptive would be to have a dev_ops function
that would tell what settings don't need to be replayed by ehtdev layer. 
 
> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided to
> internal API to get what to restore in different states...
> 
> >> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be confusing
> >> and open to interpretation what to use it for and by time become source
> >> of defect.
> >
> > Yes, same thoughts.
> >
> >> Instead what do you think to have a separate, dedicated data struct for it?
> >
> > Hmm... not sure I understood you here...
> >
> >>
> >>> Might be we can move this restoration code into the new ethdev helper function,(ethdevd_user_config_restore()  or so)
> >>> that PMD can invoke during its dev_start() if needed?
> >>>
> >>>>
> >>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
> >>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
> >>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE RTE_BIT32(2)
> >>>>
> >>>> struct rte_eth_dev_data {
> >>>> 	/* snip */
> >>>>
> >>>> 	uint32_t dev_flags;
> >>>>
> >>>> 	/**
> >>>> 	 * Internal device capabilities, used only by ethdev library.
> >>>> 	 * Certain functionalities provided by the library might enabled/disabled,
> >>>> 	 * based on driver exposing certain capabilities.
> >>>> 	 */
> >>>> 	uint32_t internal_flags;
> >>>>
> >>>> 	/* snip */
> >>>> };
> >>>>
> >>>>> Also perhaps we have go into details what needs to be restored after 'stop' and
> >>>>> what needs to be restored after 'reset' and use similar mechanism etc...
> >>>>
> >>>> I think we should look into that.
> >>>> Any 'codification' of semantics between drivers and ethdev library is good in my opinion.
> >>>>
> >>>> At least right now, ethdev does not change any configuration in 'stop' and 'reset' from what I see.
> >>>> But that's on library side only.
> >>>>
> >>>>>> This way, if we would conclude that it makes sense for .dev_start() to
> >>>>>> handle all starting configuration aspects, we could track which drivers still rely
> >>>>> on configuration restore.
> >>>>>>
> >>>>>> Dariusz Sosnowski (4):
> >>>>>>   ethdev: rework config restore
> >>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>
> >>>>>>  lib/ethdev/rte_ethdev.c | 39 ++++++++++++++++++++++++++++++++++-----
> >>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>
> >>>>>> --
> >>>>>> 2.39.5
> >>>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-09  1:07           ` Ferruh Yigit
  2024-10-09 10:54             ` Konstantin Ananyev
@ 2024-10-09 16:18             ` Dariusz Sosnowski
  2024-10-09 23:16               ` Ferruh Yigit
  1 sibling, 1 reply; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-09 16:18 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Wednesday, October 9, 2024 03:08
> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> Subject: Re: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >
> >
> >>>>>> We have been working on optimizing the latency of calls to
> >>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the
> >>>>>> work requires changes in the implementation of
> >>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>> discussion regarding configuration restore.
> >>>>>>
> >>>>>> rte_eth_dev_start() does a few things on top of calling .dev_start()
> callback:
> >>>>>>
> >>>>>> - Before calling it:
> >>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>> - After calling it:
> >>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>     - restore promiscuous config
> >>>>>>     - restore all multicast config
> >>>>>>
> >>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
> >>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC
> addresses.
> >>>>>>
> >>>>>> Promiscuous config restore checks if promiscuous mode is enabled
> >>>>>> or not, and calls .promiscuous_enable() or .promiscuous_disable()
> callback.
> >>>>>>
> >>>>>> All multicast config restore checks if all multicast mode is
> >>>>>> enabled or not, and calls .allmulticast_enable() or .allmulticast_disable()
> callback.
> >>>>>>
> >>>>>> Callbacks are called directly in all of these cases, to bypass
> >>>>>> the checks for applying the same configuration, which exist in relevant
> APIs.
> >>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>
> >>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>
> >>>>>> 1. rte_eth_dev_configure()
> >>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>> rte_eth_allmulticast_enable()
> >>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>> rte_eth_dev_start()
> >>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>
> >>>>>> Even though all configuration is available in dev->data after
> >>>>>> step 5, library forces reapplying this configuration in step 6.
> >>>>>>
> >>>>>> In mlx5 PMD case all relevant callbacks require communication
> >>>>>> with the kernel driver, to configure the device (mlx5 PMD must
> >>>>>> create/destroy new kernel flow rules and/or change netdev config).
> >>>>>>
> >>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so
> >>>>>> the following forced callbacks force additional communication
> >>>>>> with the kernel. The
> >>>>> same configuration is applied multiple times.
> >>>>>>
> >>>>>> As an optimization, mlx5 PMD could check if a given configuration
> >>>>>> was applied, but this would duplicate the functionality of the
> >>>>>> library (for example rte_eth_promiscuous_enable() does not call
> >>>>>> the driver if
> >>>>>> dev->data->promiscuous is set).
> >>>>>>
> >>>>>> Question: Since all of the configuration is available before
> >>>>>> .dev_start() callback is called, why ethdev library does not
> >>>>>> expect .dev_start() to
> >>>>> take this configuration into account?
> >>>>>> In other words, why library has to reapply the configuration?
> >>>>>>
> >>>>>> I could not find any particular reason why configuration restore
> >>>>>> exists as part of the process (it was in the initial DPDK commit).
> >>>>>>
> >>>>>
> >>>>> My assumption is .dev_stop() cause these values reset in some
> >>>>> devices, so
> >>>>> .dev_start() restores them back.
> >>>>> @Bruce or @Konstantin may remember the history.
> >>>
> >>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad
> >>> dec_stop() and even dev_start() to make sure that HW is in a clean (known)
> state.
> >>>
> >>>>>
> >>>>> But I agree this is device specific behavior, and can be managed
> >>>>> by what device requires.
> >>>
> >>> Probably yes.
> >>>
> >>>>>
> >>>>>> The patches included in this RFC, propose a mechanism which would
> >>>>>> help with managing which drivers rely on forceful configuration restore.
> >>>>>> Drivers could advertise if forceful configuration restore is
> >>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this
> >>>>>> flag is set, then the driver in question requires ethdev to
> >>>>>> forcefully restore
> >>>>> configuration.
> >>>>>>
> >>>>>
> >>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
> >>>>> this is all dpdk internal.
> >>>>>
> >>>>> What about to have a dedicated flag for it? We can have a
> >>>>> dedicated set of flag values for restore.
> >>>>
> >>>> Agreed. What do you think about the following?
> >>>
> >>> Instead of exposing that, can we probably make it transparent to the
> >>> user and probably ethdev layer too?
> >>>
> >>
> >> +1 to make it transparent to user, but not sure if we can make it
> >> transparent to ethdev layer.
> >
> > Just to be clear:
> > Let say, using example from above:
> >
> >  rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> >      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >      - Call dev->dev_ops->promiscuous_enable()
> >      - Call dev->dev_ops->allmulticast_enable()
> >
> > We probably can introduce ethdev internal function (still visible to
> > PMDs) that would do last 3 steps:
> > ethdev_replay_user_conf(...)
> >      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >      - Call dev->dev_ops->promiscuous_enable()
> >      - Call dev->dev_ops->allmulticast_enable()
> >
> > And let PMD itself to decide does it needs to call it at dev_start() or not.
> > So it will become:
> > rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> >       -Call ethdev_replay_user_conf(.)
> >               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >               - Call dev->dev_ops->promiscuous_enable()
> >               -Call dev->dev_ops->allmulticast_enable()
> >
> > For PMDs that do need to restore user provided config And
> > rte_eth_dev_start()
> >      - Call dev->dev_ops->dev_start()
> >
> > For those who do not.
> >
> 
> OK, got it what you mean.
> Pushing restore functionality to PMDs works, but this may be doing redundant
> work on each PMD.
> 
> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to restore and
> common code in ethdev does the work.
> My below dedicated data struct comment is to have this flag in a new struct,
> overall like following:
> 
> rte_eth_dev_start()
>    - Call dev->dev_ops->dev_start()
>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>    - ...

Could you please explain what is the benefit of exposing flags through dev_ops callback vs a dedicated flags field in rte_eth_dev_data?
In both solutions:
- config restore is transparent to the user,
- drivers can omit config restore (either by not implementing the callback or not providing the flags),
- an ABI change is introduced (not a huge concern, at least for 24.11).

I understand that my initial proposal with "internal_flags" was too vague,
but renaming and splitting this field into:

- dev_start_restore_flags
- dev_reset_restore_flags
- and so on...

seems sufficient, at least in my opinion.

> 
> So PMDs only will provide what to restore with an internal API and common
> ethdev layer will restore it.
> If no restore required PMD may not implement .get_restore_flags() at all.
> 
> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided to internal
> API to get what to restore in different states...
> 
> >> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >> confusing and open to interpretation what to use it for and by time
> >> become source of defect.
> >
> > Yes, same thoughts.
> >
> >> Instead what do you think to have a separate, dedicated data struct for it?
> >
> > Hmm... not sure I understood you here...
> >
> >>
> >>> Might be we can move this restoration code into the new ethdev
> >>> helper function,(ethdevd_user_config_restore()  or so) that PMD can invoke
> during its dev_start() if needed?
> >>>
> >>>>
> >>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
> >>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
> >>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> RTE_BIT32(2)
> >>>>
> >>>> struct rte_eth_dev_data {
> >>>>    /* snip */
> >>>>
> >>>>    uint32_t dev_flags;
> >>>>
> >>>>    /**
> >>>>     * Internal device capabilities, used only by ethdev library.
> >>>>     * Certain functionalities provided by the library might enabled/disabled,
> >>>>     * based on driver exposing certain capabilities.
> >>>>     */
> >>>>    uint32_t internal_flags;
> >>>>
> >>>>    /* snip */
> >>>> };
> >>>>
> >>>>> Also perhaps we have go into details what needs to be restored
> >>>>> after 'stop' and what needs to be restored after 'reset' and use similar
> mechanism etc...
> >>>>
> >>>> I think we should look into that.
> >>>> Any 'codification' of semantics between drivers and ethdev library is good in
> my opinion.
> >>>>
> >>>> At least right now, ethdev does not change any configuration in 'stop' and
> 'reset' from what I see.
> >>>> But that's on library side only.
> >>>>
> >>>>>> This way, if we would conclude that it makes sense for
> >>>>>> .dev_start() to handle all starting configuration aspects, we
> >>>>>> could track which drivers still rely
> >>>>> on configuration restore.
> >>>>>>
> >>>>>> Dariusz Sosnowski (4):
> >>>>>>   ethdev: rework config restore
> >>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>
> >>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>
> >>>>>> --
> >>>>>> 2.39.5
> >>>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-09 16:18             ` Dariusz Sosnowski
@ 2024-10-09 23:16               ` Ferruh Yigit
  2024-10-10 12:08                 ` Dariusz Sosnowski
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-09 23:16 UTC (permalink / raw)
  To: Dariusz Sosnowski, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Wednesday, October 9, 2024 03:08
>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz Sosnowski
>> <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
>> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>
>> External email: Use caution opening links or attachments
>>
>>
>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
>>>
>>>
>>>>>>>> We have been working on optimizing the latency of calls to
>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the
>>>>>>>> work requires changes in the implementation of
>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
>>>>>>>> discussion regarding configuration restore.
>>>>>>>>
>>>>>>>> rte_eth_dev_start() does a few things on top of calling .dev_start()
>> callback:
>>>>>>>>
>>>>>>>> - Before calling it:
>>>>>>>>     - eth_dev_mac_restore() - if device supports
>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>> - After calling it:
>>>>>>>>     - eth_dev_mac_restore() - if device does not support
>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>     - restore promiscuous config
>>>>>>>>     - restore all multicast config
>>>>>>>>
>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC
>> addresses.
>>>>>>>>
>>>>>>>> Promiscuous config restore checks if promiscuous mode is enabled
>>>>>>>> or not, and calls .promiscuous_enable() or .promiscuous_disable()
>> callback.
>>>>>>>>
>>>>>>>> All multicast config restore checks if all multicast mode is
>>>>>>>> enabled or not, and calls .allmulticast_enable() or .allmulticast_disable()
>> callback.
>>>>>>>>
>>>>>>>> Callbacks are called directly in all of these cases, to bypass
>>>>>>>> the checks for applying the same configuration, which exist in relevant
>> APIs.
>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>>>>>
>>>>>>>> Let's consider what happens in the following sequence of API calls.
>>>>>>>>
>>>>>>>> 1. rte_eth_dev_configure()
>>>>>>>> 2. rte_eth_tx_queue_setup()
>>>>>>>> 3. rte_eth_rx_queue_setup()
>>>>>>>> 4. rte_eth_promiscuous_enable()
>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>>>>>> rte_eth_allmulticast_enable()
>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>>>>>> rte_eth_dev_start()
>>>>>>>>     - Call dev->dev_ops->dev_start()
>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>
>>>>>>>> Even though all configuration is available in dev->data after
>>>>>>>> step 5, library forces reapplying this configuration in step 6.
>>>>>>>>
>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD must
>>>>>>>> create/destroy new kernel flow rules and/or change netdev config).
>>>>>>>>
>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so
>>>>>>>> the following forced callbacks force additional communication
>>>>>>>> with the kernel. The
>>>>>>> same configuration is applied multiple times.
>>>>>>>>
>>>>>>>> As an optimization, mlx5 PMD could check if a given configuration
>>>>>>>> was applied, but this would duplicate the functionality of the
>>>>>>>> library (for example rte_eth_promiscuous_enable() does not call
>>>>>>>> the driver if
>>>>>>>> dev->data->promiscuous is set).
>>>>>>>>
>>>>>>>> Question: Since all of the configuration is available before
>>>>>>>> .dev_start() callback is called, why ethdev library does not
>>>>>>>> expect .dev_start() to
>>>>>>> take this configuration into account?
>>>>>>>> In other words, why library has to reapply the configuration?
>>>>>>>>
>>>>>>>> I could not find any particular reason why configuration restore
>>>>>>>> exists as part of the process (it was in the initial DPDK commit).
>>>>>>>>
>>>>>>>
>>>>>>> My assumption is .dev_stop() cause these values reset in some
>>>>>>> devices, so
>>>>>>> .dev_start() restores them back.
>>>>>>> @Bruce or @Konstantin may remember the history.
>>>>>
>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad
>>>>> dec_stop() and even dev_start() to make sure that HW is in a clean (known)
>> state.
>>>>>
>>>>>>>
>>>>>>> But I agree this is device specific behavior, and can be managed
>>>>>>> by what device requires.
>>>>>
>>>>> Probably yes.
>>>>>
>>>>>>>
>>>>>>>> The patches included in this RFC, propose a mechanism which would
>>>>>>>> help with managing which drivers rely on forceful configuration restore.
>>>>>>>> Drivers could advertise if forceful configuration restore is
>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this
>>>>>>>> flag is set, then the driver in question requires ethdev to
>>>>>>>> forcefully restore
>>>>>>> configuration.
>>>>>>>>
>>>>>>>
>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
>>>>>>> this is all dpdk internal.
>>>>>>>
>>>>>>> What about to have a dedicated flag for it? We can have a
>>>>>>> dedicated set of flag values for restore.
>>>>>>
>>>>>> Agreed. What do you think about the following?
>>>>>
>>>>> Instead of exposing that, can we probably make it transparent to the
>>>>> user and probably ethdev layer too?
>>>>>
>>>>
>>>> +1 to make it transparent to user, but not sure if we can make it
>>>> transparent to ethdev layer.
>>>
>>> Just to be clear:
>>> Let say, using example from above:
>>>
>>>  rte_eth_dev_start()
>>>      - Call dev->dev_ops->dev_start()
>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>      - Call dev->dev_ops->promiscuous_enable()
>>>      - Call dev->dev_ops->allmulticast_enable()
>>>
>>> We probably can introduce ethdev internal function (still visible to
>>> PMDs) that would do last 3 steps:
>>> ethdev_replay_user_conf(...)
>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>      - Call dev->dev_ops->promiscuous_enable()
>>>      - Call dev->dev_ops->allmulticast_enable()
>>>
>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
>>> So it will become:
>>> rte_eth_dev_start()
>>>      - Call dev->dev_ops->dev_start()
>>>       -Call ethdev_replay_user_conf(.)
>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>               - Call dev->dev_ops->promiscuous_enable()
>>>               -Call dev->dev_ops->allmulticast_enable()
>>>
>>> For PMDs that do need to restore user provided config And
>>> rte_eth_dev_start()
>>>      - Call dev->dev_ops->dev_start()
>>>
>>> For those who do not.
>>>
>>
>> OK, got it what you mean.
>> Pushing restore functionality to PMDs works, but this may be doing redundant
>> work on each PMD.
>>
>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to restore and
>> common code in ethdev does the work.
>> My below dedicated data struct comment is to have this flag in a new struct,
>> overall like following:
>>
>> rte_eth_dev_start()
>>    - Call dev->dev_ops->dev_start()
>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>>    - ...
> 
> Could you please explain what is the benefit of exposing flags through dev_ops callback vs a dedicated flags field in rte_eth_dev_data?
> In both solutions:
> - config restore is transparent to the user,
> - drivers can omit config restore (either by not implementing the callback or not providing the flags),
> - an ABI change is introduced (not a huge concern, at least for 24.11).
> 
> I understand that my initial proposal with "internal_flags" was too vague,
> but renaming and splitting this field into:
> 
> - dev_start_restore_flags
> - dev_reset_restore_flags
> - and so on...
> 
> seems sufficient, at least in my opinion.
> 

Hi Dariusz,

Putting flags to rte_eth_dev_data works, and it is easier since there is
direct access from rte_eth_dev to rte_eth_dev_data, so you don't need
new dev_ops. So this is a valid option.

But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.

"struct rte_eth_dev_data" is integral data structure for ethdev and it
is used in multiple locations, mostly related to the datapath and all
drivers needs to deal with fields of this struct.
Like [rx]_queues, dev_private, dev_conf all important and used a lot.

I want to protect "struct rte_eth_dev_data" from noise as much as
possible, though what is noise is not always that clear.

This restore flag is not critical, and I expect most of the drivers
won't care and populate this restore flag at all. That is why to me it
is better have dedicated struct for it and only drivers care about
restore feature know it.



>>
>> So PMDs only will provide what to restore with an internal API and common
>> ethdev layer will restore it.
>> If no restore required PMD may not implement .get_restore_flags() at all.
>>
>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided to internal
>> API to get what to restore in different states...
>>
>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
>>>> confusing and open to interpretation what to use it for and by time
>>>> become source of defect.
>>>
>>> Yes, same thoughts.
>>>
>>>> Instead what do you think to have a separate, dedicated data struct for it?
>>>
>>> Hmm... not sure I understood you here...
>>>
>>>>
>>>>> Might be we can move this restoration code into the new ethdev
>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD can invoke
>> during its dev_start() if needed?
>>>>>
>>>>>>
>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE RTE_BIT32(0)
>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE RTE_BIT32(1)
>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
>> RTE_BIT32(2)
>>>>>>
>>>>>> struct rte_eth_dev_data {
>>>>>>    /* snip */
>>>>>>
>>>>>>    uint32_t dev_flags;
>>>>>>
>>>>>>    /**
>>>>>>     * Internal device capabilities, used only by ethdev library.
>>>>>>     * Certain functionalities provided by the library might enabled/disabled,
>>>>>>     * based on driver exposing certain capabilities.
>>>>>>     */
>>>>>>    uint32_t internal_flags;
>>>>>>
>>>>>>    /* snip */
>>>>>> };
>>>>>>
>>>>>>> Also perhaps we have go into details what needs to be restored
>>>>>>> after 'stop' and what needs to be restored after 'reset' and use similar
>> mechanism etc...
>>>>>>
>>>>>> I think we should look into that.
>>>>>> Any 'codification' of semantics between drivers and ethdev library is good in
>> my opinion.
>>>>>>
>>>>>> At least right now, ethdev does not change any configuration in 'stop' and
>> 'reset' from what I see.
>>>>>> But that's on library side only.
>>>>>>
>>>>>>>> This way, if we would conclude that it makes sense for
>>>>>>>> .dev_start() to handle all starting configuration aspects, we
>>>>>>>> could track which drivers still rely
>>>>>>> on configuration restore.
>>>>>>>>
>>>>>>>> Dariusz Sosnowski (4):
>>>>>>>>   ethdev: rework config restore
>>>>>>>>   ethdev: omit promiscuous config restore if not required
>>>>>>>>   ethdev: omit all multicast config restore if not required
>>>>>>>>   ethdev: omit MAC address restore if not required
>>>>>>>>
>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
>>>>>>>> ++++++++++++++++++++++++++++++++++-----
>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>>>>>
>>>>>>>> --
>>>>>>>> 2.39.5
>>>>>>>>
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-09 23:16               ` Ferruh Yigit
@ 2024-10-10 12:08                 ` Dariusz Sosnowski
  2024-10-10 12:51                   ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-10 12:08 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, October 10, 2024 01:17
> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> Subject: Re: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >> Sent: Wednesday, October 9, 2024 03:08
> >> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
> >> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
> >> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>
> >> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>
> >> External email: Use caution opening links or attachments
> >>
> >>
> >> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >>>
> >>>
> >>>>>>>> We have been working on optimizing the latency of calls to
> >>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the
> >>>>>>>> work requires changes in the implementation of
> >>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>>>> discussion regarding configuration restore.
> >>>>>>>>
> >>>>>>>> rte_eth_dev_start() does a few things on top of calling
> >>>>>>>> .dev_start()
> >> callback:
> >>>>>>>>
> >>>>>>>> - Before calling it:
> >>>>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>> - After calling it:
> >>>>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>     - restore promiscuous config
> >>>>>>>>     - restore all multicast config
> >>>>>>>>
> >>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
> >>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
> >>>>>>>> MAC
> >> addresses.
> >>>>>>>>
> >>>>>>>> Promiscuous config restore checks if promiscuous mode is
> >>>>>>>> enabled or not, and calls .promiscuous_enable() or
> >>>>>>>> .promiscuous_disable()
> >> callback.
> >>>>>>>>
> >>>>>>>> All multicast config restore checks if all multicast mode is
> >>>>>>>> enabled or not, and calls .allmulticast_enable() or
> >>>>>>>> .allmulticast_disable()
> >> callback.
> >>>>>>>>
> >>>>>>>> Callbacks are called directly in all of these cases, to bypass
> >>>>>>>> the checks for applying the same configuration, which exist in
> >>>>>>>> relevant
> >> APIs.
> >>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>>>
> >>>>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>>>
> >>>>>>>> 1. rte_eth_dev_configure()
> >>>>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>>>> rte_eth_allmulticast_enable()
> >>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>>>> rte_eth_dev_start()
> >>>>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>
> >>>>>>>> Even though all configuration is available in dev->data after
> >>>>>>>> step 5, library forces reapplying this configuration in step 6.
> >>>>>>>>
> >>>>>>>> In mlx5 PMD case all relevant callbacks require communication
> >>>>>>>> with the kernel driver, to configure the device (mlx5 PMD must
> >>>>>>>> create/destroy new kernel flow rules and/or change netdev config).
> >>>>>>>>
> >>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so
> >>>>>>>> the following forced callbacks force additional communication
> >>>>>>>> with the kernel. The
> >>>>>>> same configuration is applied multiple times.
> >>>>>>>>
> >>>>>>>> As an optimization, mlx5 PMD could check if a given
> >>>>>>>> configuration was applied, but this would duplicate the
> >>>>>>>> functionality of the library (for example
> >>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
> >>>>>>>> dev->data->promiscuous is set).
> >>>>>>>>
> >>>>>>>> Question: Since all of the configuration is available before
> >>>>>>>> .dev_start() callback is called, why ethdev library does not
> >>>>>>>> expect .dev_start() to
> >>>>>>> take this configuration into account?
> >>>>>>>> In other words, why library has to reapply the configuration?
> >>>>>>>>
> >>>>>>>> I could not find any particular reason why configuration
> >>>>>>>> restore exists as part of the process (it was in the initial DPDK commit).
> >>>>>>>>
> >>>>>>>
> >>>>>>> My assumption is .dev_stop() cause these values reset in some
> >>>>>>> devices, so
> >>>>>>> .dev_start() restores them back.
> >>>>>>> @Bruce or @Konstantin may remember the history.
> >>>>>
> >>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad
> >>>>> dec_stop() and even dev_start() to make sure that HW is in a clean
> >>>>> (known)
> >> state.
> >>>>>
> >>>>>>>
> >>>>>>> But I agree this is device specific behavior, and can be managed
> >>>>>>> by what device requires.
> >>>>>
> >>>>> Probably yes.
> >>>>>
> >>>>>>>
> >>>>>>>> The patches included in this RFC, propose a mechanism which
> >>>>>>>> would help with managing which drivers rely on forceful configuration
> restore.
> >>>>>>>> Drivers could advertise if forceful configuration restore is
> >>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
> >>>>>>>> this flag is set, then the driver in question requires ethdev
> >>>>>>>> to forcefully restore
> >>>>>>> configuration.
> >>>>>>>>
> >>>>>>>
> >>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
> >>>>>>> this is all dpdk internal.
> >>>>>>>
> >>>>>>> What about to have a dedicated flag for it? We can have a
> >>>>>>> dedicated set of flag values for restore.
> >>>>>>
> >>>>>> Agreed. What do you think about the following?
> >>>>>
> >>>>> Instead of exposing that, can we probably make it transparent to
> >>>>> the user and probably ethdev layer too?
> >>>>>
> >>>>
> >>>> +1 to make it transparent to user, but not sure if we can make it
> >>>> transparent to ethdev layer.
> >>>
> >>> Just to be clear:
> >>> Let say, using example from above:
> >>>
> >>>  rte_eth_dev_start()
> >>>      - Call dev->dev_ops->dev_start()
> >>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>      - Call dev->dev_ops->promiscuous_enable()
> >>>      - Call dev->dev_ops->allmulticast_enable()
> >>>
> >>> We probably can introduce ethdev internal function (still visible to
> >>> PMDs) that would do last 3 steps:
> >>> ethdev_replay_user_conf(...)
> >>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>      - Call dev->dev_ops->promiscuous_enable()
> >>>      - Call dev->dev_ops->allmulticast_enable()
> >>>
> >>> And let PMD itself to decide does it needs to call it at dev_start() or not.
> >>> So it will become:
> >>> rte_eth_dev_start()
> >>>      - Call dev->dev_ops->dev_start()
> >>>       -Call ethdev_replay_user_conf(.)
> >>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>               - Call dev->dev_ops->promiscuous_enable()
> >>>               -Call dev->dev_ops->allmulticast_enable()
> >>>
> >>> For PMDs that do need to restore user provided config And
> >>> rte_eth_dev_start()
> >>>      - Call dev->dev_ops->dev_start()
> >>>
> >>> For those who do not.
> >>>
> >>
> >> OK, got it what you mean.
> >> Pushing restore functionality to PMDs works, but this may be doing
> >> redundant work on each PMD.
> >>
> >> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
> >> restore and common code in ethdev does the work.
> >> My below dedicated data struct comment is to have this flag in a new
> >> struct, overall like following:
> >>
> >> rte_eth_dev_start()
> >>    - Call dev->dev_ops->dev_start()
> >>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
> >>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> >>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> >>    - ...
> >
> > Could you please explain what is the benefit of exposing flags through dev_ops
> callback vs a dedicated flags field in rte_eth_dev_data?
> > In both solutions:
> > - config restore is transparent to the user,
> > - drivers can omit config restore (either by not implementing the
> > callback or not providing the flags),
> > - an ABI change is introduced (not a huge concern, at least for 24.11).
> >
> > I understand that my initial proposal with "internal_flags" was too
> > vague, but renaming and splitting this field into:
> >
> > - dev_start_restore_flags
> > - dev_reset_restore_flags
> > - and so on...
> >
> > seems sufficient, at least in my opinion.
> >
> 
> Hi Dariusz,
> 
> Putting flags to rte_eth_dev_data works, and it is easier since there is direct access
> from rte_eth_dev to rte_eth_dev_data, so you don't need new dev_ops. So this is
> a valid option.
> 
> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> 
> "struct rte_eth_dev_data" is integral data structure for ethdev and it is used in
> multiple locations, mostly related to the datapath and all drivers needs to deal
> with fields of this struct.
> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> 
> I want to protect "struct rte_eth_dev_data" from noise as much as possible,
> though what is noise is not always that clear.
> 
> This restore flag is not critical, and I expect most of the drivers won't care and
> populate this restore flag at all. That is why to me it is better have dedicated struct
> for it and only drivers care about restore feature know it.

I see. Thank you very much for the explanation.

In this case, it looks like adding this to dev_ops is the way to go.

So, summarizing it all:

1. dev_ops should be extended with a callback with the following signature and enums/flags:

enum rte_eth_dev_operation op {
	RTE_ETH_START,
	RTE_ETH_STOP,
	RTE_ETH_RESET,
};

#define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0)
#define RTE_ETH_RESTORE_PROMISC RTE_BIT32(1)
#define RTE_ETH_RESTORE_ALLMULTI RTE_BIT32(2)

void (*get_restore_flags)(
	struct rte_eth_dev *dev,
	enum rte_eth_dev_operation op,
	uint32_t *flags);

2. rte_eth_dev_start() will work as follows:

- Call dev->dev_ops->dev_start()
- Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback is not provided, assume flags == 0.
- if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
- and so on...

Also, I would like to add the following:

3. Patchset introducing this change should add get_restore_flags() implementation to all drivers, which informs that all config should be restored.
This would preserve the current behavior.
Later, this could be refined driver by driver.

What do you think?

Also, there's an open question about 'stop' and 'reset' operations.
At the moment, ethdev layer does not do any config manipulation during these operations.
Maybe we should limit get_restore_flags() to 'start' only?

> 
> 
> 
> >>
> >> So PMDs only will provide what to restore with an internal API and
> >> common ethdev layer will restore it.
> >> If no restore required PMD may not implement .get_restore_flags() at all.
> >>
> >> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
> >> to internal API to get what to restore in different states...
> >>
> >>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >>>> confusing and open to interpretation what to use it for and by time
> >>>> become source of defect.
> >>>
> >>> Yes, same thoughts.
> >>>
> >>>> Instead what do you think to have a separate, dedicated data struct for it?
> >>>
> >>> Hmm... not sure I understood you here...
> >>>
> >>>>
> >>>>> Might be we can move this restoration code into the new ethdev
> >>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
> >>>>> can invoke
> >> during its dev_start() if needed?
> >>>>>
> >>>>>>
> >>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> RTE_BIT32(0)
> >>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> RTE_BIT32(1)
> >>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> >> RTE_BIT32(2)
> >>>>>>
> >>>>>> struct rte_eth_dev_data {
> >>>>>>    /* snip */
> >>>>>>
> >>>>>>    uint32_t dev_flags;
> >>>>>>
> >>>>>>    /**
> >>>>>>     * Internal device capabilities, used only by ethdev library.
> >>>>>>     * Certain functionalities provided by the library might
> enabled/disabled,
> >>>>>>     * based on driver exposing certain capabilities.
> >>>>>>     */
> >>>>>>    uint32_t internal_flags;
> >>>>>>
> >>>>>>    /* snip */
> >>>>>> };
> >>>>>>
> >>>>>>> Also perhaps we have go into details what needs to be restored
> >>>>>>> after 'stop' and what needs to be restored after 'reset' and use
> >>>>>>> similar
> >> mechanism etc...
> >>>>>>
> >>>>>> I think we should look into that.
> >>>>>> Any 'codification' of semantics between drivers and ethdev
> >>>>>> library is good in
> >> my opinion.
> >>>>>>
> >>>>>> At least right now, ethdev does not change any configuration in
> >>>>>> 'stop' and
> >> 'reset' from what I see.
> >>>>>> But that's on library side only.
> >>>>>>
> >>>>>>>> This way, if we would conclude that it makes sense for
> >>>>>>>> .dev_start() to handle all starting configuration aspects, we
> >>>>>>>> could track which drivers still rely
> >>>>>>> on configuration restore.
> >>>>>>>>
> >>>>>>>> Dariusz Sosnowski (4):
> >>>>>>>>   ethdev: rework config restore
> >>>>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>>>
> >>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 2.39.5
> >>>>>>>>
> >>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-10 12:08                 ` Dariusz Sosnowski
@ 2024-10-10 12:51                   ` Ferruh Yigit
  2024-10-10 16:23                     ` Dariusz Sosnowski
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-10 12:51 UTC (permalink / raw)
  To: Dariusz Sosnowski, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Thursday, October 10, 2024 01:17
>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
>> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>
>> External email: Use caution opening links or attachments
>>
>>
>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>> Sent: Wednesday, October 9, 2024 03:08
>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru>
>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
>>>>>
>>>>>
>>>>>>>>>> We have been working on optimizing the latency of calls to
>>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the
>>>>>>>>>> work requires changes in the implementation of
>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
>>>>>>>>>> discussion regarding configuration restore.
>>>>>>>>>>
>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
>>>>>>>>>> .dev_start()
>>>> callback:
>>>>>>>>>>
>>>>>>>>>> - Before calling it:
>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>> - After calling it:
>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>>     - restore promiscuous config
>>>>>>>>>>     - restore all multicast config
>>>>>>>>>>
>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
>>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
>>>>>>>>>> MAC
>>>> addresses.
>>>>>>>>>>
>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
>>>>>>>>>> .promiscuous_disable()
>>>> callback.
>>>>>>>>>>
>>>>>>>>>> All multicast config restore checks if all multicast mode is
>>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
>>>>>>>>>> .allmulticast_disable()
>>>> callback.
>>>>>>>>>>
>>>>>>>>>> Callbacks are called directly in all of these cases, to bypass
>>>>>>>>>> the checks for applying the same configuration, which exist in
>>>>>>>>>> relevant
>>>> APIs.
>>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>>>>>>>
>>>>>>>>>> Let's consider what happens in the following sequence of API calls.
>>>>>>>>>>
>>>>>>>>>> 1. rte_eth_dev_configure()
>>>>>>>>>> 2. rte_eth_tx_queue_setup()
>>>>>>>>>> 3. rte_eth_rx_queue_setup()
>>>>>>>>>> 4. rte_eth_promiscuous_enable()
>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>>>>>>>> rte_eth_allmulticast_enable()
>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>>>>>>>> rte_eth_dev_start()
>>>>>>>>>>     - Call dev->dev_ops->dev_start()
>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>
>>>>>>>>>> Even though all configuration is available in dev->data after
>>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
>>>>>>>>>>
>>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
>>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD must
>>>>>>>>>> create/destroy new kernel flow rules and/or change netdev config).
>>>>>>>>>>
>>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(), so
>>>>>>>>>> the following forced callbacks force additional communication
>>>>>>>>>> with the kernel. The
>>>>>>>>> same configuration is applied multiple times.
>>>>>>>>>>
>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
>>>>>>>>>> configuration was applied, but this would duplicate the
>>>>>>>>>> functionality of the library (for example
>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
>>>>>>>>>> dev->data->promiscuous is set).
>>>>>>>>>>
>>>>>>>>>> Question: Since all of the configuration is available before
>>>>>>>>>> .dev_start() callback is called, why ethdev library does not
>>>>>>>>>> expect .dev_start() to
>>>>>>>>> take this configuration into account?
>>>>>>>>>> In other words, why library has to reapply the configuration?
>>>>>>>>>>
>>>>>>>>>> I could not find any particular reason why configuration
>>>>>>>>>> restore exists as part of the process (it was in the initial DPDK commit).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> My assumption is .dev_stop() cause these values reset in some
>>>>>>>>> devices, so
>>>>>>>>> .dev_start() restores them back.
>>>>>>>>> @Bruce or @Konstantin may remember the history.
>>>>>>>
>>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset() ad
>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a clean
>>>>>>> (known)
>>>> state.
>>>>>>>
>>>>>>>>>
>>>>>>>>> But I agree this is device specific behavior, and can be managed
>>>>>>>>> by what device requires.
>>>>>>>
>>>>>>> Probably yes.
>>>>>>>
>>>>>>>>>
>>>>>>>>>> The patches included in this RFC, propose a mechanism which
>>>>>>>>>> would help with managing which drivers rely on forceful configuration
>> restore.
>>>>>>>>>> Drivers could advertise if forceful configuration restore is
>>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
>>>>>>>>>> this flag is set, then the driver in question requires ethdev
>>>>>>>>>> to forcefully restore
>>>>>>>>> configuration.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
>>>>>>>>> this is all dpdk internal.
>>>>>>>>>
>>>>>>>>> What about to have a dedicated flag for it? We can have a
>>>>>>>>> dedicated set of flag values for restore.
>>>>>>>>
>>>>>>>> Agreed. What do you think about the following?
>>>>>>>
>>>>>>> Instead of exposing that, can we probably make it transparent to
>>>>>>> the user and probably ethdev layer too?
>>>>>>>
>>>>>>
>>>>>> +1 to make it transparent to user, but not sure if we can make it
>>>>>> transparent to ethdev layer.
>>>>>
>>>>> Just to be clear:
>>>>> Let say, using example from above:
>>>>>
>>>>>  rte_eth_dev_start()
>>>>>      - Call dev->dev_ops->dev_start()
>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>
>>>>> We probably can introduce ethdev internal function (still visible to
>>>>> PMDs) that would do last 3 steps:
>>>>> ethdev_replay_user_conf(...)
>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>
>>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
>>>>> So it will become:
>>>>> rte_eth_dev_start()
>>>>>      - Call dev->dev_ops->dev_start()
>>>>>       -Call ethdev_replay_user_conf(.)
>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>               - Call dev->dev_ops->promiscuous_enable()
>>>>>               -Call dev->dev_ops->allmulticast_enable()
>>>>>
>>>>> For PMDs that do need to restore user provided config And
>>>>> rte_eth_dev_start()
>>>>>      - Call dev->dev_ops->dev_start()
>>>>>
>>>>> For those who do not.
>>>>>
>>>>
>>>> OK, got it what you mean.
>>>> Pushing restore functionality to PMDs works, but this may be doing
>>>> redundant work on each PMD.
>>>>
>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
>>>> restore and common code in ethdev does the work.
>>>> My below dedicated data struct comment is to have this flag in a new
>>>> struct, overall like following:
>>>>
>>>> rte_eth_dev_start()
>>>>    - Call dev->dev_ops->dev_start()
>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>>>>    - ...
>>>
>>> Could you please explain what is the benefit of exposing flags through dev_ops
>> callback vs a dedicated flags field in rte_eth_dev_data?
>>> In both solutions:
>>> - config restore is transparent to the user,
>>> - drivers can omit config restore (either by not implementing the
>>> callback or not providing the flags),
>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
>>>
>>> I understand that my initial proposal with "internal_flags" was too
>>> vague, but renaming and splitting this field into:
>>>
>>> - dev_start_restore_flags
>>> - dev_reset_restore_flags
>>> - and so on...
>>>
>>> seems sufficient, at least in my opinion.
>>>
>>
>> Hi Dariusz,
>>
>> Putting flags to rte_eth_dev_data works, and it is easier since there is direct access
>> from rte_eth_dev to rte_eth_dev_data, so you don't need new dev_ops. So this is
>> a valid option.
>>
>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
>>
>> "struct rte_eth_dev_data" is integral data structure for ethdev and it is used in
>> multiple locations, mostly related to the datapath and all drivers needs to deal
>> with fields of this struct.
>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
>>
>> I want to protect "struct rte_eth_dev_data" from noise as much as possible,
>> though what is noise is not always that clear.
>>
>> This restore flag is not critical, and I expect most of the drivers won't care and
>> populate this restore flag at all. That is why to me it is better have dedicated struct
>> for it and only drivers care about restore feature know it.
> 
> I see. Thank you very much for the explanation.
> 
> In this case, it looks like adding this to dev_ops is the way to go.
> 
> So, summarizing it all:
> 
> 1. dev_ops should be extended with a callback with the following signature and enums/flags:
> 
> enum rte_eth_dev_operation op {
> 	RTE_ETH_START,
> 	RTE_ETH_STOP,
> 	RTE_ETH_RESET,
> };
> 
> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0)
> #define RTE_ETH_RESTORE_PROMISC RTE_BIT32(1)
> #define RTE_ETH_RESTORE_ALLMULTI RTE_BIT32(2)
> 
> void (*get_restore_flags)(
> 	struct rte_eth_dev *dev,
> 	enum rte_eth_dev_operation op,
> 	uint32_t *flags);
> 
> 2. rte_eth_dev_start() will work as follows:
> 
> - Call dev->dev_ops->dev_start()
> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback is not provided, assume flags == 0.
> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> - and so on...
> 

All above looks good.

> Also, I would like to add the following:
> 
> 3. Patchset introducing this change should add get_restore_flags() implementation to all drivers, which informs that all config should be restored.
> This would preserve the current behavior.
> Later, this could be refined driver by driver.
> 
> What do you think?
> 

What you are saying is correct, but I suspect most of the drivers don't
really need this restore, but they have it since it was in the ethdev layer.

If we introduce back restore via get_restore_flags(), it may stay as it
is in drivers, at least for most of them.

What do you think to risk breaking stuff for this case.

So don't implement this in the drivers by default, so who needs it will
recognize the issue and will implement it. If we merge this patch for
-rc1, it gives enough time for drivers to detect the issue and fix it.

Only we may implement this to the drivers that exist when this restore
code was introduced.
I mean whatever driver exist in the initial DPDK commit, implement this
logic only to those drivers.

> Also, there's an open question about 'stop' and 'reset' operations.
> At the moment, ethdev layer does not do any config manipulation during these operations.
> Maybe we should limit get_restore_flags() to 'start' only?
> 

Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
as a placeholder for later possible usages.

>>
>>
>>
>>>>
>>>> So PMDs only will provide what to restore with an internal API and
>>>> common ethdev layer will restore it.
>>>> If no restore required PMD may not implement .get_restore_flags() at all.
>>>>
>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
>>>> to internal API to get what to restore in different states...
>>>>
>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
>>>>>> confusing and open to interpretation what to use it for and by time
>>>>>> become source of defect.
>>>>>
>>>>> Yes, same thoughts.
>>>>>
>>>>>> Instead what do you think to have a separate, dedicated data struct for it?
>>>>>
>>>>> Hmm... not sure I understood you here...
>>>>>
>>>>>>
>>>>>>> Might be we can move this restoration code into the new ethdev
>>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
>>>>>>> can invoke
>>>> during its dev_start() if needed?
>>>>>>>
>>>>>>>>
>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
>> RTE_BIT32(0)
>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
>> RTE_BIT32(1)
>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
>>>> RTE_BIT32(2)
>>>>>>>>
>>>>>>>> struct rte_eth_dev_data {
>>>>>>>>    /* snip */
>>>>>>>>
>>>>>>>>    uint32_t dev_flags;
>>>>>>>>
>>>>>>>>    /**
>>>>>>>>     * Internal device capabilities, used only by ethdev library.
>>>>>>>>     * Certain functionalities provided by the library might
>> enabled/disabled,
>>>>>>>>     * based on driver exposing certain capabilities.
>>>>>>>>     */
>>>>>>>>    uint32_t internal_flags;
>>>>>>>>
>>>>>>>>    /* snip */
>>>>>>>> };
>>>>>>>>
>>>>>>>>> Also perhaps we have go into details what needs to be restored
>>>>>>>>> after 'stop' and what needs to be restored after 'reset' and use
>>>>>>>>> similar
>>>> mechanism etc...
>>>>>>>>
>>>>>>>> I think we should look into that.
>>>>>>>> Any 'codification' of semantics between drivers and ethdev
>>>>>>>> library is good in
>>>> my opinion.
>>>>>>>>
>>>>>>>> At least right now, ethdev does not change any configuration in
>>>>>>>> 'stop' and
>>>> 'reset' from what I see.
>>>>>>>> But that's on library side only.
>>>>>>>>
>>>>>>>>>> This way, if we would conclude that it makes sense for
>>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
>>>>>>>>>> could track which drivers still rely
>>>>>>>>> on configuration restore.
>>>>>>>>>>
>>>>>>>>>> Dariusz Sosnowski (4):
>>>>>>>>>>   ethdev: rework config restore
>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
>>>>>>>>>>   ethdev: omit all multicast config restore if not required
>>>>>>>>>>   ethdev: omit MAC address restore if not required
>>>>>>>>>>
>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> 2.39.5
>>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-10 12:51                   ` Ferruh Yigit
@ 2024-10-10 16:23                     ` Dariusz Sosnowski
  2024-10-10 17:08                       ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-10 16:23 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Thursday, October 10, 2024 14:52
> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> Subject: Re: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
> >> -----Original Message-----
> >> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >> Sent: Thursday, October 10, 2024 01:17
> >> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> >> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
> >> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >> <andrew.rybchenko@oktetlabs.ru>
> >> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>
> >> External email: Use caution opening links or attachments
> >>
> >>
> >> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> >>>> -----Original Message-----
> >>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>> Sent: Wednesday, October 9, 2024 03:08
> >>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
> >>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
> >>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>> <andrew.rybchenko@oktetlabs.ru>
> >>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>
> >>>> External email: Use caution opening links or attachments
> >>>>
> >>>>
> >>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >>>>>
> >>>>>
> >>>>>>>>>> We have been working on optimizing the latency of calls to
> >>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of
> >>>>>>>>>> the work requires changes in the implementation of
> >>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>>>>>> discussion regarding configuration restore.
> >>>>>>>>>>
> >>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
> >>>>>>>>>> .dev_start()
> >>>> callback:
> >>>>>>>>>>
> >>>>>>>>>> - Before calling it:
> >>>>>>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>> - After calling it:
> >>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>     - restore promiscuous config
> >>>>>>>>>>     - restore all multicast config
> >>>>>>>>>>
> >>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
> >>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
> >>>>>>>>>> MAC
> >>>> addresses.
> >>>>>>>>>>
> >>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
> >>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
> >>>>>>>>>> .promiscuous_disable()
> >>>> callback.
> >>>>>>>>>>
> >>>>>>>>>> All multicast config restore checks if all multicast mode is
> >>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
> >>>>>>>>>> .allmulticast_disable()
> >>>> callback.
> >>>>>>>>>>
> >>>>>>>>>> Callbacks are called directly in all of these cases, to
> >>>>>>>>>> bypass the checks for applying the same configuration, which
> >>>>>>>>>> exist in relevant
> >>>> APIs.
> >>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>>>>>
> >>>>>>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>>>>>
> >>>>>>>>>> 1. rte_eth_dev_configure()
> >>>>>>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>>>>>> rte_eth_allmulticast_enable()
> >>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>
> >>>>>>>>>> Even though all configuration is available in dev->data after
> >>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
> >>>>>>>>>>
> >>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
> >>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD
> >>>>>>>>>> must create/destroy new kernel flow rules and/or change netdev
> config).
> >>>>>>>>>>
> >>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(),
> >>>>>>>>>> so the following forced callbacks force additional
> >>>>>>>>>> communication with the kernel. The
> >>>>>>>>> same configuration is applied multiple times.
> >>>>>>>>>>
> >>>>>>>>>> As an optimization, mlx5 PMD could check if a given
> >>>>>>>>>> configuration was applied, but this would duplicate the
> >>>>>>>>>> functionality of the library (for example
> >>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
> >>>>>>>>>> dev->data->promiscuous is set).
> >>>>>>>>>>
> >>>>>>>>>> Question: Since all of the configuration is available before
> >>>>>>>>>> .dev_start() callback is called, why ethdev library does not
> >>>>>>>>>> expect .dev_start() to
> >>>>>>>>> take this configuration into account?
> >>>>>>>>>> In other words, why library has to reapply the configuration?
> >>>>>>>>>>
> >>>>>>>>>> I could not find any particular reason why configuration
> >>>>>>>>>> restore exists as part of the process (it was in the initial DPDK
> commit).
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> My assumption is .dev_stop() cause these values reset in some
> >>>>>>>>> devices, so
> >>>>>>>>> .dev_start() restores them back.
> >>>>>>>>> @Bruce or @Konstantin may remember the history.
> >>>>>>>
> >>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset()
> >>>>>>> ad
> >>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
> >>>>>>> clean
> >>>>>>> (known)
> >>>> state.
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> But I agree this is device specific behavior, and can be
> >>>>>>>>> managed by what device requires.
> >>>>>>>
> >>>>>>> Probably yes.
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>>> The patches included in this RFC, propose a mechanism which
> >>>>>>>>>> would help with managing which drivers rely on forceful
> >>>>>>>>>> configuration
> >> restore.
> >>>>>>>>>> Drivers could advertise if forceful configuration restore is
> >>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
> >>>>>>>>>> this flag is set, then the driver in question requires ethdev
> >>>>>>>>>> to forcefully restore
> >>>>>>>>> configuration.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
> >>>>>>>>> this is all dpdk internal.
> >>>>>>>>>
> >>>>>>>>> What about to have a dedicated flag for it? We can have a
> >>>>>>>>> dedicated set of flag values for restore.
> >>>>>>>>
> >>>>>>>> Agreed. What do you think about the following?
> >>>>>>>
> >>>>>>> Instead of exposing that, can we probably make it transparent to
> >>>>>>> the user and probably ethdev layer too?
> >>>>>>>
> >>>>>>
> >>>>>> +1 to make it transparent to user, but not sure if we can make it
> >>>>>> transparent to ethdev layer.
> >>>>>
> >>>>> Just to be clear:
> >>>>> Let say, using example from above:
> >>>>>
> >>>>>  rte_eth_dev_start()
> >>>>>      - Call dev->dev_ops->dev_start()
> >>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>
> >>>>> We probably can introduce ethdev internal function (still visible
> >>>>> to
> >>>>> PMDs) that would do last 3 steps:
> >>>>> ethdev_replay_user_conf(...)
> >>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>
> >>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
> >>>>> So it will become:
> >>>>> rte_eth_dev_start()
> >>>>>      - Call dev->dev_ops->dev_start()
> >>>>>       -Call ethdev_replay_user_conf(.)
> >>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>               - Call dev->dev_ops->promiscuous_enable()
> >>>>>               -Call dev->dev_ops->allmulticast_enable()
> >>>>>
> >>>>> For PMDs that do need to restore user provided config And
> >>>>> rte_eth_dev_start()
> >>>>>      - Call dev->dev_ops->dev_start()
> >>>>>
> >>>>> For those who do not.
> >>>>>
> >>>>
> >>>> OK, got it what you mean.
> >>>> Pushing restore functionality to PMDs works, but this may be doing
> >>>> redundant work on each PMD.
> >>>>
> >>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
> >>>> restore and common code in ethdev does the work.
> >>>> My below dedicated data struct comment is to have this flag in a
> >>>> new struct, overall like following:
> >>>>
> >>>> rte_eth_dev_start()
> >>>>    - Call dev->dev_ops->dev_start()
> >>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
> >>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> >>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> >>>>    - ...
> >>>
> >>> Could you please explain what is the benefit of exposing flags
> >>> through dev_ops
> >> callback vs a dedicated flags field in rte_eth_dev_data?
> >>> In both solutions:
> >>> - config restore is transparent to the user,
> >>> - drivers can omit config restore (either by not implementing the
> >>> callback or not providing the flags),
> >>> - an ABI change is introduced (not a huge concern, at least for 24.11).
> >>>
> >>> I understand that my initial proposal with "internal_flags" was too
> >>> vague, but renaming and splitting this field into:
> >>>
> >>> - dev_start_restore_flags
> >>> - dev_reset_restore_flags
> >>> - and so on...
> >>>
> >>> seems sufficient, at least in my opinion.
> >>>
> >>
> >> Hi Dariusz,
> >>
> >> Putting flags to rte_eth_dev_data works, and it is easier since there
> >> is direct access from rte_eth_dev to rte_eth_dev_data, so you don't
> >> need new dev_ops. So this is a valid option.
> >>
> >> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> >>
> >> "struct rte_eth_dev_data" is integral data structure for ethdev and
> >> it is used in multiple locations, mostly related to the datapath and
> >> all drivers needs to deal with fields of this struct.
> >> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> >>
> >> I want to protect "struct rte_eth_dev_data" from noise as much as
> >> possible, though what is noise is not always that clear.
> >>
> >> This restore flag is not critical, and I expect most of the drivers
> >> won't care and populate this restore flag at all. That is why to me
> >> it is better have dedicated struct for it and only drivers care about restore
> feature know it.
> >
> > I see. Thank you very much for the explanation.
> >
> > In this case, it looks like adding this to dev_ops is the way to go.
> >
> > So, summarizing it all:
> >
> > 1. dev_ops should be extended with a callback with the following signature and
> enums/flags:
> >
> > enum rte_eth_dev_operation op {
> >       RTE_ETH_START,
> >       RTE_ETH_STOP,
> >       RTE_ETH_RESET,
> > };
> >
> > #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
> > RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
> RTE_ETH_RESTORE_ALLMULTI
> > RTE_BIT32(2)
> >
> > void (*get_restore_flags)(
> >       struct rte_eth_dev *dev,
> >       enum rte_eth_dev_operation op,
> >       uint32_t *flags);
> >
> > 2. rte_eth_dev_start() will work as follows:
> >
> > - Call dev->dev_ops->dev_start()
> > - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback
> is not provided, assume flags == 0.
> > - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> > - and so on...
> >
> 
> All above looks good.
> 
> > Also, I would like to add the following:
> >
> > 3. Patchset introducing this change should add get_restore_flags()
> implementation to all drivers, which informs that all config should be restored.
> > This would preserve the current behavior.
> > Later, this could be refined driver by driver.
> >
> > What do you think?
> >
> 
> What you are saying is correct, but I suspect most of the drivers don't really need
> this restore, but they have it since it was in the ethdev layer.
> 
> If we introduce back restore via get_restore_flags(), it may stay as it is in drivers, at
> least for most of them.
> 
> What do you think to risk breaking stuff for this case.
> 
> So don't implement this in the drivers by default, so who needs it will recognize
> the issue and will implement it. If we merge this patch for -rc1, it gives enough
> time for drivers to detect the issue and fix it.

It seems rather too risky, especially considering that for example, there are a few Intel drivers which do not have maintainers (like i40e).
So, I don't know what will happen to such drivers. They may be left broken (if they are affected) for 24.11 and future releases.
But I agree that if default behavior is preserved, this dependence of drivers on config restore might stay as is.
I'm on the fence about it.

> 
> Only we may implement this to the drivers that exist when this restore code was
> introduced.
> I mean whatever driver exist in the initial DPDK commit, implement this logic only
> to those drivers.

Seems reasonable to me. In this case, it would be igb (IIUC, now it's named e1000) and ixgbe.

> 
> > Also, there's an open question about 'stop' and 'reset' operations.
> > At the moment, ethdev layer does not do any config manipulation during these
> operations.
> > Maybe we should limit get_restore_flags() to 'start' only?
> >
> 
> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
> as a placeholder for later possible usages.

Ack

> 
> >>
> >>
> >>
> >>>>
> >>>> So PMDs only will provide what to restore with an internal API and
> >>>> common ethdev layer will restore it.
> >>>> If no restore required PMD may not implement .get_restore_flags() at all.
> >>>>
> >>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
> >>>> to internal API to get what to restore in different states...
> >>>>
> >>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >>>>>> confusing and open to interpretation what to use it for and by
> >>>>>> time become source of defect.
> >>>>>
> >>>>> Yes, same thoughts.
> >>>>>
> >>>>>> Instead what do you think to have a separate, dedicated data struct for it?
> >>>>>
> >>>>> Hmm... not sure I understood you here...
> >>>>>
> >>>>>>
> >>>>>>> Might be we can move this restoration code into the new ethdev
> >>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
> >>>>>>> can invoke
> >>>> during its dev_start() if needed?
> >>>>>>>
> >>>>>>>>
> >>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> >> RTE_BIT32(0)
> >>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> >> RTE_BIT32(1)
> >>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> >>>> RTE_BIT32(2)
> >>>>>>>>
> >>>>>>>> struct rte_eth_dev_data {
> >>>>>>>>    /* snip */
> >>>>>>>>
> >>>>>>>>    uint32_t dev_flags;
> >>>>>>>>
> >>>>>>>>    /**
> >>>>>>>>     * Internal device capabilities, used only by ethdev library.
> >>>>>>>>     * Certain functionalities provided by the library might
> >> enabled/disabled,
> >>>>>>>>     * based on driver exposing certain capabilities.
> >>>>>>>>     */
> >>>>>>>>    uint32_t internal_flags;
> >>>>>>>>
> >>>>>>>>    /* snip */
> >>>>>>>> };
> >>>>>>>>
> >>>>>>>>> Also perhaps we have go into details what needs to be restored
> >>>>>>>>> after 'stop' and what needs to be restored after 'reset' and
> >>>>>>>>> use similar
> >>>> mechanism etc...
> >>>>>>>>
> >>>>>>>> I think we should look into that.
> >>>>>>>> Any 'codification' of semantics between drivers and ethdev
> >>>>>>>> library is good in
> >>>> my opinion.
> >>>>>>>>
> >>>>>>>> At least right now, ethdev does not change any configuration in
> >>>>>>>> 'stop' and
> >>>> 'reset' from what I see.
> >>>>>>>> But that's on library side only.
> >>>>>>>>
> >>>>>>>>>> This way, if we would conclude that it makes sense for
> >>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
> >>>>>>>>>> could track which drivers still rely
> >>>>>>>>> on configuration restore.
> >>>>>>>>>>
> >>>>>>>>>> Dariusz Sosnowski (4):
> >>>>>>>>>>   ethdev: rework config restore
> >>>>>>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>>>>>
> >>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> 2.39.5
> >>>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-10 16:23                     ` Dariusz Sosnowski
@ 2024-10-10 17:08                       ` Ferruh Yigit
  2024-10-10 22:58                         ` Konstantin Ananyev
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-10 17:08 UTC (permalink / raw)
  To: Dariusz Sosnowski, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/10/2024 5:23 PM, Dariusz Sosnowski wrote:
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Thursday, October 10, 2024 14:52
>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
>> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>
>> External email: Use caution opening links or attachments
>>
>>
>> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
>>>> -----Original Message-----
>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>> Sent: Thursday, October 10, 2024 01:17
>>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
>>>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
>>>> <andrew.rybchenko@oktetlabs.ru>
>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
>>>>>> -----Original Message-----
>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>>>> Sent: Wednesday, October 9, 2024 03:08
>>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
>>>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
>>>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>>>>>
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
>>>>>>>
>>>>>>>
>>>>>>>>>>>> We have been working on optimizing the latency of calls to
>>>>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of
>>>>>>>>>>>> the work requires changes in the implementation of
>>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
>>>>>>>>>>>> discussion regarding configuration restore.
>>>>>>>>>>>>
>>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
>>>>>>>>>>>> .dev_start()
>>>>>> callback:
>>>>>>>>>>>>
>>>>>>>>>>>> - Before calling it:
>>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>>>> - After calling it:
>>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>>>>     - restore promiscuous config
>>>>>>>>>>>>     - restore all multicast config
>>>>>>>>>>>>
>>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
>>>>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
>>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
>>>>>>>>>>>> MAC
>>>>>> addresses.
>>>>>>>>>>>>
>>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
>>>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
>>>>>>>>>>>> .promiscuous_disable()
>>>>>> callback.
>>>>>>>>>>>>
>>>>>>>>>>>> All multicast config restore checks if all multicast mode is
>>>>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
>>>>>>>>>>>> .allmulticast_disable()
>>>>>> callback.
>>>>>>>>>>>>
>>>>>>>>>>>> Callbacks are called directly in all of these cases, to
>>>>>>>>>>>> bypass the checks for applying the same configuration, which
>>>>>>>>>>>> exist in relevant
>>>>>> APIs.
>>>>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>>>>>>>>>
>>>>>>>>>>>> Let's consider what happens in the following sequence of API calls.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. rte_eth_dev_configure()
>>>>>>>>>>>> 2. rte_eth_tx_queue_setup()
>>>>>>>>>>>> 3. rte_eth_rx_queue_setup()
>>>>>>>>>>>> 4. rte_eth_promiscuous_enable()
>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>>>>>>>>>> rte_eth_allmulticast_enable()
>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>>>>>>>>>> rte_eth_dev_start()
>>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
>>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>>>
>>>>>>>>>>>> Even though all configuration is available in dev->data after
>>>>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
>>>>>>>>>>>>
>>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
>>>>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD
>>>>>>>>>>>> must create/destroy new kernel flow rules and/or change netdev
>> config).
>>>>>>>>>>>>
>>>>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(),
>>>>>>>>>>>> so the following forced callbacks force additional
>>>>>>>>>>>> communication with the kernel. The
>>>>>>>>>>> same configuration is applied multiple times.
>>>>>>>>>>>>
>>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
>>>>>>>>>>>> configuration was applied, but this would duplicate the
>>>>>>>>>>>> functionality of the library (for example
>>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
>>>>>>>>>>>> dev->data->promiscuous is set).
>>>>>>>>>>>>
>>>>>>>>>>>> Question: Since all of the configuration is available before
>>>>>>>>>>>> .dev_start() callback is called, why ethdev library does not
>>>>>>>>>>>> expect .dev_start() to
>>>>>>>>>>> take this configuration into account?
>>>>>>>>>>>> In other words, why library has to reapply the configuration?
>>>>>>>>>>>>
>>>>>>>>>>>> I could not find any particular reason why configuration
>>>>>>>>>>>> restore exists as part of the process (it was in the initial DPDK
>> commit).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> My assumption is .dev_stop() cause these values reset in some
>>>>>>>>>>> devices, so
>>>>>>>>>>> .dev_start() restores them back.
>>>>>>>>>>> @Bruce or @Konstantin may remember the history.
>>>>>>>>>
>>>>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset()
>>>>>>>>> ad
>>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
>>>>>>>>> clean
>>>>>>>>> (known)
>>>>>> state.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But I agree this is device specific behavior, and can be
>>>>>>>>>>> managed by what device requires.
>>>>>>>>>
>>>>>>>>> Probably yes.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> The patches included in this RFC, propose a mechanism which
>>>>>>>>>>>> would help with managing which drivers rely on forceful
>>>>>>>>>>>> configuration
>>>> restore.
>>>>>>>>>>>> Drivers could advertise if forceful configuration restore is
>>>>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
>>>>>>>>>>>> this flag is set, then the driver in question requires ethdev
>>>>>>>>>>>> to forcefully restore
>>>>>>>>>>> configuration.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
>>>>>>>>>>> this is all dpdk internal.
>>>>>>>>>>>
>>>>>>>>>>> What about to have a dedicated flag for it? We can have a
>>>>>>>>>>> dedicated set of flag values for restore.
>>>>>>>>>>
>>>>>>>>>> Agreed. What do you think about the following?
>>>>>>>>>
>>>>>>>>> Instead of exposing that, can we probably make it transparent to
>>>>>>>>> the user and probably ethdev layer too?
>>>>>>>>>
>>>>>>>>
>>>>>>>> +1 to make it transparent to user, but not sure if we can make it
>>>>>>>> transparent to ethdev layer.
>>>>>>>
>>>>>>> Just to be clear:
>>>>>>> Let say, using example from above:
>>>>>>>
>>>>>>>  rte_eth_dev_start()
>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>>>
>>>>>>> We probably can introduce ethdev internal function (still visible
>>>>>>> to
>>>>>>> PMDs) that would do last 3 steps:
>>>>>>> ethdev_replay_user_conf(...)
>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>>>
>>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
>>>>>>> So it will become:
>>>>>>> rte_eth_dev_start()
>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>       -Call ethdev_replay_user_conf(.)
>>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>               - Call dev->dev_ops->promiscuous_enable()
>>>>>>>               -Call dev->dev_ops->allmulticast_enable()
>>>>>>>
>>>>>>> For PMDs that do need to restore user provided config And
>>>>>>> rte_eth_dev_start()
>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>
>>>>>>> For those who do not.
>>>>>>>
>>>>>>
>>>>>> OK, got it what you mean.
>>>>>> Pushing restore functionality to PMDs works, but this may be doing
>>>>>> redundant work on each PMD.
>>>>>>
>>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
>>>>>> restore and common code in ethdev does the work.
>>>>>> My below dedicated data struct comment is to have this flag in a
>>>>>> new struct, overall like following:
>>>>>>
>>>>>> rte_eth_dev_start()
>>>>>>    - Call dev->dev_ops->dev_start()
>>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>>>>>>    - ...
>>>>>
>>>>> Could you please explain what is the benefit of exposing flags
>>>>> through dev_ops
>>>> callback vs a dedicated flags field in rte_eth_dev_data?
>>>>> In both solutions:
>>>>> - config restore is transparent to the user,
>>>>> - drivers can omit config restore (either by not implementing the
>>>>> callback or not providing the flags),
>>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
>>>>>
>>>>> I understand that my initial proposal with "internal_flags" was too
>>>>> vague, but renaming and splitting this field into:
>>>>>
>>>>> - dev_start_restore_flags
>>>>> - dev_reset_restore_flags
>>>>> - and so on...
>>>>>
>>>>> seems sufficient, at least in my opinion.
>>>>>
>>>>
>>>> Hi Dariusz,
>>>>
>>>> Putting flags to rte_eth_dev_data works, and it is easier since there
>>>> is direct access from rte_eth_dev to rte_eth_dev_data, so you don't
>>>> need new dev_ops. So this is a valid option.
>>>>
>>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
>>>>
>>>> "struct rte_eth_dev_data" is integral data structure for ethdev and
>>>> it is used in multiple locations, mostly related to the datapath and
>>>> all drivers needs to deal with fields of this struct.
>>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
>>>>
>>>> I want to protect "struct rte_eth_dev_data" from noise as much as
>>>> possible, though what is noise is not always that clear.
>>>>
>>>> This restore flag is not critical, and I expect most of the drivers
>>>> won't care and populate this restore flag at all. That is why to me
>>>> it is better have dedicated struct for it and only drivers care about restore
>> feature know it.
>>>
>>> I see. Thank you very much for the explanation.
>>>
>>> In this case, it looks like adding this to dev_ops is the way to go.
>>>
>>> So, summarizing it all:
>>>
>>> 1. dev_ops should be extended with a callback with the following signature and
>> enums/flags:
>>>
>>> enum rte_eth_dev_operation op {
>>>       RTE_ETH_START,
>>>       RTE_ETH_STOP,
>>>       RTE_ETH_RESET,
>>> };
>>>
>>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
>>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
>> RTE_ETH_RESTORE_ALLMULTI
>>> RTE_BIT32(2)
>>>
>>> void (*get_restore_flags)(
>>>       struct rte_eth_dev *dev,
>>>       enum rte_eth_dev_operation op,
>>>       uint32_t *flags);
>>>
>>> 2. rte_eth_dev_start() will work as follows:
>>>
>>> - Call dev->dev_ops->dev_start()
>>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback
>> is not provided, assume flags == 0.
>>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
>>> - and so on...
>>>
>>
>> All above looks good.
>>
>>> Also, I would like to add the following:
>>>
>>> 3. Patchset introducing this change should add get_restore_flags()
>> implementation to all drivers, which informs that all config should be restored.
>>> This would preserve the current behavior.
>>> Later, this could be refined driver by driver.
>>>
>>> What do you think?
>>>
>>
>> What you are saying is correct, but I suspect most of the drivers don't really need
>> this restore, but they have it since it was in the ethdev layer.
>>
>> If we introduce back restore via get_restore_flags(), it may stay as it is in drivers, at
>> least for most of them.
>>
>> What do you think to risk breaking stuff for this case.
>>
>> So don't implement this in the drivers by default, so who needs it will recognize
>> the issue and will implement it. If we merge this patch for -rc1, it gives enough
>> time for drivers to detect the issue and fix it.
> 
> It seems rather too risky, especially considering that for example, there are a few Intel drivers which do not have maintainers (like i40e).
> So, I don't know what will happen to such drivers. They may be left broken (if they are affected) for 24.11 and future releases.
> But I agree that if default behavior is preserved, this dependence of drivers on config restore might stay as is.
> I'm on the fence about it.
> 

Yeah, not sure.

Do you think the dev_ops function can be implemented in the
'ethdev_driver.c' and all drivers use exact same function?
So this reduces changes and duplication in drivers while preserving the
behavior.

>>
>> Only we may implement this to the drivers that exist when this restore code was
>> introduced.
>> I mean whatever driver exist in the initial DPDK commit, implement this logic only
>> to those drivers.
> 
> Seems reasonable to me. In this case, it would be igb (IIUC, now it's named e1000) and ixgbe.
> 
>>
>>> Also, there's an open question about 'stop' and 'reset' operations.
>>> At the moment, ethdev layer does not do any config manipulation during these
>> operations.
>>> Maybe we should limit get_restore_flags() to 'start' only?
>>>
>>
>> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
>> as a placeholder for later possible usages.
> 
> Ack
> 
>>
>>>>
>>>>
>>>>
>>>>>>
>>>>>> So PMDs only will provide what to restore with an internal API and
>>>>>> common ethdev layer will restore it.
>>>>>> If no restore required PMD may not implement .get_restore_flags() at all.
>>>>>>
>>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
>>>>>> to internal API to get what to restore in different states...
>>>>>>
>>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
>>>>>>>> confusing and open to interpretation what to use it for and by
>>>>>>>> time become source of defect.
>>>>>>>
>>>>>>> Yes, same thoughts.
>>>>>>>
>>>>>>>> Instead what do you think to have a separate, dedicated data struct for it?
>>>>>>>
>>>>>>> Hmm... not sure I understood you here...
>>>>>>>
>>>>>>>>
>>>>>>>>> Might be we can move this restoration code into the new ethdev
>>>>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
>>>>>>>>> can invoke
>>>>>> during its dev_start() if needed?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
>>>> RTE_BIT32(0)
>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
>>>> RTE_BIT32(1)
>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
>>>>>> RTE_BIT32(2)
>>>>>>>>>>
>>>>>>>>>> struct rte_eth_dev_data {
>>>>>>>>>>    /* snip */
>>>>>>>>>>
>>>>>>>>>>    uint32_t dev_flags;
>>>>>>>>>>
>>>>>>>>>>    /**
>>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
>>>>>>>>>>     * Certain functionalities provided by the library might
>>>> enabled/disabled,
>>>>>>>>>>     * based on driver exposing certain capabilities.
>>>>>>>>>>     */
>>>>>>>>>>    uint32_t internal_flags;
>>>>>>>>>>
>>>>>>>>>>    /* snip */
>>>>>>>>>> };
>>>>>>>>>>
>>>>>>>>>>> Also perhaps we have go into details what needs to be restored
>>>>>>>>>>> after 'stop' and what needs to be restored after 'reset' and
>>>>>>>>>>> use similar
>>>>>> mechanism etc...
>>>>>>>>>>
>>>>>>>>>> I think we should look into that.
>>>>>>>>>> Any 'codification' of semantics between drivers and ethdev
>>>>>>>>>> library is good in
>>>>>> my opinion.
>>>>>>>>>>
>>>>>>>>>> At least right now, ethdev does not change any configuration in
>>>>>>>>>> 'stop' and
>>>>>> 'reset' from what I see.
>>>>>>>>>> But that's on library side only.
>>>>>>>>>>
>>>>>>>>>>>> This way, if we would conclude that it makes sense for
>>>>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
>>>>>>>>>>>> could track which drivers still rely
>>>>>>>>>>> on configuration restore.
>>>>>>>>>>>>
>>>>>>>>>>>> Dariusz Sosnowski (4):
>>>>>>>>>>>>   ethdev: rework config restore
>>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
>>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
>>>>>>>>>>>>   ethdev: omit MAC address restore if not required
>>>>>>>>>>>>
>>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
>>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> 2.39.5
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-10 17:08                       ` Ferruh Yigit
@ 2024-10-10 22:58                         ` Konstantin Ananyev
  2024-10-11  0:02                           ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Konstantin Ananyev @ 2024-10-10 22:58 UTC (permalink / raw)
  To: Ferruh Yigit, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



config restore
> >>
> >> External email: Use caution opening links or attachments
> >>
> >>
> >> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
> >>>> -----Original Message-----
> >>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>> Sent: Thursday, October 10, 2024 01:17
> >>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> >>>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
> >>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>> <andrew.rybchenko@oktetlabs.ru>
> >>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>
> >>>> External email: Use caution opening links or attachments
> >>>>
> >>>>
> >>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>>>> Sent: Wednesday, October 9, 2024 03:08
> >>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
> >>>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
> >>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>>>> <andrew.rybchenko@oktetlabs.ru>
> >>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>>>
> >>>>>> External email: Use caution opening links or attachments
> >>>>>>
> >>>>>>
> >>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>>>>>> We have been working on optimizing the latency of calls to
> >>>>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of
> >>>>>>>>>>>> the work requires changes in the implementation of
> >>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>>>>>>>> discussion regarding configuration restore.
> >>>>>>>>>>>>
> >>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
> >>>>>>>>>>>> .dev_start()
> >>>>>> callback:
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Before calling it:
> >>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>> - After calling it:
> >>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>>     - restore promiscuous config
> >>>>>>>>>>>>     - restore all multicast config
> >>>>>>>>>>>>
> >>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
> >>>>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
> >>>>>>>>>>>> MAC
> >>>>>> addresses.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
> >>>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
> >>>>>>>>>>>> .promiscuous_disable()
> >>>>>> callback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> All multicast config restore checks if all multicast mode is
> >>>>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
> >>>>>>>>>>>> .allmulticast_disable()
> >>>>>> callback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Callbacks are called directly in all of these cases, to
> >>>>>>>>>>>> bypass the checks for applying the same configuration, which
> >>>>>>>>>>>> exist in relevant
> >>>>>> APIs.
> >>>>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. rte_eth_dev_configure()
> >>>>>>>>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>>>>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>>>>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>>>>>>>> rte_eth_allmulticast_enable()
> >>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>
> >>>>>>>>>>>> Even though all configuration is available in dev->data after
> >>>>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
> >>>>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD
> >>>>>>>>>>>> must create/destroy new kernel flow rules and/or change netdev
> >> config).
> >>>>>>>>>>>>
> >>>>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(),
> >>>>>>>>>>>> so the following forced callbacks force additional
> >>>>>>>>>>>> communication with the kernel. The
> >>>>>>>>>>> same configuration is applied multiple times.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
> >>>>>>>>>>>> configuration was applied, but this would duplicate the
> >>>>>>>>>>>> functionality of the library (for example
> >>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
> >>>>>>>>>>>> dev->data->promiscuous is set).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Question: Since all of the configuration is available before
> >>>>>>>>>>>> .dev_start() callback is called, why ethdev library does not
> >>>>>>>>>>>> expect .dev_start() to
> >>>>>>>>>>> take this configuration into account?
> >>>>>>>>>>>> In other words, why library has to reapply the configuration?
> >>>>>>>>>>>>
> >>>>>>>>>>>> I could not find any particular reason why configuration
> >>>>>>>>>>>> restore exists as part of the process (it was in the initial DPDK
> >> commit).
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> My assumption is .dev_stop() cause these values reset in some
> >>>>>>>>>>> devices, so
> >>>>>>>>>>> .dev_start() restores them back.
> >>>>>>>>>>> @Bruce or @Konstantin may remember the history.
> >>>>>>>>>
> >>>>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset()
> >>>>>>>>> ad
> >>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
> >>>>>>>>> clean
> >>>>>>>>> (known)
> >>>>>> state.
> >>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> But I agree this is device specific behavior, and can be
> >>>>>>>>>>> managed by what device requires.
> >>>>>>>>>
> >>>>>>>>> Probably yes.
> >>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> The patches included in this RFC, propose a mechanism which
> >>>>>>>>>>>> would help with managing which drivers rely on forceful
> >>>>>>>>>>>> configuration
> >>>> restore.
> >>>>>>>>>>>> Drivers could advertise if forceful configuration restore is
> >>>>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
> >>>>>>>>>>>> this flag is set, then the driver in question requires ethdev
> >>>>>>>>>>>> to forcefully restore
> >>>>>>>>>>> configuration.
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
> >>>>>>>>>>> this is all dpdk internal.
> >>>>>>>>>>>
> >>>>>>>>>>> What about to have a dedicated flag for it? We can have a
> >>>>>>>>>>> dedicated set of flag values for restore.
> >>>>>>>>>>
> >>>>>>>>>> Agreed. What do you think about the following?
> >>>>>>>>>
> >>>>>>>>> Instead of exposing that, can we probably make it transparent to
> >>>>>>>>> the user and probably ethdev layer too?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> +1 to make it transparent to user, but not sure if we can make it
> >>>>>>>> transparent to ethdev layer.
> >>>>>>>
> >>>>>>> Just to be clear:
> >>>>>>> Let say, using example from above:
> >>>>>>>
> >>>>>>>  rte_eth_dev_start()
> >>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>
> >>>>>>> We probably can introduce ethdev internal function (still visible
> >>>>>>> to
> >>>>>>> PMDs) that would do last 3 steps:
> >>>>>>> ethdev_replay_user_conf(...)
> >>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>
> >>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
> >>>>>>> So it will become:
> >>>>>>> rte_eth_dev_start()
> >>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>       -Call ethdev_replay_user_conf(.)
> >>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>               - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>               -Call dev->dev_ops->allmulticast_enable()
> >>>>>>>
> >>>>>>> For PMDs that do need to restore user provided config And
> >>>>>>> rte_eth_dev_start()
> >>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>
> >>>>>>> For those who do not.
> >>>>>>>
> >>>>>>
> >>>>>> OK, got it what you mean.
> >>>>>> Pushing restore functionality to PMDs works, but this may be doing
> >>>>>> redundant work on each PMD.
> >>>>>>
> >>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
> >>>>>> restore and common code in ethdev does the work.
> >>>>>> My below dedicated data struct comment is to have this flag in a
> >>>>>> new struct, overall like following:
> >>>>>>
> >>>>>> rte_eth_dev_start()
> >>>>>>    - Call dev->dev_ops->dev_start()
> >>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
> >>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> >>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> >>>>>>    - ...
> >>>>>
> >>>>> Could you please explain what is the benefit of exposing flags
> >>>>> through dev_ops
> >>>> callback vs a dedicated flags field in rte_eth_dev_data?
> >>>>> In both solutions:
> >>>>> - config restore is transparent to the user,
> >>>>> - drivers can omit config restore (either by not implementing the
> >>>>> callback or not providing the flags),
> >>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
> >>>>>
> >>>>> I understand that my initial proposal with "internal_flags" was too
> >>>>> vague, but renaming and splitting this field into:
> >>>>>
> >>>>> - dev_start_restore_flags
> >>>>> - dev_reset_restore_flags
> >>>>> - and so on...
> >>>>>
> >>>>> seems sufficient, at least in my opinion.
> >>>>>
> >>>>
> >>>> Hi Dariusz,
> >>>>
> >>>> Putting flags to rte_eth_dev_data works, and it is easier since there
> >>>> is direct access from rte_eth_dev to rte_eth_dev_data, so you don't
> >>>> need new dev_ops. So this is a valid option.
> >>>>
> >>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> >>>>
> >>>> "struct rte_eth_dev_data" is integral data structure for ethdev and
> >>>> it is used in multiple locations, mostly related to the datapath and
> >>>> all drivers needs to deal with fields of this struct.
> >>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> >>>>
> >>>> I want to protect "struct rte_eth_dev_data" from noise as much as
> >>>> possible, though what is noise is not always that clear.
> >>>>
> >>>> This restore flag is not critical, and I expect most of the drivers
> >>>> won't care and populate this restore flag at all. That is why to me
> >>>> it is better have dedicated struct for it and only drivers care about restore
> >> feature know it.
> >>>
> >>> I see. Thank you very much for the explanation.
> >>>
> >>> In this case, it looks like adding this to dev_ops is the way to go.
> >>>
> >>> So, summarizing it all:
> >>>
> >>> 1. dev_ops should be extended with a callback with the following signature and
> >> enums/flags:
> >>>
> >>> enum rte_eth_dev_operation op {
> >>>       RTE_ETH_START,
> >>>       RTE_ETH_STOP,
> >>>       RTE_ETH_RESET,
> >>> };
> >>>
> >>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
> >>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
> >> RTE_ETH_RESTORE_ALLMULTI
> >>> RTE_BIT32(2)
> >>>
> >>> void (*get_restore_flags)(
> >>>       struct rte_eth_dev *dev,
> >>>       enum rte_eth_dev_operation op,
> >>>       uint32_t *flags);
> >>>
> >>> 2. rte_eth_dev_start() will work as follows:
> >>>
> >>> - Call dev->dev_ops->dev_start()
> >>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback
> >> is not provided, assume flags == 0.
> >>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> >>> - and so on...
> >>>
> >>
> >> All above looks good.
> >>
> >>> Also, I would like to add the following:
> >>>
> >>> 3. Patchset introducing this change should add get_restore_flags()
> >> implementation to all drivers, which informs that all config should be restored.
> >>> This would preserve the current behavior.
> >>> Later, this could be refined driver by driver.
> >>>
> >>> What do you think?
> >>>
> >>
> >> What you are saying is correct, but I suspect most of the drivers don't really need
> >> this restore, but they have it since it was in the ethdev layer.
> >>
> >> If we introduce back restore via get_restore_flags(), it may stay as it is in drivers, at
> >> least for most of them.
> >>
> >> What do you think to risk breaking stuff for this case.
> >>
> >> So don't implement this in the drivers by default, so who needs it will recognize
> >> the issue and will implement it. If we merge this patch for -rc1, it gives enough
> >> time for drivers to detect the issue and fix it.
> >
> > It seems rather too risky, especially considering that for example, there are a few Intel drivers which do not have maintainers (like
> i40e).
> > So, I don't know what will happen to such drivers. They may be left broken (if they are affected) for 24.11 and future releases.
> > But I agree that if default behavior is preserved, this dependence of drivers on config restore might stay as is.
> > I'm on the fence about it.
> >
> 
> Yeah, not sure.
> 
> Do you think the dev_ops function can be implemented in the
> 'ethdev_driver.c' and all drivers use exact same function?
> So this reduces changes and duplication in drivers while preserving the
> behavior.

Wonder why it can't be done visa-versa?
If this new function is not implemented by PMD, then simply assume
that everything needs to be restored (as it happens now)? 

> >>
> >> Only we may implement this to the drivers that exist when this restore code was
> >> introduced.
> >> I mean whatever driver exist in the initial DPDK commit, implement this logic only
> >> to those drivers.
> >
> > Seems reasonable to me. In this case, it would be igb (IIUC, now it's named e1000) and ixgbe.
> >
> >>
> >>> Also, there's an open question about 'stop' and 'reset' operations.
> >>> At the moment, ethdev layer does not do any config manipulation during these
> >> operations.
> >>> Maybe we should limit get_restore_flags() to 'start' only?
> >>>
> >>
> >> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
> >> as a placeholder for later possible usages.
> >
> > Ack
> >
> >>
> >>>>
> >>>>
> >>>>
> >>>>>>
> >>>>>> So PMDs only will provide what to restore with an internal API and
> >>>>>> common ethdev layer will restore it.
> >>>>>> If no restore required PMD may not implement .get_restore_flags() at all.
> >>>>>>
> >>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
> >>>>>> to internal API to get what to restore in different states...
> >>>>>>
> >>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >>>>>>>> confusing and open to interpretation what to use it for and by
> >>>>>>>> time become source of defect.
> >>>>>>>
> >>>>>>> Yes, same thoughts.
> >>>>>>>
> >>>>>>>> Instead what do you think to have a separate, dedicated data struct for it?
> >>>>>>>
> >>>>>>> Hmm... not sure I understood you here...
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> Might be we can move this restoration code into the new ethdev
> >>>>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
> >>>>>>>>> can invoke
> >>>>>> during its dev_start() if needed?
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> >>>> RTE_BIT32(0)
> >>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> >>>> RTE_BIT32(1)
> >>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> >>>>>> RTE_BIT32(2)
> >>>>>>>>>>
> >>>>>>>>>> struct rte_eth_dev_data {
> >>>>>>>>>>    /* snip */
> >>>>>>>>>>
> >>>>>>>>>>    uint32_t dev_flags;
> >>>>>>>>>>
> >>>>>>>>>>    /**
> >>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
> >>>>>>>>>>     * Certain functionalities provided by the library might
> >>>> enabled/disabled,
> >>>>>>>>>>     * based on driver exposing certain capabilities.
> >>>>>>>>>>     */
> >>>>>>>>>>    uint32_t internal_flags;
> >>>>>>>>>>
> >>>>>>>>>>    /* snip */
> >>>>>>>>>> };
> >>>>>>>>>>
> >>>>>>>>>>> Also perhaps we have go into details what needs to be restored
> >>>>>>>>>>> after 'stop' and what needs to be restored after 'reset' and
> >>>>>>>>>>> use similar
> >>>>>> mechanism etc...
> >>>>>>>>>>
> >>>>>>>>>> I think we should look into that.
> >>>>>>>>>> Any 'codification' of semantics between drivers and ethdev
> >>>>>>>>>> library is good in
> >>>>>> my opinion.
> >>>>>>>>>>
> >>>>>>>>>> At least right now, ethdev does not change any configuration in
> >>>>>>>>>> 'stop' and
> >>>>>> 'reset' from what I see.
> >>>>>>>>>> But that's on library side only.
> >>>>>>>>>>
> >>>>>>>>>>>> This way, if we would conclude that it makes sense for
> >>>>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
> >>>>>>>>>>>> could track which drivers still rely
> >>>>>>>>>>> on configuration restore.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Dariusz Sosnowski (4):
> >>>>>>>>>>>>   ethdev: rework config restore
> >>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>>>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>>>>>>>
> >>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> 2.39.5
> >>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC 0/4] ethdev: rework config restore
  2024-10-10 22:58                         ` Konstantin Ananyev
@ 2024-10-11  0:02                           ` Ferruh Yigit
  2024-10-11  8:23                             ` Dariusz Sosnowski
  2024-10-11  8:29                             ` Konstantin Ananyev
  0 siblings, 2 replies; 23+ messages in thread
From: Ferruh Yigit @ 2024-10-11  0:02 UTC (permalink / raw)
  To: Konstantin Ananyev, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

On 10/10/2024 11:58 PM, Konstantin Ananyev wrote:
> 
> 
> config restore
>>>>
>>>> External email: Use caution opening links or attachments
>>>>
>>>>
>>>> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
>>>>>> -----Original Message-----
>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>>>> Sent: Thursday, October 10, 2024 01:17
>>>>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
>>>>>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
>>>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>>>>>
>>>>>> External email: Use caution opening links or attachments
>>>>>>
>>>>>>
>>>>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>>>>>>>> Sent: Wednesday, October 9, 2024 03:08
>>>>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
>>>>>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
>>>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
>>>>>>>> <andrew.rybchenko@oktetlabs.ru>
>>>>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
>>>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
>>>>>>>>
>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>>>>> We have been working on optimizing the latency of calls to
>>>>>>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of
>>>>>>>>>>>>>> the work requires changes in the implementation of
>>>>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
>>>>>>>>>>>>>> discussion regarding configuration restore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
>>>>>>>>>>>>>> .dev_start()
>>>>>>>> callback:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Before calling it:
>>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
>>>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>>>>>> - After calling it:
>>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
>>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>>>>>>>>>>>>>>     - restore promiscuous config
>>>>>>>>>>>>>>     - restore all multicast config
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
>>>>>>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
>>>>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
>>>>>>>>>>>>>> MAC
>>>>>>>> addresses.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
>>>>>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
>>>>>>>>>>>>>> .promiscuous_disable()
>>>>>>>> callback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> All multicast config restore checks if all multicast mode is
>>>>>>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
>>>>>>>>>>>>>> .allmulticast_disable()
>>>>>>>> callback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Callbacks are called directly in all of these cases, to
>>>>>>>>>>>>>> bypass the checks for applying the same configuration, which
>>>>>>>>>>>>>> exist in relevant
>>>>>>>> APIs.
>>>>>>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let's consider what happens in the following sequence of API calls.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. rte_eth_dev_configure()
>>>>>>>>>>>>>> 2. rte_eth_tx_queue_setup()
>>>>>>>>>>>>>> 3. rte_eth_rx_queue_setup()
>>>>>>>>>>>>>> 4. rte_eth_promiscuous_enable()
>>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
>>>>>>>>>>>>>> rte_eth_allmulticast_enable()
>>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
>>>>>>>>>>>>>> rte_eth_dev_start()
>>>>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
>>>>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Even though all configuration is available in dev->data after
>>>>>>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
>>>>>>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD
>>>>>>>>>>>>>> must create/destroy new kernel flow rules and/or change netdev
>>>> config).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(),
>>>>>>>>>>>>>> so the following forced callbacks force additional
>>>>>>>>>>>>>> communication with the kernel. The
>>>>>>>>>>>>> same configuration is applied multiple times.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
>>>>>>>>>>>>>> configuration was applied, but this would duplicate the
>>>>>>>>>>>>>> functionality of the library (for example
>>>>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
>>>>>>>>>>>>>> dev->data->promiscuous is set).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Question: Since all of the configuration is available before
>>>>>>>>>>>>>> .dev_start() callback is called, why ethdev library does not
>>>>>>>>>>>>>> expect .dev_start() to
>>>>>>>>>>>>> take this configuration into account?
>>>>>>>>>>>>>> In other words, why library has to reapply the configuration?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I could not find any particular reason why configuration
>>>>>>>>>>>>>> restore exists as part of the process (it was in the initial DPDK
>>>> commit).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> My assumption is .dev_stop() cause these values reset in some
>>>>>>>>>>>>> devices, so
>>>>>>>>>>>>> .dev_start() restores them back.
>>>>>>>>>>>>> @Bruce or @Konstantin may remember the history.
>>>>>>>>>>>
>>>>>>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset()
>>>>>>>>>>> ad
>>>>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
>>>>>>>>>>> clean
>>>>>>>>>>> (known)
>>>>>>>> state.
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> But I agree this is device specific behavior, and can be
>>>>>>>>>>>>> managed by what device requires.
>>>>>>>>>>>
>>>>>>>>>>> Probably yes.
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The patches included in this RFC, propose a mechanism which
>>>>>>>>>>>>>> would help with managing which drivers rely on forceful
>>>>>>>>>>>>>> configuration
>>>>>> restore.
>>>>>>>>>>>>>> Drivers could advertise if forceful configuration restore is
>>>>>>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
>>>>>>>>>>>>>> this flag is set, then the driver in question requires ethdev
>>>>>>>>>>>>>> to forcefully restore
>>>>>>>>>>>>> configuration.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
>>>>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
>>>>>>>>>>>>> this is all dpdk internal.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What about to have a dedicated flag for it? We can have a
>>>>>>>>>>>>> dedicated set of flag values for restore.
>>>>>>>>>>>>
>>>>>>>>>>>> Agreed. What do you think about the following?
>>>>>>>>>>>
>>>>>>>>>>> Instead of exposing that, can we probably make it transparent to
>>>>>>>>>>> the user and probably ethdev layer too?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> +1 to make it transparent to user, but not sure if we can make it
>>>>>>>>>> transparent to ethdev layer.
>>>>>>>>>
>>>>>>>>> Just to be clear:
>>>>>>>>> Let say, using example from above:
>>>>>>>>>
>>>>>>>>>  rte_eth_dev_start()
>>>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>
>>>>>>>>> We probably can introduce ethdev internal function (still visible
>>>>>>>>> to
>>>>>>>>> PMDs) that would do last 3 steps:
>>>>>>>>> ethdev_replay_user_conf(...)
>>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>
>>>>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
>>>>>>>>> So it will become:
>>>>>>>>> rte_eth_dev_start()
>>>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>>>       -Call ethdev_replay_user_conf(.)
>>>>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>>>>>>>>>               - Call dev->dev_ops->promiscuous_enable()
>>>>>>>>>               -Call dev->dev_ops->allmulticast_enable()
>>>>>>>>>
>>>>>>>>> For PMDs that do need to restore user provided config And
>>>>>>>>> rte_eth_dev_start()
>>>>>>>>>      - Call dev->dev_ops->dev_start()
>>>>>>>>>
>>>>>>>>> For those who do not.
>>>>>>>>>
>>>>>>>>
>>>>>>>> OK, got it what you mean.
>>>>>>>> Pushing restore functionality to PMDs works, but this may be doing
>>>>>>>> redundant work on each PMD.
>>>>>>>>
>>>>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
>>>>>>>> restore and common code in ethdev does the work.
>>>>>>>> My below dedicated data struct comment is to have this flag in a
>>>>>>>> new struct, overall like following:
>>>>>>>>
>>>>>>>> rte_eth_dev_start()
>>>>>>>>    - Call dev->dev_ops->dev_start()
>>>>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
>>>>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
>>>>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
>>>>>>>>    - ...
>>>>>>>
>>>>>>> Could you please explain what is the benefit of exposing flags
>>>>>>> through dev_ops
>>>>>> callback vs a dedicated flags field in rte_eth_dev_data?
>>>>>>> In both solutions:
>>>>>>> - config restore is transparent to the user,
>>>>>>> - drivers can omit config restore (either by not implementing the
>>>>>>> callback or not providing the flags),
>>>>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
>>>>>>>
>>>>>>> I understand that my initial proposal with "internal_flags" was too
>>>>>>> vague, but renaming and splitting this field into:
>>>>>>>
>>>>>>> - dev_start_restore_flags
>>>>>>> - dev_reset_restore_flags
>>>>>>> - and so on...
>>>>>>>
>>>>>>> seems sufficient, at least in my opinion.
>>>>>>>
>>>>>>
>>>>>> Hi Dariusz,
>>>>>>
>>>>>> Putting flags to rte_eth_dev_data works, and it is easier since there
>>>>>> is direct access from rte_eth_dev to rte_eth_dev_data, so you don't
>>>>>> need new dev_ops. So this is a valid option.
>>>>>>
>>>>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
>>>>>>
>>>>>> "struct rte_eth_dev_data" is integral data structure for ethdev and
>>>>>> it is used in multiple locations, mostly related to the datapath and
>>>>>> all drivers needs to deal with fields of this struct.
>>>>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
>>>>>>
>>>>>> I want to protect "struct rte_eth_dev_data" from noise as much as
>>>>>> possible, though what is noise is not always that clear.
>>>>>>
>>>>>> This restore flag is not critical, and I expect most of the drivers
>>>>>> won't care and populate this restore flag at all. That is why to me
>>>>>> it is better have dedicated struct for it and only drivers care about restore
>>>> feature know it.
>>>>>
>>>>> I see. Thank you very much for the explanation.
>>>>>
>>>>> In this case, it looks like adding this to dev_ops is the way to go.
>>>>>
>>>>> So, summarizing it all:
>>>>>
>>>>> 1. dev_ops should be extended with a callback with the following signature and
>>>> enums/flags:
>>>>>
>>>>> enum rte_eth_dev_operation op {
>>>>>       RTE_ETH_START,
>>>>>       RTE_ETH_STOP,
>>>>>       RTE_ETH_RESET,
>>>>> };
>>>>>
>>>>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
>>>>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
>>>> RTE_ETH_RESTORE_ALLMULTI
>>>>> RTE_BIT32(2)
>>>>>
>>>>> void (*get_restore_flags)(
>>>>>       struct rte_eth_dev *dev,
>>>>>       enum rte_eth_dev_operation op,
>>>>>       uint32_t *flags);
>>>>>
>>>>> 2. rte_eth_dev_start() will work as follows:
>>>>>
>>>>> - Call dev->dev_ops->dev_start()
>>>>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback
>>>> is not provided, assume flags == 0.
>>>>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
>>>>> - and so on...
>>>>>
>>>>
>>>> All above looks good.
>>>>
>>>>> Also, I would like to add the following:
>>>>>
>>>>> 3. Patchset introducing this change should add get_restore_flags()
>>>> implementation to all drivers, which informs that all config should be restored.
>>>>> This would preserve the current behavior.
>>>>> Later, this could be refined driver by driver.
>>>>>
>>>>> What do you think?
>>>>>
>>>>
>>>> What you are saying is correct, but I suspect most of the drivers don't really need
>>>> this restore, but they have it since it was in the ethdev layer.
>>>>
>>>> If we introduce back restore via get_restore_flags(), it may stay as it is in drivers, at
>>>> least for most of them.
>>>>
>>>> What do you think to risk breaking stuff for this case.
>>>>
>>>> So don't implement this in the drivers by default, so who needs it will recognize
>>>> the issue and will implement it. If we merge this patch for -rc1, it gives enough
>>>> time for drivers to detect the issue and fix it.
>>>
>>> It seems rather too risky, especially considering that for example, there are a few Intel drivers which do not have maintainers (like
>> i40e).
>>> So, I don't know what will happen to such drivers. They may be left broken (if they are affected) for 24.11 and future releases.
>>> But I agree that if default behavior is preserved, this dependence of drivers on config restore might stay as is.
>>> I'm on the fence about it.
>>>
>>
>> Yeah, not sure.
>>
>> Do you think the dev_ops function can be implemented in the
>> 'ethdev_driver.c' and all drivers use exact same function?
>> So this reduces changes and duplication in drivers while preserving the
>> behavior.
> 
> Wonder why it can't be done visa-versa?
> If this new function is not implemented by PMD, then simply assume
> that everything needs to be restored (as it happens now)? 
> 

True, this preserve the behavior and prevents updating all drivers.

I think can be done in two ways:
1. dev_ops() tells what NOT to restore, if dev_ops() not implemented or
it returns zero, restore everything. Here what not nice is what to
restore in this case is not very well defined.

2. dev_ops() tells what to restore, but when not implemented default
flag value is all restore. Which is current behavior.


I am for option 2. and I assume that is what Konstantin also suggest,
but I want to clarify to be sure.


>>>>
>>>> Only we may implement this to the drivers that exist when this restore code was
>>>> introduced.
>>>> I mean whatever driver exist in the initial DPDK commit, implement this logic only
>>>> to those drivers.
>>>
>>> Seems reasonable to me. In this case, it would be igb (IIUC, now it's named e1000) and ixgbe.
>>>
>>>>
>>>>> Also, there's an open question about 'stop' and 'reset' operations.
>>>>> At the moment, ethdev layer does not do any config manipulation during these
>>>> operations.
>>>>> Maybe we should limit get_restore_flags() to 'start' only?
>>>>>
>>>>
>>>> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
>>>> as a placeholder for later possible usages.
>>>
>>> Ack
>>>
>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> So PMDs only will provide what to restore with an internal API and
>>>>>>>> common ethdev layer will restore it.
>>>>>>>> If no restore required PMD may not implement .get_restore_flags() at all.
>>>>>>>>
>>>>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
>>>>>>>> to internal API to get what to restore in different states...
>>>>>>>>
>>>>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
>>>>>>>>>> confusing and open to interpretation what to use it for and by
>>>>>>>>>> time become source of defect.
>>>>>>>>>
>>>>>>>>> Yes, same thoughts.
>>>>>>>>>
>>>>>>>>>> Instead what do you think to have a separate, dedicated data struct for it?
>>>>>>>>>
>>>>>>>>> Hmm... not sure I understood you here...
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Might be we can move this restoration code into the new ethdev
>>>>>>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
>>>>>>>>>>> can invoke
>>>>>>>> during its dev_start() if needed?
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
>>>>>> RTE_BIT32(0)
>>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
>>>>>> RTE_BIT32(1)
>>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
>>>>>>>> RTE_BIT32(2)
>>>>>>>>>>>>
>>>>>>>>>>>> struct rte_eth_dev_data {
>>>>>>>>>>>>    /* snip */
>>>>>>>>>>>>
>>>>>>>>>>>>    uint32_t dev_flags;
>>>>>>>>>>>>
>>>>>>>>>>>>    /**
>>>>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
>>>>>>>>>>>>     * Certain functionalities provided by the library might
>>>>>> enabled/disabled,
>>>>>>>>>>>>     * based on driver exposing certain capabilities.
>>>>>>>>>>>>     */
>>>>>>>>>>>>    uint32_t internal_flags;
>>>>>>>>>>>>
>>>>>>>>>>>>    /* snip */
>>>>>>>>>>>> };
>>>>>>>>>>>>
>>>>>>>>>>>>> Also perhaps we have go into details what needs to be restored
>>>>>>>>>>>>> after 'stop' and what needs to be restored after 'reset' and
>>>>>>>>>>>>> use similar
>>>>>>>> mechanism etc...
>>>>>>>>>>>>
>>>>>>>>>>>> I think we should look into that.
>>>>>>>>>>>> Any 'codification' of semantics between drivers and ethdev
>>>>>>>>>>>> library is good in
>>>>>>>> my opinion.
>>>>>>>>>>>>
>>>>>>>>>>>> At least right now, ethdev does not change any configuration in
>>>>>>>>>>>> 'stop' and
>>>>>>>> 'reset' from what I see.
>>>>>>>>>>>> But that's on library side only.
>>>>>>>>>>>>
>>>>>>>>>>>>>> This way, if we would conclude that it makes sense for
>>>>>>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
>>>>>>>>>>>>>> could track which drivers still rely
>>>>>>>>>>>>> on configuration restore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dariusz Sosnowski (4):
>>>>>>>>>>>>>>   ethdev: rework config restore
>>>>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
>>>>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
>>>>>>>>>>>>>>   ethdev: omit MAC address restore if not required
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
>>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
>>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
>>>>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> 2.39.5
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-11  0:02                           ` Ferruh Yigit
@ 2024-10-11  8:23                             ` Dariusz Sosnowski
  2024-10-11  8:29                             ` Konstantin Ananyev
  1 sibling, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-11  8:23 UTC (permalink / raw)
  To: Ferruh Yigit, Konstantin Ananyev,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, October 11, 2024 02:03
> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> Subject: Re: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/10/2024 11:58 PM, Konstantin Ananyev wrote:
> >
> >
> > config restore
> >>>>
> >>>> External email: Use caution opening links or attachments
> >>>>
> >>>>
> >>>> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>>>> Sent: Thursday, October 10, 2024 01:17
> >>>>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> >>>>>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
> >>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>>>> <andrew.rybchenko@oktetlabs.ru>
> >>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>>>
> >>>>>> External email: Use caution opening links or attachments
> >>>>>>
> >>>>>>
> >>>>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>>>>>> Sent: Wednesday, October 9, 2024 03:08
> >>>>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
> >>>>>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
> >>>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>>>>>> <andrew.rybchenko@oktetlabs.ru>
> >>>>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>>>>>
> >>>>>>>> External email: Use caution opening links or attachments
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>>> We have been working on optimizing the latency of calls
> >>>>>>>>>>>>>> to rte_eth_dev_start(), on ports spawned by mlx5 PMD.
> >>>>>>>>>>>>>> Most of the work requires changes in the implementation
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>>>>>>>>>> discussion regarding configuration restore.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
> >>>>>>>>>>>>>> .dev_start()
> >>>>>>>> callback:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> - Before calling it:
> >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>>>> - After calling it:
> >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>>>>     - restore promiscuous config
> >>>>>>>>>>>>>>     - restore all multicast config
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC
> >>>>>>>>>>>>>> addresses - stored in rte_eth_dev_data.mac_addrs array -
> >>>>>>>>>>>>>> and calls
> >>>>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply
> >>>>>>>>>>>>>> these MAC
> >>>>>>>> addresses.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
> >>>>>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
> >>>>>>>>>>>>>> .promiscuous_disable()
> >>>>>>>> callback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> All multicast config restore checks if all multicast mode
> >>>>>>>>>>>>>> is enabled or not, and calls .allmulticast_enable() or
> >>>>>>>>>>>>>> .allmulticast_disable()
> >>>>>>>> callback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Callbacks are called directly in all of these cases, to
> >>>>>>>>>>>>>> bypass the checks for applying the same configuration,
> >>>>>>>>>>>>>> which exist in relevant
> >>>>>>>> APIs.
> >>>>>>>>>>>>>> Checks are bypassed to force drivers to reapply the
> configuration.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Let's consider what happens in the following sequence of API
> calls.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. rte_eth_dev_configure() 2. rte_eth_tx_queue_setup() 3.
> >>>>>>>>>>>>>> rte_eth_rx_queue_setup() 4. rte_eth_promiscuous_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>>>>>>>>>> rte_eth_allmulticast_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>>>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC
> address
> >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Even though all configuration is available in dev->data
> >>>>>>>>>>>>>> after step 5, library forces reapplying this configuration in step 6.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require
> >>>>>>>>>>>>>> communication with the kernel driver, to configure the
> >>>>>>>>>>>>>> device (mlx5 PMD must create/destroy new kernel flow
> >>>>>>>>>>>>>> rules and/or change netdev
> >>>> config).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> mlx5 PMD handles applying all configuration in
> >>>>>>>>>>>>>> .dev_start(), so the following forced callbacks force
> >>>>>>>>>>>>>> additional communication with the kernel. The
> >>>>>>>>>>>>> same configuration is applied multiple times.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
> >>>>>>>>>>>>>> configuration was applied, but this would duplicate the
> >>>>>>>>>>>>>> functionality of the library (for example
> >>>>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
> >>>>>>>>>>>>>> dev->data->promiscuous is set).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Question: Since all of the configuration is available
> >>>>>>>>>>>>>> before
> >>>>>>>>>>>>>> .dev_start() callback is called, why ethdev library does
> >>>>>>>>>>>>>> not expect .dev_start() to
> >>>>>>>>>>>>> take this configuration into account?
> >>>>>>>>>>>>>> In other words, why library has to reapply the configuration?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I could not find any particular reason why configuration
> >>>>>>>>>>>>>> restore exists as part of the process (it was in the
> >>>>>>>>>>>>>> initial DPDK
> >>>> commit).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My assumption is .dev_stop() cause these values reset in
> >>>>>>>>>>>>> some devices, so
> >>>>>>>>>>>>> .dev_start() restores them back.
> >>>>>>>>>>>>> @Bruce or @Konstantin may remember the history.
> >>>>>>>>>>>
> >>>>>>>>>>> Yep, as I remember, at least some Intel PMDs calling
> >>>>>>>>>>> hw_reset() ad
> >>>>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
> >>>>>>>>>>> clean
> >>>>>>>>>>> (known)
> >>>>>>>> state.
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But I agree this is device specific behavior, and can be
> >>>>>>>>>>>>> managed by what device requires.
> >>>>>>>>>>>
> >>>>>>>>>>> Probably yes.
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> The patches included in this RFC, propose a mechanism
> >>>>>>>>>>>>>> which would help with managing which drivers rely on
> >>>>>>>>>>>>>> forceful configuration
> >>>>>> restore.
> >>>>>>>>>>>>>> Drivers could advertise if forceful configuration restore
> >>>>>>>>>>>>>> is needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device
> >>>>>>>>>>>>>> flag. If this flag is set, then the driver in question
> >>>>>>>>>>>>>> requires ethdev to forcefully restore
> >>>>>>>>>>>>> configuration.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info-
> >dev_flags'
> >>>>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user
> >>>>>>>>>>>>> and this is all dpdk internal.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What about to have a dedicated flag for it? We can have a
> >>>>>>>>>>>>> dedicated set of flag values for restore.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Agreed. What do you think about the following?
> >>>>>>>>>>>
> >>>>>>>>>>> Instead of exposing that, can we probably make it
> >>>>>>>>>>> transparent to the user and probably ethdev layer too?
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> +1 to make it transparent to user, but not sure if we can
> >>>>>>>>>> +make it
> >>>>>>>>>> transparent to ethdev layer.
> >>>>>>>>>
> >>>>>>>>> Just to be clear:
> >>>>>>>>> Let say, using example from above:
> >>>>>>>>>
> >>>>>>>>>  rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> We probably can introduce ethdev internal function (still
> >>>>>>>>> visible to
> >>>>>>>>> PMDs) that would do last 3 steps:
> >>>>>>>>> ethdev_replay_user_conf(...)
> >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or
> not.
> >>>>>>>>> So it will become:
> >>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>       -Call ethdev_replay_user_conf(.)
> >>>>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC
> address
> >>>>>>>>>               - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>               -Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> For PMDs that do need to restore user provided config And
> >>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>
> >>>>>>>>> For those who do not.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK, got it what you mean.
> >>>>>>>> Pushing restore functionality to PMDs works, but this may be
> >>>>>>>> doing redundant work on each PMD.
> >>>>>>>>
> >>>>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to
> >>>>>>>> what to restore and common code in ethdev does the work.
> >>>>>>>> My below dedicated data struct comment is to have this flag in
> >>>>>>>> a new struct, overall like following:
> >>>>>>>>
> >>>>>>>> rte_eth_dev_start()
> >>>>>>>>    - Call dev->dev_ops->dev_start()
> >>>>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START,
> &flags)
> >>>>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> >>>>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> >>>>>>>>    - ...
> >>>>>>>
> >>>>>>> Could you please explain what is the benefit of exposing flags
> >>>>>>> through dev_ops
> >>>>>> callback vs a dedicated flags field in rte_eth_dev_data?
> >>>>>>> In both solutions:
> >>>>>>> - config restore is transparent to the user,
> >>>>>>> - drivers can omit config restore (either by not implementing
> >>>>>>> the callback or not providing the flags),
> >>>>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
> >>>>>>>
> >>>>>>> I understand that my initial proposal with "internal_flags" was
> >>>>>>> too vague, but renaming and splitting this field into:
> >>>>>>>
> >>>>>>> - dev_start_restore_flags
> >>>>>>> - dev_reset_restore_flags
> >>>>>>> - and so on...
> >>>>>>>
> >>>>>>> seems sufficient, at least in my opinion.
> >>>>>>>
> >>>>>>
> >>>>>> Hi Dariusz,
> >>>>>>
> >>>>>> Putting flags to rte_eth_dev_data works, and it is easier since
> >>>>>> there is direct access from rte_eth_dev to rte_eth_dev_data, so
> >>>>>> you don't need new dev_ops. So this is a valid option.
> >>>>>>
> >>>>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> >>>>>>
> >>>>>> "struct rte_eth_dev_data" is integral data structure for ethdev
> >>>>>> and it is used in multiple locations, mostly related to the
> >>>>>> datapath and all drivers needs to deal with fields of this struct.
> >>>>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> >>>>>>
> >>>>>> I want to protect "struct rte_eth_dev_data" from noise as much as
> >>>>>> possible, though what is noise is not always that clear.
> >>>>>>
> >>>>>> This restore flag is not critical, and I expect most of the
> >>>>>> drivers won't care and populate this restore flag at all. That is
> >>>>>> why to me it is better have dedicated struct for it and only
> >>>>>> drivers care about restore
> >>>> feature know it.
> >>>>>
> >>>>> I see. Thank you very much for the explanation.
> >>>>>
> >>>>> In this case, it looks like adding this to dev_ops is the way to go.
> >>>>>
> >>>>> So, summarizing it all:
> >>>>>
> >>>>> 1. dev_ops should be extended with a callback with the following
> >>>>> signature and
> >>>> enums/flags:
> >>>>>
> >>>>> enum rte_eth_dev_operation op {
> >>>>>       RTE_ETH_START,
> >>>>>       RTE_ETH_STOP,
> >>>>>       RTE_ETH_RESET,
> >>>>> };
> >>>>>
> >>>>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
> >>>>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
> >>>> RTE_ETH_RESTORE_ALLMULTI
> >>>>> RTE_BIT32(2)
> >>>>>
> >>>>> void (*get_restore_flags)(
> >>>>>       struct rte_eth_dev *dev,
> >>>>>       enum rte_eth_dev_operation op,
> >>>>>       uint32_t *flags);
> >>>>>
> >>>>> 2. rte_eth_dev_start() will work as follows:
> >>>>>
> >>>>> - Call dev->dev_ops->dev_start()
> >>>>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START,
> >>>>> &flags). If callback
> >>>> is not provided, assume flags == 0.
> >>>>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> >>>>> - and so on...
> >>>>>
> >>>>
> >>>> All above looks good.
> >>>>
> >>>>> Also, I would like to add the following:
> >>>>>
> >>>>> 3. Patchset introducing this change should add get_restore_flags()
> >>>> implementation to all drivers, which informs that all config should be
> restored.
> >>>>> This would preserve the current behavior.
> >>>>> Later, this could be refined driver by driver.
> >>>>>
> >>>>> What do you think?
> >>>>>
> >>>>
> >>>> What you are saying is correct, but I suspect most of the drivers
> >>>> don't really need this restore, but they have it since it was in the ethdev
> layer.
> >>>>
> >>>> If we introduce back restore via get_restore_flags(), it may stay
> >>>> as it is in drivers, at least for most of them.
> >>>>
> >>>> What do you think to risk breaking stuff for this case.
> >>>>
> >>>> So don't implement this in the drivers by default, so who needs it
> >>>> will recognize the issue and will implement it. If we merge this
> >>>> patch for -rc1, it gives enough time for drivers to detect the issue and fix it.
> >>>
> >>> It seems rather too risky, especially considering that for example,
> >>> there are a few Intel drivers which do not have maintainers (like
> >> i40e).
> >>> So, I don't know what will happen to such drivers. They may be left broken (if
> they are affected) for 24.11 and future releases.
> >>> But I agree that if default behavior is preserved, this dependence of drivers
> on config restore might stay as is.
> >>> I'm on the fence about it.
> >>>
> >>
> >> Yeah, not sure.
> >>
> >> Do you think the dev_ops function can be implemented in the
> >> 'ethdev_driver.c' and all drivers use exact same function?
> >> So this reduces changes and duplication in drivers while preserving
> >> the behavior.
> >
> > Wonder why it can't be done visa-versa?
> > If this new function is not implemented by PMD, then simply assume
> > that everything needs to be restored (as it happens now)?
> >
> 
> True, this preserve the behavior and prevents updating all drivers.
> 
> I think can be done in two ways:
> 1. dev_ops() tells what NOT to restore, if dev_ops() not implemented or it returns
> zero, restore everything. Here what not nice is what to restore in this case is not
> very well defined.
> 
> 2. dev_ops() tells what to restore, but when not implemented default flag value is
> all restore. Which is current behavior.
> 
> 
> I am for option 2. and I assume that is what Konstantin also suggest, but I want to
> clarify to be sure.

+1 for option 2

> 
> 
> >>>>
> >>>> Only we may implement this to the drivers that exist when this
> >>>> restore code was introduced.
> >>>> I mean whatever driver exist in the initial DPDK commit, implement
> >>>> this logic only to those drivers.
> >>>
> >>> Seems reasonable to me. In this case, it would be igb (IIUC, now it's named
> e1000) and ixgbe.
> >>>
> >>>>
> >>>>> Also, there's an open question about 'stop' and 'reset' operations.
> >>>>> At the moment, ethdev layer does not do any config manipulation
> >>>>> during these
> >>>> operations.
> >>>>> Maybe we should limit get_restore_flags() to 'start' only?
> >>>>>
> >>>>
> >>>> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
> >>>> as a placeholder for later possible usages.
> >>>
> >>> Ack
> >>>
> >>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>> So PMDs only will provide what to restore with an internal API
> >>>>>>>> and common ethdev layer will restore it.
> >>>>>>>> If no restore required PMD may not implement .get_restore_flags() at
> all.
> >>>>>>>>
> >>>>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be
> >>>>>>>> provided to internal API to get what to restore in different states...
> >>>>>>>>
> >>>>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >>>>>>>>>> confusing and open to interpretation what to use it for and
> >>>>>>>>>> by time become source of defect.
> >>>>>>>>>
> >>>>>>>>> Yes, same thoughts.
> >>>>>>>>>
> >>>>>>>>>> Instead what do you think to have a separate, dedicated data struct
> for it?
> >>>>>>>>>
> >>>>>>>>> Hmm... not sure I understood you here...
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Might be we can move this restoration code into the new
> >>>>>>>>>>> ethdev helper function,(ethdevd_user_config_restore()  or
> >>>>>>>>>>> so) that PMD can invoke
> >>>>>>>> during its dev_start() if needed?
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> >>>>>> RTE_BIT32(0)
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> >>>>>> RTE_BIT32(1)
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> >>>>>>>> RTE_BIT32(2)
> >>>>>>>>>>>>
> >>>>>>>>>>>> struct rte_eth_dev_data {
> >>>>>>>>>>>>    /* snip */
> >>>>>>>>>>>>
> >>>>>>>>>>>>    uint32_t dev_flags;
> >>>>>>>>>>>>
> >>>>>>>>>>>>    /**
> >>>>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
> >>>>>>>>>>>>     * Certain functionalities provided by the library might
> >>>>>> enabled/disabled,
> >>>>>>>>>>>>     * based on driver exposing certain capabilities.
> >>>>>>>>>>>>     */
> >>>>>>>>>>>>    uint32_t internal_flags;
> >>>>>>>>>>>>
> >>>>>>>>>>>>    /* snip */
> >>>>>>>>>>>> };
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Also perhaps we have go into details what needs to be
> >>>>>>>>>>>>> restored after 'stop' and what needs to be restored after
> >>>>>>>>>>>>> 'reset' and use similar
> >>>>>>>> mechanism etc...
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think we should look into that.
> >>>>>>>>>>>> Any 'codification' of semantics between drivers and ethdev
> >>>>>>>>>>>> library is good in
> >>>>>>>> my opinion.
> >>>>>>>>>>>>
> >>>>>>>>>>>> At least right now, ethdev does not change any
> >>>>>>>>>>>> configuration in 'stop' and
> >>>>>>>> 'reset' from what I see.
> >>>>>>>>>>>> But that's on library side only.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> This way, if we would conclude that it makes sense for
> >>>>>>>>>>>>>> .dev_start() to handle all starting configuration
> >>>>>>>>>>>>>> aspects, we could track which drivers still rely
> >>>>>>>>>>>>> on configuration restore.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Dariusz Sosnowski (4):
> >>>>>>>>>>>>>>   ethdev: rework config restore
> >>>>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>>>>>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> 2.39.5
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-11  0:02                           ` Ferruh Yigit
  2024-10-11  8:23                             ` Dariusz Sosnowski
@ 2024-10-11  8:29                             ` Konstantin Ananyev
  2024-10-11  9:37                               ` Dariusz Sosnowski
  1 sibling, 1 reply; 23+ messages in thread
From: Konstantin Ananyev @ 2024-10-11  8:29 UTC (permalink / raw)
  To: Ferruh Yigit, Dariusz Sosnowski,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson

> >>>>
> >>>> External email: Use caution opening links or attachments
> >>>>
> >>>>
> >>>> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
> >>>>>> -----Original Message-----
> >>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>>>> Sent: Thursday, October 10, 2024 01:17
> >>>>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin Ananyev
> >>>>>> <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas Monjalon
> >>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>>>> <andrew.rybchenko@oktetlabs.ru>
> >>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>>>
> >>>>>> External email: Use caution opening links or attachments
> >>>>>>
> >>>>>>
> >>>>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> >>>>>>>> Sent: Wednesday, October 9, 2024 03:08
> >>>>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>; Dariusz
> >>>>>>>> Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon
> >>>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> >>>>>>>> <andrew.rybchenko@oktetlabs.ru>
> >>>>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> >>>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> >>>>>>>>
> >>>>>>>> External email: Use caution opening links or attachments
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>>> We have been working on optimizing the latency of calls to
> >>>>>>>>>>>>>> rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of
> >>>>>>>>>>>>>> the work requires changes in the implementation of
> >>>>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> >>>>>>>>>>>>>> discussion regarding configuration restore.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
> >>>>>>>>>>>>>> .dev_start()
> >>>>>>>> callback:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> - Before calling it:
> >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
> >>>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>>>> - After calling it:
> >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not support
> >>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> >>>>>>>>>>>>>>     - restore promiscuous config
> >>>>>>>>>>>>>>     - restore all multicast config
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC addresses -
> >>>>>>>>>>>>>> stored in rte_eth_dev_data.mac_addrs array - and calls
> >>>>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply these
> >>>>>>>>>>>>>> MAC
> >>>>>>>> addresses.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode is
> >>>>>>>>>>>>>> enabled or not, and calls .promiscuous_enable() or
> >>>>>>>>>>>>>> .promiscuous_disable()
> >>>>>>>> callback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> All multicast config restore checks if all multicast mode is
> >>>>>>>>>>>>>> enabled or not, and calls .allmulticast_enable() or
> >>>>>>>>>>>>>> .allmulticast_disable()
> >>>>>>>> callback.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Callbacks are called directly in all of these cases, to
> >>>>>>>>>>>>>> bypass the checks for applying the same configuration, which
> >>>>>>>>>>>>>> exist in relevant
> >>>>>>>> APIs.
> >>>>>>>>>>>>>> Checks are bypassed to force drivers to reapply the configuration.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Let's consider what happens in the following sequence of API calls.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. rte_eth_dev_configure()
> >>>>>>>>>>>>>> 2. rte_eth_tx_queue_setup()
> >>>>>>>>>>>>>> 3. rte_eth_rx_queue_setup()
> >>>>>>>>>>>>>> 4. rte_eth_promiscuous_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> >>>>>>>>>>>>>> rte_eth_allmulticast_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> >>>>>>>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Even though all configuration is available in dev->data after
> >>>>>>>>>>>>>> step 5, library forces reapplying this configuration in step 6.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require communication
> >>>>>>>>>>>>>> with the kernel driver, to configure the device (mlx5 PMD
> >>>>>>>>>>>>>> must create/destroy new kernel flow rules and/or change netdev
> >>>> config).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> mlx5 PMD handles applying all configuration in .dev_start(),
> >>>>>>>>>>>>>> so the following forced callbacks force additional
> >>>>>>>>>>>>>> communication with the kernel. The
> >>>>>>>>>>>>> same configuration is applied multiple times.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
> >>>>>>>>>>>>>> configuration was applied, but this would duplicate the
> >>>>>>>>>>>>>> functionality of the library (for example
> >>>>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver if
> >>>>>>>>>>>>>> dev->data->promiscuous is set).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Question: Since all of the configuration is available before
> >>>>>>>>>>>>>> .dev_start() callback is called, why ethdev library does not
> >>>>>>>>>>>>>> expect .dev_start() to
> >>>>>>>>>>>>> take this configuration into account?
> >>>>>>>>>>>>>> In other words, why library has to reapply the configuration?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I could not find any particular reason why configuration
> >>>>>>>>>>>>>> restore exists as part of the process (it was in the initial DPDK
> >>>> commit).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My assumption is .dev_stop() cause these values reset in some
> >>>>>>>>>>>>> devices, so
> >>>>>>>>>>>>> .dev_start() restores them back.
> >>>>>>>>>>>>> @Bruce or @Konstantin may remember the history.
> >>>>>>>>>>>
> >>>>>>>>>>> Yep, as I remember, at least some Intel PMDs calling hw_reset()
> >>>>>>>>>>> ad
> >>>>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in a
> >>>>>>>>>>> clean
> >>>>>>>>>>> (known)
> >>>>>>>> state.
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> But I agree this is device specific behavior, and can be
> >>>>>>>>>>>>> managed by what device requires.
> >>>>>>>>>>>
> >>>>>>>>>>> Probably yes.
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> The patches included in this RFC, propose a mechanism which
> >>>>>>>>>>>>>> would help with managing which drivers rely on forceful
> >>>>>>>>>>>>>> configuration
> >>>>>> restore.
> >>>>>>>>>>>>>> Drivers could advertise if forceful configuration restore is
> >>>>>>>>>>>>>> needed through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If
> >>>>>>>>>>>>>> this flag is set, then the driver in question requires ethdev
> >>>>>>>>>>>>>> to forcefully restore
> >>>>>>>>>>>>> configuration.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>>>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and
> >>>>>>>>>>>>> this is all dpdk internal.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What about to have a dedicated flag for it? We can have a
> >>>>>>>>>>>>> dedicated set of flag values for restore.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Agreed. What do you think about the following?
> >>>>>>>>>>>
> >>>>>>>>>>> Instead of exposing that, can we probably make it transparent to
> >>>>>>>>>>> the user and probably ethdev layer too?
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> +1 to make it transparent to user, but not sure if we can make it
> >>>>>>>>>> transparent to ethdev layer.
> >>>>>>>>>
> >>>>>>>>> Just to be clear:
> >>>>>>>>> Let say, using example from above:
> >>>>>>>>>
> >>>>>>>>>  rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> We probably can introduce ethdev internal function (still visible
> >>>>>>>>> to
> >>>>>>>>> PMDs) that would do last 3 steps:
> >>>>>>>>> ethdev_replay_user_conf(...)
> >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or not.
> >>>>>>>>> So it will become:
> >>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>       -Call ethdev_replay_user_conf(.)
> >>>>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> >>>>>>>>>               - Call dev->dev_ops->promiscuous_enable()
> >>>>>>>>>               -Call dev->dev_ops->allmulticast_enable()
> >>>>>>>>>
> >>>>>>>>> For PMDs that do need to restore user provided config And
> >>>>>>>>> rte_eth_dev_start()
> >>>>>>>>>      - Call dev->dev_ops->dev_start()
> >>>>>>>>>
> >>>>>>>>> For those who do not.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> OK, got it what you mean.
> >>>>>>>> Pushing restore functionality to PMDs works, but this may be doing
> >>>>>>>> redundant work on each PMD.
> >>>>>>>>
> >>>>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to what to
> >>>>>>>> restore and common code in ethdev does the work.
> >>>>>>>> My below dedicated data struct comment is to have this flag in a
> >>>>>>>> new struct, overall like following:
> >>>>>>>>
> >>>>>>>> rte_eth_dev_start()
> >>>>>>>>    - Call dev->dev_ops->dev_start()
> >>>>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START, &flags)
> >>>>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> >>>>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> >>>>>>>>    - ...
> >>>>>>>
> >>>>>>> Could you please explain what is the benefit of exposing flags
> >>>>>>> through dev_ops
> >>>>>> callback vs a dedicated flags field in rte_eth_dev_data?
> >>>>>>> In both solutions:
> >>>>>>> - config restore is transparent to the user,
> >>>>>>> - drivers can omit config restore (either by not implementing the
> >>>>>>> callback or not providing the flags),
> >>>>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
> >>>>>>>
> >>>>>>> I understand that my initial proposal with "internal_flags" was too
> >>>>>>> vague, but renaming and splitting this field into:
> >>>>>>>
> >>>>>>> - dev_start_restore_flags
> >>>>>>> - dev_reset_restore_flags
> >>>>>>> - and so on...
> >>>>>>>
> >>>>>>> seems sufficient, at least in my opinion.
> >>>>>>>
> >>>>>>
> >>>>>> Hi Dariusz,
> >>>>>>
> >>>>>> Putting flags to rte_eth_dev_data works, and it is easier since there
> >>>>>> is direct access from rte_eth_dev to rte_eth_dev_data, so you don't
> >>>>>> need new dev_ops. So this is a valid option.
> >>>>>>
> >>>>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> >>>>>>
> >>>>>> "struct rte_eth_dev_data" is integral data structure for ethdev and
> >>>>>> it is used in multiple locations, mostly related to the datapath and
> >>>>>> all drivers needs to deal with fields of this struct.
> >>>>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> >>>>>>
> >>>>>> I want to protect "struct rte_eth_dev_data" from noise as much as
> >>>>>> possible, though what is noise is not always that clear.
> >>>>>>
> >>>>>> This restore flag is not critical, and I expect most of the drivers
> >>>>>> won't care and populate this restore flag at all. That is why to me
> >>>>>> it is better have dedicated struct for it and only drivers care about restore
> >>>> feature know it.
> >>>>>
> >>>>> I see. Thank you very much for the explanation.
> >>>>>
> >>>>> In this case, it looks like adding this to dev_ops is the way to go.
> >>>>>
> >>>>> So, summarizing it all:
> >>>>>
> >>>>> 1. dev_ops should be extended with a callback with the following signature and
> >>>> enums/flags:
> >>>>>
> >>>>> enum rte_eth_dev_operation op {
> >>>>>       RTE_ETH_START,
> >>>>>       RTE_ETH_STOP,
> >>>>>       RTE_ETH_RESET,
> >>>>> };
> >>>>>
> >>>>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
> >>>>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
> >>>> RTE_ETH_RESTORE_ALLMULTI
> >>>>> RTE_BIT32(2)
> >>>>>
> >>>>> void (*get_restore_flags)(
> >>>>>       struct rte_eth_dev *dev,
> >>>>>       enum rte_eth_dev_operation op,
> >>>>>       uint32_t *flags);
> >>>>>
> >>>>> 2. rte_eth_dev_start() will work as follows:
> >>>>>
> >>>>> - Call dev->dev_ops->dev_start()
> >>>>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START, &flags). If callback
> >>>> is not provided, assume flags == 0.
> >>>>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> >>>>> - and so on...
> >>>>>
> >>>>
> >>>> All above looks good.
> >>>>
> >>>>> Also, I would like to add the following:
> >>>>>
> >>>>> 3. Patchset introducing this change should add get_restore_flags()
> >>>> implementation to all drivers, which informs that all config should be restored.
> >>>>> This would preserve the current behavior.
> >>>>> Later, this could be refined driver by driver.
> >>>>>
> >>>>> What do you think?
> >>>>>
> >>>>
> >>>> What you are saying is correct, but I suspect most of the drivers don't really need
> >>>> this restore, but they have it since it was in the ethdev layer.
> >>>>
> >>>> If we introduce back restore via get_restore_flags(), it may stay as it is in drivers, at
> >>>> least for most of them.
> >>>>
> >>>> What do you think to risk breaking stuff for this case.
> >>>>
> >>>> So don't implement this in the drivers by default, so who needs it will recognize
> >>>> the issue and will implement it. If we merge this patch for -rc1, it gives enough
> >>>> time for drivers to detect the issue and fix it.
> >>>
> >>> It seems rather too risky, especially considering that for example, there are a few Intel drivers which do not have maintainers (like
> >> i40e).
> >>> So, I don't know what will happen to such drivers. They may be left broken (if they are affected) for 24.11 and future releases.
> >>> But I agree that if default behavior is preserved, this dependence of drivers on config restore might stay as is.
> >>> I'm on the fence about it.
> >>>
> >>
> >> Yeah, not sure.
> >>
> >> Do you think the dev_ops function can be implemented in the
> >> 'ethdev_driver.c' and all drivers use exact same function?
> >> So this reduces changes and duplication in drivers while preserving the
> >> behavior.
> >
> > Wonder why it can't be done visa-versa?
> > If this new function is not implemented by PMD, then simply assume
> > that everything needs to be restored (as it happens now)?
> >
> 
> True, this preserve the behavior and prevents updating all drivers.
> 
> I think can be done in two ways:
> 1. dev_ops() tells what NOT to restore, if dev_ops() not implemented or
> it returns zero, restore everything. Here what not nice is what to
> restore in this case is not very well defined.
> 
> 2. dev_ops() tells what to restore, but when not implemented default
> flag value is all restore. Which is current behavior.
> 
> 
> I am for option 2. and I assume that is what Konstantin also suggest,
> but I want to clarify to be sure.

Yep, I am also in favor of option 2.

> 
> >>>>
> >>>> Only we may implement this to the drivers that exist when this restore code was
> >>>> introduced.
> >>>> I mean whatever driver exist in the initial DPDK commit, implement this logic only
> >>>> to those drivers.
> >>>
> >>> Seems reasonable to me. In this case, it would be igb (IIUC, now it's named e1000) and ixgbe.
> >>>
> >>>>
> >>>>> Also, there's an open question about 'stop' and 'reset' operations.
> >>>>> At the moment, ethdev layer does not do any config manipulation during these
> >>>> operations.
> >>>>> Maybe we should limit get_restore_flags() to 'start' only?
> >>>>>
> >>>>
> >>>> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
> >>>> as a placeholder for later possible usages.
> >>>
> >>> Ack
> >>>
> >>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>> So PMDs only will provide what to restore with an internal API and
> >>>>>>>> common ethdev layer will restore it.
> >>>>>>>> If no restore required PMD may not implement .get_restore_flags() at all.
> >>>>>>>>
> >>>>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be provided
> >>>>>>>> to internal API to get what to restore in different states...
> >>>>>>>>
> >>>>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can be
> >>>>>>>>>> confusing and open to interpretation what to use it for and by
> >>>>>>>>>> time become source of defect.
> >>>>>>>>>
> >>>>>>>>> Yes, same thoughts.
> >>>>>>>>>
> >>>>>>>>>> Instead what do you think to have a separate, dedicated data struct for it?
> >>>>>>>>>
> >>>>>>>>> Hmm... not sure I understood you here...
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Might be we can move this restoration code into the new ethdev
> >>>>>>>>>>> helper function,(ethdevd_user_config_restore()  or so) that PMD
> >>>>>>>>>>> can invoke
> >>>>>>>> during its dev_start() if needed?
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> >>>>>> RTE_BIT32(0)
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> >>>>>> RTE_BIT32(1)
> >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> >>>>>>>> RTE_BIT32(2)
> >>>>>>>>>>>>
> >>>>>>>>>>>> struct rte_eth_dev_data {
> >>>>>>>>>>>>    /* snip */
> >>>>>>>>>>>>
> >>>>>>>>>>>>    uint32_t dev_flags;
> >>>>>>>>>>>>
> >>>>>>>>>>>>    /**
> >>>>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
> >>>>>>>>>>>>     * Certain functionalities provided by the library might
> >>>>>> enabled/disabled,
> >>>>>>>>>>>>     * based on driver exposing certain capabilities.
> >>>>>>>>>>>>     */
> >>>>>>>>>>>>    uint32_t internal_flags;
> >>>>>>>>>>>>
> >>>>>>>>>>>>    /* snip */
> >>>>>>>>>>>> };
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Also perhaps we have go into details what needs to be restored
> >>>>>>>>>>>>> after 'stop' and what needs to be restored after 'reset' and
> >>>>>>>>>>>>> use similar
> >>>>>>>> mechanism etc...
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think we should look into that.
> >>>>>>>>>>>> Any 'codification' of semantics between drivers and ethdev
> >>>>>>>>>>>> library is good in
> >>>>>>>> my opinion.
> >>>>>>>>>>>>
> >>>>>>>>>>>> At least right now, ethdev does not change any configuration in
> >>>>>>>>>>>> 'stop' and
> >>>>>>>> 'reset' from what I see.
> >>>>>>>>>>>> But that's on library side only.
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> This way, if we would conclude that it makes sense for
> >>>>>>>>>>>>>> .dev_start() to handle all starting configuration aspects, we
> >>>>>>>>>>>>>> could track which drivers still rely
> >>>>>>>>>>>>> on configuration restore.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Dariusz Sosnowski (4):
> >>>>>>>>>>>>>>   ethdev: rework config restore
> >>>>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
> >>>>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
> >>>>>>>>>>>>>>   ethdev: omit MAC address restore if not required
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> >>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
> >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> >>>>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> 2.39.5
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [RFC 0/4] ethdev: rework config restore
  2024-10-11  8:29                             ` Konstantin Ananyev
@ 2024-10-11  9:37                               ` Dariusz Sosnowski
  0 siblings, 0 replies; 23+ messages in thread
From: Dariusz Sosnowski @ 2024-10-11  9:37 UTC (permalink / raw)
  To: Konstantin Ananyev, Ferruh Yigit,
	NBU-Contact-Thomas Monjalon (EXTERNAL),
	Andrew Rybchenko
  Cc: dev, Bruce Richardson



> -----Original Message-----
> From: Konstantin Ananyev <konstantin.ananyev@huawei.com>
> Sent: Friday, October 11, 2024 10:29
> To: Ferruh Yigit <ferruh.yigit@amd.com>; Dariusz Sosnowski
> <dsosnowski@nvidia.com>; NBU-Contact-Thomas Monjalon (EXTERNAL)
> <thomas@monjalon.net>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> Subject: RE: [RFC 0/4] ethdev: rework config restore
> 
> External email: Use caution opening links or attachments
> 
> 
> > >>>>
> > >>>> External email: Use caution opening links or attachments
> > >>>>
> > >>>>
> > >>>> On 10/10/2024 1:08 PM, Dariusz Sosnowski wrote:
> > >>>>>> -----Original Message-----
> > >>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> > >>>>>> Sent: Thursday, October 10, 2024 01:17
> > >>>>>> To: Dariusz Sosnowski <dsosnowski@nvidia.com>; Konstantin
> > >>>>>> Ananyev <konstantin.ananyev@huawei.com>; NBU-Contact-Thomas
> > >>>>>> Monjalon
> > >>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> > >>>>>> <andrew.rybchenko@oktetlabs.ru>
> > >>>>>> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>
> > >>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> > >>>>>>
> > >>>>>> External email: Use caution opening links or attachments
> > >>>>>>
> > >>>>>>
> > >>>>>> On 10/9/2024 5:18 PM, Dariusz Sosnowski wrote:
> > >>>>>>>> -----Original Message-----
> > >>>>>>>> From: Ferruh Yigit <ferruh.yigit@amd.com>
> > >>>>>>>> Sent: Wednesday, October 9, 2024 03:08
> > >>>>>>>> To: Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> > >>>>>>>> Dariusz Sosnowski <dsosnowski@nvidia.com>; NBU-Contact-Thomas
> > >>>>>>>> Monjalon
> > >>>>>>>> (EXTERNAL) <thomas@monjalon.net>; Andrew Rybchenko
> > >>>>>>>> <andrew.rybchenko@oktetlabs.ru>
> > >>>>>>>> Cc: dev@dpdk.org; Bruce Richardson
> > >>>>>>>> <bruce.richardson@intel.com>
> > >>>>>>>> Subject: Re: [RFC 0/4] ethdev: rework config restore
> > >>>>>>>>
> > >>>>>>>> External email: Use caution opening links or attachments
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 10/8/2024 6:21 PM, Konstantin Ananyev wrote:
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>>>>>> We have been working on optimizing the latency of calls
> > >>>>>>>>>>>>>> to rte_eth_dev_start(), on ports spawned by mlx5 PMD.
> > >>>>>>>>>>>>>> Most of the work requires changes in the implementation
> > >>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>> .dev_start() PMD callback, but I also wanted to start a
> > >>>>>>>>>>>>>> discussion regarding configuration restore.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> rte_eth_dev_start() does a few things on top of calling
> > >>>>>>>>>>>>>> .dev_start()
> > >>>>>>>> callback:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> - Before calling it:
> > >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device supports
> > >>>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> > >>>>>>>>>>>>>> - After calling it:
> > >>>>>>>>>>>>>>     - eth_dev_mac_restore() - if device does not
> > >>>>>>>>>>>>>> support
> > >>>>>>>>>>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR;
> > >>>>>>>>>>>>>>     - restore promiscuous config
> > >>>>>>>>>>>>>>     - restore all multicast config
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> eth_dev_mac_restore() iterates over all known MAC
> > >>>>>>>>>>>>>> addresses - stored in rte_eth_dev_data.mac_addrs array
> > >>>>>>>>>>>>>> - and calls
> > >>>>>>>>>>>>>> .mac_addr_set() and .mac_addr_add() callbacks to apply
> > >>>>>>>>>>>>>> these MAC
> > >>>>>>>> addresses.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Promiscuous config restore checks if promiscuous mode
> > >>>>>>>>>>>>>> is enabled or not, and calls .promiscuous_enable() or
> > >>>>>>>>>>>>>> .promiscuous_disable()
> > >>>>>>>> callback.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> All multicast config restore checks if all multicast
> > >>>>>>>>>>>>>> mode is enabled or not, and calls
> > >>>>>>>>>>>>>> .allmulticast_enable() or
> > >>>>>>>>>>>>>> .allmulticast_disable()
> > >>>>>>>> callback.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Callbacks are called directly in all of these cases, to
> > >>>>>>>>>>>>>> bypass the checks for applying the same configuration,
> > >>>>>>>>>>>>>> which exist in relevant
> > >>>>>>>> APIs.
> > >>>>>>>>>>>>>> Checks are bypassed to force drivers to reapply the
> configuration.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Let's consider what happens in the following sequence of API
> calls.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> 1. rte_eth_dev_configure() 2. rte_eth_tx_queue_setup()
> > >>>>>>>>>>>>>> 3. rte_eth_rx_queue_setup() 4.
> > >>>>>>>>>>>>>> rte_eth_promiscuous_enable()
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> > >>>>>>>>>>>>>>     - Stores promiscuous state in dev->data->promiscuous 5.
> > >>>>>>>>>>>>>> rte_eth_allmulticast_enable()
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> > >>>>>>>>>>>>>>     - Stores allmulticast state in dev->data->allmulticast 6.
> > >>>>>>>>>>>>>> rte_eth_dev_start()
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->dev_start()
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->mac_addr_set() - apply default MAC
> address
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->promiscuous_enable()
> > >>>>>>>>>>>>>>     - Call dev->dev_ops->allmulticast_enable()
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Even though all configuration is available in dev->data
> > >>>>>>>>>>>>>> after step 5, library forces reapplying this configuration in step
> 6.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> In mlx5 PMD case all relevant callbacks require
> > >>>>>>>>>>>>>> communication with the kernel driver, to configure the
> > >>>>>>>>>>>>>> device (mlx5 PMD must create/destroy new kernel flow
> > >>>>>>>>>>>>>> rules and/or change netdev
> > >>>> config).
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> mlx5 PMD handles applying all configuration in
> > >>>>>>>>>>>>>> .dev_start(), so the following forced callbacks force
> > >>>>>>>>>>>>>> additional communication with the kernel. The
> > >>>>>>>>>>>>> same configuration is applied multiple times.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> As an optimization, mlx5 PMD could check if a given
> > >>>>>>>>>>>>>> configuration was applied, but this would duplicate the
> > >>>>>>>>>>>>>> functionality of the library (for example
> > >>>>>>>>>>>>>> rte_eth_promiscuous_enable() does not call the driver
> > >>>>>>>>>>>>>> if
> > >>>>>>>>>>>>>> dev->data->promiscuous is set).
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Question: Since all of the configuration is available
> > >>>>>>>>>>>>>> before
> > >>>>>>>>>>>>>> .dev_start() callback is called, why ethdev library
> > >>>>>>>>>>>>>> does not expect .dev_start() to
> > >>>>>>>>>>>>> take this configuration into account?
> > >>>>>>>>>>>>>> In other words, why library has to reapply the configuration?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I could not find any particular reason why
> > >>>>>>>>>>>>>> configuration restore exists as part of the process (it
> > >>>>>>>>>>>>>> was in the initial DPDK
> > >>>> commit).
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> My assumption is .dev_stop() cause these values reset in
> > >>>>>>>>>>>>> some devices, so
> > >>>>>>>>>>>>> .dev_start() restores them back.
> > >>>>>>>>>>>>> @Bruce or @Konstantin may remember the history.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Yep, as I remember, at least some Intel PMDs calling
> > >>>>>>>>>>> hw_reset() ad
> > >>>>>>>>>>> dec_stop() and even dev_start() to make sure that HW is in
> > >>>>>>>>>>> a clean
> > >>>>>>>>>>> (known)
> > >>>>>>>> state.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> But I agree this is device specific behavior, and can be
> > >>>>>>>>>>>>> managed by what device requires.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Probably yes.
> > >>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> The patches included in this RFC, propose a mechanism
> > >>>>>>>>>>>>>> which would help with managing which drivers rely on
> > >>>>>>>>>>>>>> forceful configuration
> > >>>>>> restore.
> > >>>>>>>>>>>>>> Drivers could advertise if forceful configuration
> > >>>>>>>>>>>>>> restore is needed through `RTE_ETH_DEV_*_FORCE_RESTORE`
> > >>>>>>>>>>>>>> device flag. If this flag is set, then the driver in
> > >>>>>>>>>>>>>> question requires ethdev to forcefully restore
> > >>>>>>>>>>>>> configuration.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> OK to use flag for it, but not sure about using 'dev_info-
> >dev_flags'
> > >>>>>>>>>>>>> (RTE_ETH_DEV_*) for this, as this flag is shared with
> > >>>>>>>>>>>>> user and this is all dpdk internal.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> What about to have a dedicated flag for it? We can have
> > >>>>>>>>>>>>> a dedicated set of flag values for restore.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Agreed. What do you think about the following?
> > >>>>>>>>>>>
> > >>>>>>>>>>> Instead of exposing that, can we probably make it
> > >>>>>>>>>>> transparent to the user and probably ethdev layer too?
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> +1 to make it transparent to user, but not sure if we can
> > >>>>>>>>>> +make it
> > >>>>>>>>>> transparent to ethdev layer.
> > >>>>>>>>>
> > >>>>>>>>> Just to be clear:
> > >>>>>>>>> Let say, using example from above:
> > >>>>>>>>>
> > >>>>>>>>>  rte_eth_dev_start()
> > >>>>>>>>>      - Call dev->dev_ops->dev_start()
> > >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> > >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> > >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> > >>>>>>>>>
> > >>>>>>>>> We probably can introduce ethdev internal function (still
> > >>>>>>>>> visible to
> > >>>>>>>>> PMDs) that would do last 3 steps:
> > >>>>>>>>> ethdev_replay_user_conf(...)
> > >>>>>>>>>      - Call dev->dev_ops->mac_addr_set() - apply default MAC address
> > >>>>>>>>>      - Call dev->dev_ops->promiscuous_enable()
> > >>>>>>>>>      - Call dev->dev_ops->allmulticast_enable()
> > >>>>>>>>>
> > >>>>>>>>> And let PMD itself to decide does it needs to call it at dev_start() or
> not.
> > >>>>>>>>> So it will become:
> > >>>>>>>>> rte_eth_dev_start()
> > >>>>>>>>>      - Call dev->dev_ops->dev_start()
> > >>>>>>>>>       -Call ethdev_replay_user_conf(.)
> > >>>>>>>>>               - Call dev->dev_ops->mac_addr_set() - apply default MAC
> address
> > >>>>>>>>>               - Call dev->dev_ops->promiscuous_enable()
> > >>>>>>>>>               -Call dev->dev_ops->allmulticast_enable()
> > >>>>>>>>>
> > >>>>>>>>> For PMDs that do need to restore user provided config And
> > >>>>>>>>> rte_eth_dev_start()
> > >>>>>>>>>      - Call dev->dev_ops->dev_start()
> > >>>>>>>>>
> > >>>>>>>>> For those who do not.
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> OK, got it what you mean.
> > >>>>>>>> Pushing restore functionality to PMDs works, but this may be
> > >>>>>>>> doing redundant work on each PMD.
> > >>>>>>>>
> > >>>>>>>> Instead Dariusz suggests PMD to provide a flag to ehtdev to
> > >>>>>>>> what to restore and common code in ethdev does the work.
> > >>>>>>>> My below dedicated data struct comment is to have this flag
> > >>>>>>>> in a new struct, overall like following:
> > >>>>>>>>
> > >>>>>>>> rte_eth_dev_start()
> > >>>>>>>>    - Call dev->dev_ops->dev_start()
> > >>>>>>>>    - Call dev->dev_ops->get_restore_flags(ethdev, RTE_ETH_START,
> &flags)
> > >>>>>>>>    - if (flags & MAC) dev->dev_ops->mac_addr_set()
> > >>>>>>>>    - if (flags & PROMISC) dev->dev_ops->promiscuous_enable()
> > >>>>>>>>    - ...
> > >>>>>>>
> > >>>>>>> Could you please explain what is the benefit of exposing flags
> > >>>>>>> through dev_ops
> > >>>>>> callback vs a dedicated flags field in rte_eth_dev_data?
> > >>>>>>> In both solutions:
> > >>>>>>> - config restore is transparent to the user,
> > >>>>>>> - drivers can omit config restore (either by not implementing
> > >>>>>>> the callback or not providing the flags),
> > >>>>>>> - an ABI change is introduced (not a huge concern, at least for 24.11).
> > >>>>>>>
> > >>>>>>> I understand that my initial proposal with "internal_flags"
> > >>>>>>> was too vague, but renaming and splitting this field into:
> > >>>>>>>
> > >>>>>>> - dev_start_restore_flags
> > >>>>>>> - dev_reset_restore_flags
> > >>>>>>> - and so on...
> > >>>>>>>
> > >>>>>>> seems sufficient, at least in my opinion.
> > >>>>>>>
> > >>>>>>
> > >>>>>> Hi Dariusz,
> > >>>>>>
> > >>>>>> Putting flags to rte_eth_dev_data works, and it is easier since
> > >>>>>> there is direct access from rte_eth_dev to rte_eth_dev_data, so
> > >>>>>> you don't need new dev_ops. So this is a valid option.
> > >>>>>>
> > >>>>>> But benefit of new dev_ops is to keep "struct rte_eth_dev_data" clean.
> > >>>>>>
> > >>>>>> "struct rte_eth_dev_data" is integral data structure for ethdev
> > >>>>>> and it is used in multiple locations, mostly related to the
> > >>>>>> datapath and all drivers needs to deal with fields of this struct.
> > >>>>>> Like [rx]_queues, dev_private, dev_conf all important and used a lot.
> > >>>>>>
> > >>>>>> I want to protect "struct rte_eth_dev_data" from noise as much
> > >>>>>> as possible, though what is noise is not always that clear.
> > >>>>>>
> > >>>>>> This restore flag is not critical, and I expect most of the
> > >>>>>> drivers won't care and populate this restore flag at all. That
> > >>>>>> is why to me it is better have dedicated struct for it and only
> > >>>>>> drivers care about restore
> > >>>> feature know it.
> > >>>>>
> > >>>>> I see. Thank you very much for the explanation.
> > >>>>>
> > >>>>> In this case, it looks like adding this to dev_ops is the way to go.
> > >>>>>
> > >>>>> So, summarizing it all:
> > >>>>>
> > >>>>> 1. dev_ops should be extended with a callback with the following
> > >>>>> signature and
> > >>>> enums/flags:
> > >>>>>
> > >>>>> enum rte_eth_dev_operation op {
> > >>>>>       RTE_ETH_START,
> > >>>>>       RTE_ETH_STOP,
> > >>>>>       RTE_ETH_RESET,
> > >>>>> };
> > >>>>>
> > >>>>> #define RTE_ETH_RESTORE_MAC_ADDR RTE_BIT32(0) #define
> > >>>>> RTE_ETH_RESTORE_PROMISC RTE_BIT32(1) #define
> > >>>> RTE_ETH_RESTORE_ALLMULTI
> > >>>>> RTE_BIT32(2)
> > >>>>>
> > >>>>> void (*get_restore_flags)(
> > >>>>>       struct rte_eth_dev *dev,
> > >>>>>       enum rte_eth_dev_operation op,
> > >>>>>       uint32_t *flags);
> > >>>>>
> > >>>>> 2. rte_eth_dev_start() will work as follows:
> > >>>>>
> > >>>>> - Call dev->dev_ops->dev_start()
> > >>>>> - Call dev->dev_ops->get_restore_flags(dev, RTE_ETH_START,
> > >>>>> &flags). If callback
> > >>>> is not provided, assume flags == 0.
> > >>>>> - if (flags & RTE_ETH_RESTORE_MAC_ADDR) - restore MAC addresses
> > >>>>> - and so on...
> > >>>>>
> > >>>>
> > >>>> All above looks good.
> > >>>>
> > >>>>> Also, I would like to add the following:
> > >>>>>
> > >>>>> 3. Patchset introducing this change should add
> > >>>>> get_restore_flags()
> > >>>> implementation to all drivers, which informs that all config should be
> restored.
> > >>>>> This would preserve the current behavior.
> > >>>>> Later, this could be refined driver by driver.
> > >>>>>
> > >>>>> What do you think?
> > >>>>>
> > >>>>
> > >>>> What you are saying is correct, but I suspect most of the drivers
> > >>>> don't really need this restore, but they have it since it was in the ethdev
> layer.
> > >>>>
> > >>>> If we introduce back restore via get_restore_flags(), it may stay
> > >>>> as it is in drivers, at least for most of them.
> > >>>>
> > >>>> What do you think to risk breaking stuff for this case.
> > >>>>
> > >>>> So don't implement this in the drivers by default, so who needs
> > >>>> it will recognize the issue and will implement it. If we merge
> > >>>> this patch for -rc1, it gives enough time for drivers to detect the issue and
> fix it.
> > >>>
> > >>> It seems rather too risky, especially considering that for
> > >>> example, there are a few Intel drivers which do not have
> > >>> maintainers (like
> > >> i40e).
> > >>> So, I don't know what will happen to such drivers. They may be left broken
> (if they are affected) for 24.11 and future releases.
> > >>> But I agree that if default behavior is preserved, this dependence of drivers
> on config restore might stay as is.
> > >>> I'm on the fence about it.
> > >>>
> > >>
> > >> Yeah, not sure.
> > >>
> > >> Do you think the dev_ops function can be implemented in the
> > >> 'ethdev_driver.c' and all drivers use exact same function?
> > >> So this reduces changes and duplication in drivers while preserving
> > >> the behavior.
> > >
> > > Wonder why it can't be done visa-versa?
> > > If this new function is not implemented by PMD, then simply assume
> > > that everything needs to be restored (as it happens now)?
> > >
> >
> > True, this preserve the behavior and prevents updating all drivers.
> >
> > I think can be done in two ways:
> > 1. dev_ops() tells what NOT to restore, if dev_ops() not implemented
> > or it returns zero, restore everything. Here what not nice is what to
> > restore in this case is not very well defined.
> >
> > 2. dev_ops() tells what to restore, but when not implemented default
> > flag value is all restore. Which is current behavior.
> >
> >
> > I am for option 2. and I assume that is what Konstantin also suggest,
> > but I want to clarify to be sure.
> 
> Yep, I am also in favor of option 2.

I sent the patch series implementing the option 2: https://patches.dpdk.org/project/dpdk/list/?series=33433

> 
> >
> > >>>>
> > >>>> Only we may implement this to the drivers that exist when this
> > >>>> restore code was introduced.
> > >>>> I mean whatever driver exist in the initial DPDK commit,
> > >>>> implement this logic only to those drivers.
> > >>>
> > >>> Seems reasonable to me. In this case, it would be igb (IIUC, now it's named
> e1000) and ixgbe.
> > >>>
> > >>>>
> > >>>>> Also, there's an open question about 'stop' and 'reset' operations.
> > >>>>> At the moment, ethdev layer does not do any config manipulation
> > >>>>> during these
> > >>>> operations.
> > >>>>> Maybe we should limit get_restore_flags() to 'start' only?
> > >>>>>
> > >>>>
> > >>>> Ack, I was about to suggest the same, for now only have 'RTE_ETH_START'
> > >>>> as a placeholder for later possible usages.
> > >>>
> > >>> Ack
> > >>>
> > >>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>>>
> > >>>>>>>> So PMDs only will provide what to restore with an internal
> > >>>>>>>> API and common ethdev layer will restore it.
> > >>>>>>>> If no restore required PMD may not implement .get_restore_flags() at
> all.
> > >>>>>>>>
> > >>>>>>>> Additionally, RTE_ETH_START, RTE_ETH_RESET etc flag can be
> > >>>>>>>> provided to internal API to get what to restore in different states...
> > >>>>>>>>
> > >>>>>>>>>> Suggested 'internal_flag' in "struct rte_eth_dev_data" can
> > >>>>>>>>>> be confusing and open to interpretation what to use it for
> > >>>>>>>>>> and by time become source of defect.
> > >>>>>>>>>
> > >>>>>>>>> Yes, same thoughts.
> > >>>>>>>>>
> > >>>>>>>>>> Instead what do you think to have a separate, dedicated data struct
> for it?
> > >>>>>>>>>
> > >>>>>>>>> Hmm... not sure I understood you here...
> > >>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>> Might be we can move this restoration code into the new
> > >>>>>>>>>>> ethdev helper function,(ethdevd_user_config_restore()  or
> > >>>>>>>>>>> so) that PMD can invoke
> > >>>>>>>> during its dev_start() if needed?
> > >>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_PROMISC_FORCE_RESTORE
> > >>>>>> RTE_BIT32(0)
> > >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_ALLMULTI_FORCE_RESTORE
> > >>>>>> RTE_BIT32(1)
> > >>>>>>>>>>>> #define RTE_ETH_DEV_INTERNAL_MAC_ADDR_FORCE_RESTORE
> > >>>>>>>> RTE_BIT32(2)
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> struct rte_eth_dev_data {
> > >>>>>>>>>>>>    /* snip */
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>    uint32_t dev_flags;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>    /**
> > >>>>>>>>>>>>     * Internal device capabilities, used only by ethdev library.
> > >>>>>>>>>>>>     * Certain functionalities provided by the library
> > >>>>>>>>>>>> might
> > >>>>>> enabled/disabled,
> > >>>>>>>>>>>>     * based on driver exposing certain capabilities.
> > >>>>>>>>>>>>     */
> > >>>>>>>>>>>>    uint32_t internal_flags;
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>    /* snip */
> > >>>>>>>>>>>> };
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Also perhaps we have go into details what needs to be
> > >>>>>>>>>>>>> restored after 'stop' and what needs to be restored
> > >>>>>>>>>>>>> after 'reset' and use similar
> > >>>>>>>> mechanism etc...
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think we should look into that.
> > >>>>>>>>>>>> Any 'codification' of semantics between drivers and
> > >>>>>>>>>>>> ethdev library is good in
> > >>>>>>>> my opinion.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> At least right now, ethdev does not change any
> > >>>>>>>>>>>> configuration in 'stop' and
> > >>>>>>>> 'reset' from what I see.
> > >>>>>>>>>>>> But that's on library side only.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>> This way, if we would conclude that it makes sense for
> > >>>>>>>>>>>>>> .dev_start() to handle all starting configuration
> > >>>>>>>>>>>>>> aspects, we could track which drivers still rely
> > >>>>>>>>>>>>> on configuration restore.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Dariusz Sosnowski (4):
> > >>>>>>>>>>>>>>   ethdev: rework config restore
> > >>>>>>>>>>>>>>   ethdev: omit promiscuous config restore if not required
> > >>>>>>>>>>>>>>   ethdev: omit all multicast config restore if not required
> > >>>>>>>>>>>>>>   ethdev: omit MAC address restore if not required
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.c | 39
> > >>>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++-----
> > >>>>>>>>>>>>>>  lib/ethdev/rte_ethdev.h | 18 ++++++++++++++++++
> > >>>>>>>>>>>>>>  2 files changed, 52 insertions(+), 5 deletions(-)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> 2.39.5
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>
> > >>>
> > >


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-10-11  9:37 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-18  9:21 [RFC 0/4] ethdev: rework config restore Dariusz Sosnowski
2024-09-18  9:21 ` [RFC 1/4] " Dariusz Sosnowski
2024-09-18  9:21 ` [RFC 2/4] ethdev: omit promiscuous config restore if not required Dariusz Sosnowski
2024-09-18  9:22 ` [RFC 3/4] ethdev: omit all multicast " Dariusz Sosnowski
2024-09-18  9:22 ` [RFC 4/4] ethdev: omit MAC address " Dariusz Sosnowski
2024-09-29 23:31 ` [RFC 0/4] ethdev: rework config restore Ferruh Yigit
2024-10-04 19:13   ` Dariusz Sosnowski
2024-10-07  9:27     ` Konstantin Ananyev
2024-10-07 22:56       ` Ferruh Yigit
2024-10-08 17:21         ` Konstantin Ananyev
2024-10-09  1:07           ` Ferruh Yigit
2024-10-09 10:54             ` Konstantin Ananyev
2024-10-09 16:18             ` Dariusz Sosnowski
2024-10-09 23:16               ` Ferruh Yigit
2024-10-10 12:08                 ` Dariusz Sosnowski
2024-10-10 12:51                   ` Ferruh Yigit
2024-10-10 16:23                     ` Dariusz Sosnowski
2024-10-10 17:08                       ` Ferruh Yigit
2024-10-10 22:58                         ` Konstantin Ananyev
2024-10-11  0:02                           ` Ferruh Yigit
2024-10-11  8:23                             ` Dariusz Sosnowski
2024-10-11  8:29                             ` Konstantin Ananyev
2024-10-11  9:37                               ` Dariusz Sosnowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).