DPDK patches and discussions
* [dpdk-dev] [PATCH 0/5] ethdev: Port ownership
@ 2017-11-28 11:57 Matan Azrad
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 1/5] ethdev: free a port by a dedicated API Matan Azrad
                   ` (5 more replies)
  0 siblings, 6 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:57 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Add an ownership mechanism to DPDK Ethernet devices so that a device is
not managed by several DPDK entities at once, as discussed in:
http://dpdk.org/ml/archives/dev/2017-September/074656.html

Adjust fail-safe and testpmd to use it.

Matan Azrad (5):
  ethdev: free a port by a dedicated API
  ethdev: add port ownership
  net/failsafe: free an eth port by a dedicated API
  net/failsafe: use ownership mechanism to own ports
  app/testpmd: adjust ethdev port ownership

 app/test-pmd/cmdline.c                  | 100 ++++++++++++++++----------
 app/test-pmd/cmdline_flow.c             |   2 +-
 app/test-pmd/config.c                   |  40 +++++++----
 app/test-pmd/parameters.c               |   4 +-
 app/test-pmd/testpmd.c                  |  65 +++++++++++------
 app/test-pmd/testpmd.h                  |   3 +
 doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
 drivers/net/failsafe/failsafe.c         |   7 ++
 drivers/net/failsafe/failsafe_eal.c     |  10 +++
 drivers/net/failsafe/failsafe_ether.c   |   2 +-
 drivers/net/failsafe/failsafe_private.h |   2 +
 lib/librte_ether/rte_ethdev.c           | 123 +++++++++++++++++++++++++++++++-
 lib/librte_ether/rte_ethdev.h           |  86 ++++++++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  12 ++++
 14 files changed, 386 insertions(+), 82 deletions(-)

-- 
1.8.3.1


* [dpdk-dev] [PATCH 1/5] ethdev: free a port by a dedicated API
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
@ 2017-11-28 11:57 ` Matan Azrad
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Matan Azrad
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:57 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Use the dedicated API, rte_eth_dev_release_port(), to free a port
instead of changing its state directly.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 318af28..2d754d9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -437,7 +437,7 @@ struct rte_eth_dev *
 	if (ret < 0)
 		goto err;
 
-	rte_eth_devices[port_id].state = RTE_ETH_DEV_UNUSED;
+	rte_eth_dev_release_port(&rte_eth_devices[port_id]);
 	return 0;
 
 err:
-- 
1.8.3.1


* [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 1/5] ethdev: free a port by a dedicated API Matan Azrad
@ 2017-11-28 11:57 ` Matan Azrad
  2017-11-30 12:36   ` Neil Horman
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 3/5] net/failsafe: free an eth port by a dedicated API Matan Azrad
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:57 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Port ownership is currently implicit in DPDK.
Making it explicit is better for the following reasons:
1. It may be convenient for multi-process applications to know which
   process is in charge of a port.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also, in the fail-safe case, an issue was encountered in testpmd:
we need to check that the user is not trying to use a port that is
already managed by fail-safe.

Add an ownership mechanism to DPDK Ethernet devices so that a device is
not managed by several DPDK entities at once.

A port owner consists of an owner id (number) and an owner name (string).
The owner id must be unique so that two identical entity instances can
be told apart; the owner name can be any name.
The name helps DPDK entities recognize the owner logically and eases
debugging.
Each DPDK entity can allocate a unique owner identifier and use it,
together with its preferred name, to own valid ethdev ports.
Each DPDK entity can query the owner of any port to decide whether it
can manage the port.

The current ethdev internal port management is not affected by this
feature.
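
The semantics described above can be sketched in a standalone model.
This is only an illustration under stated assumptions: it keeps the
owners in a plain array, does not call DPDK, omits the primary-process
check, and every sketch_/SKETCH_ name is hypothetical rather than part
of the ethdev API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NO_OWNER 0       /* mirrors RTE_ETH_DEV_NO_OWNER */
#define SKETCH_MAX_NAME_LEN 64  /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */
#define SKETCH_MAX_PORTS 4      /* hypothetical, small table for the demo */

struct sketch_owner {
	uint16_t id;                     /* unique owner identifier */
	char name[SKETCH_MAX_NAME_LEN];  /* free-form owner name */
};

/* One owner slot per port; id 0 means the port is ownerless. */
static struct sketch_owner sketch_ports[SKETCH_MAX_PORTS];
static uint16_t sketch_next_id = SKETCH_NO_OWNER + 1;

/* Allocate a new unique owner id; fail once the counter wraps to 0. */
int
sketch_owner_new(uint16_t *id)
{
	assert(id != NULL);
	if (sketch_next_id == SKETCH_NO_OWNER)
		return -ENOSPC; /* the patch itself returns -EUSERS here */
	*id = sketch_next_id++;
	return 0;
}

/* Claim a port: allowed if it is ownerless or already owned by this id. */
int
sketch_owner_set(uint16_t port, const struct sketch_owner *owner)
{
	struct sketch_owner *cur;

	if (port >= SKETCH_MAX_PORTS)
		return -ENODEV;
	/* Reject the "no owner" id and ids that were never allocated. */
	if (owner->id == SKETCH_NO_OWNER ||
	    (sketch_next_id != SKETCH_NO_OWNER && owner->id >= sketch_next_id))
		return -EINVAL;
	cur = &sketch_ports[port];
	if (cur->id != SKETCH_NO_OWNER && cur->id != owner->id)
		return -EPERM;
	*cur = *owner;
	cur->name[SKETCH_MAX_NAME_LEN - 1] = '\0';
	return 0;
}

/* Release a port; only its current owner may do so. */
int
sketch_owner_remove(uint16_t port, uint16_t id)
{
	if (port >= SKETCH_MAX_PORTS)
		return -ENODEV;
	if (id == SKETCH_NO_OWNER || sketch_ports[port].id != id)
		return -EPERM;
	memset(&sketch_ports[port], 0, sizeof(sketch_ports[port]));
	return 0;
}

/* Return the owner, or NULL when the port is ownerless or out of range. */
const struct sketch_owner *
sketch_owner_get(uint16_t port)
{
	if (port >= SKETCH_MAX_PORTS ||
	    sketch_ports[port].id == SKETCH_NO_OWNER)
		return NULL;
	return &sketch_ports[port];
}
```

In this model a second entity that calls sketch_owner_set() on an
already owned port gets -EPERM, and only the current owner can release
the port again.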

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
 lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  12 ++++
 4 files changed, 230 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 6a0c9f9..af639a1 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
+Device Identification, Ownership and Configuration
 ---------------------------------------
 
 Device Identification
@@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+An Ethernet device port can be owned by a single DPDK entity (application, library, PMD, process, etc.).
+The ownership mechanism is controlled by ethdev APIs, which allow DPDK entities to set, remove and get a port owner.
+This is intended to prevent an Ethernet port from being managed by several entities at once.
+
+.. note::
+
+    It is the responsibility of each DPDK entity either to check the port owner before using a port or to set the port owner to prevent others from using it.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 2d754d9..836991e 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -71,6 +71,7 @@
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
 static struct rte_eth_dev_data *rte_eth_dev_data;
+static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -278,6 +279,7 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 	return 0;
 }
@@ -293,6 +295,125 @@ struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint16_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
+	    rte_eth_next_owner_id <= owner_id)) {
+		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint16_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint16_t *owner_id)
+{
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
+		return -EPERM;
+	}
+	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
+		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
+		return -EUSERS;
+	}
+	*owner_id = rte_eth_next_owner_id++;
+	return 0;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int ret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
+		return -EPERM;
+	}
+	if (!rte_eth_is_valid_owner_id(owner->id))
+		return -EINVAL;
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
+	    port_owner->id != owner->id) {
+		RTE_LOG(ERR, EAL,
+			"Cannot set owner to port %d already owned by %s_%05d.\n",
+			port_id, port_owner->name, port_owner->id);
+		return -EPERM;
+	}
+	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+		       owner->name);
+	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
+		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
+		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
+		return -EINVAL;
+	}
+	port_owner->id = owner->id;
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
+			    owner->name, owner->id);
+	return 0;
+}
+
+int
+rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
+{
+	struct rte_eth_dev_owner *port_owner;
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	if (!rte_eth_is_valid_owner_id(owner_id))
+		return -EINVAL;
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != owner_id) {
+		RTE_LOG(ERR, EAL,
+			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
+			port_id, port_owner->name, port_owner->id, owner_id);
+		return -EPERM;
+	}
+	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
+			port_owner->name, port_owner->id);
+	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
+	return 0;
+}
+
+void
+rte_eth_dev_owner_delete(const uint16_t owner_id)
+{
+	uint16_t p;
+
+	if (!rte_eth_is_valid_owner_id(owner_id))
+		return;
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
+		memset(&rte_eth_devices[p].data->owner, 0,
+		       sizeof(struct rte_eth_dev_owner));
+	RTE_PMD_DEBUG_TRACE("All port owners owned by "
+			    "%05d identifier has removed.\n", owner_id);
+}
+
+const struct rte_eth_dev_owner *
+rte_eth_dev_owner_get(const uint16_t port_id)
+{
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
+	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
+		return NULL;
+	return &rte_eth_devices[port_id].data->owner;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 341c2d6..f54c26d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint16_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /**
  * @internal
  * The data part, with no function pointers, associated with each ethernet device.
@@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /** Device supports link state interrupt */
@@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
 
 
 /**
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
+ * Get a new unique owner identifier.
+ * An owner identifier lets a single DPDK entity own Ethernet devices,
+ * so that a device is not managed by several entities at once.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_new(uint16_t *owner_id);
+
+/**
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * Remove Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint16_t owner_id);
+
+/**
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @return
+ *  NULL when the device is ownerless, else the device owner pointer.
+ */
+const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
+
+/**
  * Get the total number of Ethernet devices that have been successfully
  * initialized by the matching Ethernet driver during the PCI probing phase
  * and that are available for applications to use. These devices must be
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..7d07edb 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,18 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_find_next_owned_by;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_remove;
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1


* [dpdk-dev] [PATCH 3/5] net/failsafe: free an eth port by a dedicated API
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 1/5] ethdev: free a port by a dedicated API Matan Azrad
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Matan Azrad
@ 2017-11-28 11:57 ` Matan Azrad
  2017-11-28 11:58 ` [dpdk-dev] [PATCH 4/5] net/failsafe: use ownership mechanism to own ports Matan Azrad
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:57 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Call the dedicated ethdev API to free a port at removal time, as is
already done elsewhere in fail-safe.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 21392e5..f72f44f 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -297,7 +297,7 @@
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {
-			ETH(sdev)->state = RTE_ETH_DEV_UNUSED;
+			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
1.8.3.1


* [dpdk-dev] [PATCH 4/5] net/failsafe: use ownership mechanism to own ports
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
                   ` (2 preceding siblings ...)
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 3/5] net/failsafe: free an eth port by a dedicated API Matan Azrad
@ 2017-11-28 11:58 ` Matan Azrad
  2017-11-28 11:58 ` [dpdk-dev] [PATCH 5/5] app/testpmd: adjust ethdev port ownership Matan Azrad
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
  5 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:58 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Fail-safe PMD sub-device management is based on the ethdev port
mechanism, so the sub-device management structures are exposed to other
DPDK entities, which may use them in parallel with the fail-safe PMD.

Use the new port ownership mechanism to prevent fail-safe PMD
sub-devices from being managed by multiple entities.
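
As a rough illustration of the claim step, the following standalone
sketch tags each sub-device port with one owner id and stops at the
first port already held by someone else, as the fail-safe init loop in
this patch does. It does not call DPDK; the owner table and all
sketch_/SKETCH_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0   /* mirrors RTE_ETH_DEV_NO_OWNER */
#define SKETCH_MAX_PORTS 8  /* hypothetical table size */

/* Owner id per port; 0 means ownerless. Stands in for ethdev port data. */
static uint16_t sketch_port_owner[SKETCH_MAX_PORTS];

/* Claim one port unless a different owner already holds it. */
int
sketch_claim(uint16_t port, uint16_t owner_id)
{
	if (port >= SKETCH_MAX_PORTS)
		return -ENODEV;
	if (owner_id == SKETCH_NO_OWNER)
		return -EINVAL;
	if (sketch_port_owner[port] != SKETCH_NO_OWNER &&
	    sketch_port_owner[port] != owner_id)
		return -EPERM;
	sketch_port_owner[port] = owner_id;
	return 0;
}

/*
 * Claim every sub-device port for one owner.
 * Stops at the first failure and returns its negative errno value;
 * ports claimed before the failure stay claimed (no rollback).
 */
int
sketch_claim_subdevices(const uint16_t *ports, size_t n, uint16_t owner_id)
{
	size_t i;
	int ret;

	assert(ports != NULL || n == 0);
	for (i = 0; i < n; i++) {
		ret = sketch_claim(ports[i], owner_id);
		if (ret != 0)
			return ret;
	}
	return 0;
}
```

The return value is a negative errno, so a caller that wants a message
would pass the negated value to strerror().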

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe.c         |  7 +++++++
 drivers/net/failsafe/failsafe_eal.c     | 10 ++++++++++
 drivers/net/failsafe/failsafe_private.h |  2 ++
 3 files changed, 19 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 6bc5aba..d413c20 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -191,6 +191,13 @@
 	ret = failsafe_args_parse(dev, params);
 	if (ret)
 		goto free_subs;
+	ret = rte_eth_dev_owner_new(&priv->my_owner.id);
+	if (ret) {
+		ERROR("Failed to get unique owner identifier");
+		goto free_args;
+	}
+	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
+		 FAILSAFE_OWNER_NAME);
 	ret = failsafe_eal_init(dev);
 	if (ret)
 		goto free_args;
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 19d26f5..b4628fb 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -69,6 +69,16 @@
 			ERROR("sub_device %d init went wrong", i);
 			return -ENODEV;
 		}
+		ret = rte_eth_dev_owner_set(j, &PRIV(dev)->my_owner);
+		if (ret) {
+			/*
+			 * It is unexpected for a fail-safe sub-device
+			 * to be owned by another DPDK entity.
+			 */
+			ERROR("sub_device %d owner set failed (%s)", i,
+			      strerror(-ret));
+			return ret;
+		}
 		SUB_ID(sdev) = i;
 		sdev->fs_dev = dev;
 		sdev->dev = ETH(sdev)->device;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..9936875 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -42,6 +42,7 @@
 #include <rte_devargs.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
+#define FAILSAFE_OWNER_NAME "Fail-safe"
 
 #define PMD_FAILSAFE_MAC_KVARG "mac"
 #define PMD_FAILSAFE_HOTPLUG_POLL_KVARG "hotplug_poll"
@@ -139,6 +140,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_eth_dev_owner my_owner; /* Unique owner. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
-- 
1.8.3.1


* [dpdk-dev] [PATCH 5/5] app/testpmd: adjust ethdev port ownership
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
                   ` (3 preceding siblings ...)
  2017-11-28 11:58 ` [dpdk-dev] [PATCH 4/5] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2017-11-28 11:58 ` Matan Azrad
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
  5 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-28 11:58 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu; +Cc: dev

Testpmd should not use ethdev ports that are managed by other DPDK
entities.

Set testpmd as the owner of each port that is not used by another
entity, and prevent any use of ethdev ports that testpmd does not own.
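
The owner-filtered iteration that replaces RTE_ETH_FOREACH_DEV can be
modelled as below. This is a standalone sketch, not testpmd code: the
port table, the attached flag and all sketch_/SKETCH_ names are
hypothetical stand-ins for the ethdev structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 8  /* hypothetical, stands in for RTE_MAX_ETHPORTS */

struct sketch_port {
	int attached;       /* stands in for state == RTE_ETH_DEV_ATTACHED */
	uint16_t owner_id;  /* 0 means ownerless */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Return the first attached port >= port owned by owner_id, else MAX. */
uint16_t
sketch_find_next_owned_by(uint16_t port, uint16_t owner_id)
{
	while (port < SKETCH_MAX_PORTS &&
	       (!sketch_ports[port].attached ||
		sketch_ports[port].owner_id != owner_id))
		port++;
	return port < SKETCH_MAX_PORTS ? port : SKETCH_MAX_PORTS;
}

/* Visit only the attached ports that belong to owner o. */
#define SKETCH_FOREACH_OWNED_BY(p, o) \
	for ((p) = sketch_find_next_owned_by(0, (o)); \
	     (p) < SKETCH_MAX_PORTS; \
	     (p) = sketch_find_next_owned_by((uint16_t)((p) + 1), (o)))

/* Count the ports an "all ports" command would touch for this owner. */
unsigned int
sketch_count_owned(uint16_t owner_id)
{
	uint16_t p;
	unsigned int n = 0;

	assert(owner_id != 0); /* this helper is meant for real owners only */
	SKETCH_FOREACH_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

Ports that are detached or held by another owner are skipped, which is
the behaviour the "all ports" commands in this patch rely on.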

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 app/test-pmd/cmdline.c      | 100 +++++++++++++++++++++++++++-----------------
 app/test-pmd/cmdline_flow.c |   2 +-
 app/test-pmd/config.c       |  40 +++++++++++-------
 app/test-pmd/parameters.c   |   4 +-
 app/test-pmd/testpmd.c      |  65 ++++++++++++++++++----------
 app/test-pmd/testpmd.h      |   3 ++
 6 files changed, 135 insertions(+), 79 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f71d963..2878cfc 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1357,7 +1357,7 @@ struct cmd_config_speed_all {
 			&link_speed) < 0)
 		return;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		ports[pid].dev_conf.link_speeds = link_speed;
 	}
 
@@ -1851,7 +1851,7 @@ struct cmd_config_rss {
 	struct cmd_config_rss *res = parsed_result;
 	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
 	int diag;
-	uint8_t i;
+	uint16_t pid;
 
 	if (!strcmp(res->value, "all"))
 		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
@@ -1885,12 +1885,12 @@ struct cmd_config_rss {
 		return;
 	}
 	rss_conf.rss_key = NULL;
-	for (i = 0; i < rte_eth_dev_count(); i++) {
-		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
+		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
 		if (diag < 0)
 			printf("Configuration of RSS hash at ethernet port %d "
 				"failed with error (%d): %s.\n",
-				i, -diag, strerror(-diag));
+				pid, -diag, strerror(-diag));
 	}
 }
 
@@ -4281,9 +4281,11 @@ struct cmd_gso_show_result {
 		       __attribute__((unused)) void *data)
 {
 	struct cmd_gso_show_result *res = parsed_result;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->cmd_pid);
 
-	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
-		printf("invalid port id %u\n", res->cmd_pid);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->cmd_pid);
 		return;
 	}
 	if (!strcmp(res->cmd_keyword, "gso")) {
@@ -5293,7 +5295,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 				port_id);
 
 		/* Update number of ports */
-		nb_ports = rte_eth_dev_count();
+		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
+			printf("Error: cannot own new attached port %d\n",
+			       port_id);
+			return;
+		}
+		nb_ports++;
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 	}
@@ -5401,9 +5408,11 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
 {
 	struct cmd_set_bond_mon_period_result *res = parsed_result;
 	int ret;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_num);
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_num);
 		return;
 	}
 
@@ -5462,10 +5471,11 @@ struct cmd_set_bonding_agg_mode_policy_result {
 {
 	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
 	uint8_t policy = AGG_BANDWIDTH;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_num);
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n",
-				res->port_num, nb_ports);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_num);
 		return;
 	}
 
@@ -5726,7 +5736,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_promiscuous_enable(i);
 			else
@@ -5806,7 +5816,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_allmulticast_enable(i);
 			else
@@ -6540,31 +6550,31 @@ static void cmd_showportall_parsed(void *parsed_result,
 	struct cmd_showportall_result *res = parsed_result;
 	if (!strcmp(res->show, "clear")) {
 		if (!strcmp(res->what, "stats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_stats_clear(i);
 		else if (!strcmp(res->what, "xstats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_xstats_clear(i);
 	} else if (!strcmp(res->what, "info"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_infos_display(i);
 	else if (!strcmp(res->what, "stats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_display(i);
 	else if (!strcmp(res->what, "xstats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_xstats_display(i);
 	else if (!strcmp(res->what, "fdir"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			fdir_get_infos(i);
 	else if (!strcmp(res->what, "stat_qmap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_mapping_display(i);
 	else if (!strcmp(res->what, "dcb_tc"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_dcb_info_display(i);
 	else if (!strcmp(res->what, "cap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_offload_cap_display(i);
 }
 
@@ -10483,9 +10493,11 @@ struct cmd_flow_director_mask_result {
 	struct cmd_flow_director_mask_result *res = parsed_result;
 	struct rte_eth_fdir_masks *mask;
 	struct rte_port *port;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -10684,9 +10696,11 @@ struct cmd_flow_director_flex_mask_result {
 	uint32_t flow_type_mask;
 	uint16_t i;
 	int ret;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -10840,9 +10854,11 @@ struct cmd_flow_director_flexpayload_result {
 	struct rte_eth_flex_payload_cfg flex_cfg;
 	struct rte_port *port;
 	int ret = 0;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -11560,7 +11576,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
 	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
 	entry.ether_type = res->eth_type_val;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
 	}
 }
@@ -11676,7 +11692,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
 	else
 		en = 0;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_offload_set(pid,
 						  &entry,
 						  ETH_L2_TUNNEL_ENABLE_MASK,
@@ -14202,9 +14218,11 @@ struct cmd_ddp_add_result {
 	char *file_fld[2];
 	int file_num;
 	int ret = -ENOTSUP;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -14284,9 +14302,11 @@ struct cmd_ddp_del_result {
 	uint8_t *buff;
 	uint32_t size;
 	int ret = -ENOTSUP;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -14599,9 +14619,11 @@ struct cmd_ddp_get_list_result {
 	uint32_t i;
 #endif
 	int ret = -ENOTSUP;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(res->port_id);
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", res->port_id);
 		return;
 	}
 
@@ -15821,7 +15843,7 @@ struct cmd_cmdfile_result {
 	if (id == (portid_t)RTE_PORT_ALL) {
 		portid_t pid;
 
-		RTE_ETH_FOREACH_DEV(pid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 			/* check if need_reconfig has been set to 1 */
 			if (ports[pid].need_reconfig == 0)
 				ports[pid].need_reconfig = dev;
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index df16d2a..bc2a16b 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -2621,7 +2621,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 
 	(void)ctx;
 	(void)token;
-	RTE_ETH_FOREACH_DEV(p) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
 		if (buf && i == ent)
 			return snprintf(buf, size, "%u", p);
 		++i;
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index cd2ac11..6da471e 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -155,7 +155,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -235,7 +235,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -250,10 +250,11 @@ struct rss_type_info {
 	struct rte_eth_xstat *xstats;
 	int cnt_xstats, idx_xstat;
 	struct rte_eth_xstat_name *xstats_names;
+	const struct rte_eth_dev_owner *owner = rte_eth_dev_owner_get(port_id);
 
 	printf("###### NIC extended statistics for port %-2d\n", port_id);
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Error: Invalid port number %i\n", port_id);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("Error: invalid/not owned port number %i\n", port_id);
 		return;
 	}
 
@@ -320,7 +321,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -439,7 +440,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -727,7 +728,10 @@ struct rss_type_info {
 	if (port_id == (portid_t)RTE_PORT_ALL)
 		return 0;
 
-	if (rte_eth_dev_is_valid_port(port_id))
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(port_id);
+
+	if (owner != NULL && owner->id == my_owner.id)
 		return 0;
 
 	if (warning == ENABLED_WARN)
@@ -2309,7 +2313,7 @@ struct igb_ring_desc_16_bytes {
 		return;
 	}
 	nb_pt = 0;
-	RTE_ETH_FOREACH_DEV(i) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 		if (! ((uint64_t)(1ULL << i) & portmask))
 			continue;
 		portlist[nb_pt++] = i;
@@ -2448,8 +2452,11 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gro(const char *onoff, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(port_id);
+
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", port_id);
 		return;
 	}
 	if (test_done == 0) {
@@ -2507,11 +2514,13 @@ struct igb_ring_desc_16_bytes {
 {
 	struct rte_gro_param *param;
 	uint32_t max_pkts_num;
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(port_id);
 
 	param = &gro_ports[port_id].param;
 
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Invalid port id %u.\n", port_id);
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", port_id);
 		return;
 	}
 	if (gro_ports[port_id].enable) {
@@ -2531,8 +2540,11 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gso(const char *mode, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	const struct rte_eth_dev_owner *owner =
+		rte_eth_dev_owner_get(port_id);
+
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("invalid/not owned port id %u\n", port_id);
 		return;
 	}
 	if (strcmp(mode, "on") == 0) {
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 84e7a63..63c533c 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -428,7 +428,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
@@ -489,7 +489,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index c3ab448..a687c80 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -136,6 +136,11 @@
 lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
 
 /*
+ * My port owner structure used to own Ethernet ports.
+ */
+struct rte_eth_dev_owner my_owner; /**< Unique owner. */
+
+/*
  * Test Forwarding Configuration.
  *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
  *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
@@ -483,7 +488,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pt_id;
 	int i = 0;
 
-	RTE_ETH_FOREACH_DEV(pt_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
 		fwd_ports_ids[i++] = pt_id;
 
 	nb_cfg_ports = nb_ports;
@@ -607,7 +612,7 @@ static int eth_event_callback(portid_t port_id,
 		fwd_lcores[lc_id]->cpuid_idx = lc_id;
 	}
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		rte_eth_dev_info_get(pid, &port->dev_info);
 
@@ -733,7 +738,7 @@ static int eth_event_callback(portid_t port_id,
 	queueid_t q;
 
 	/* set socket id according to numa or not */
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		if (nb_rxq > port->dev_info.max_rx_queues) {
 			printf("Fail: nb_rxq(%d) is greater than "
@@ -1027,9 +1032,8 @@ static int eth_event_callback(portid_t port_id,
 	uint64_t tics_per_1sec;
 	uint64_t tics_datum;
 	uint64_t tics_current;
-	uint8_t idx_port, cnt_ports;
+	uint16_t idx_port;
 
-	cnt_ports = rte_eth_dev_count();
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
@@ -1044,11 +1048,10 @@ static int eth_event_callback(portid_t port_id,
 			tics_current = rte_rdtsc();
 			if (tics_current - tics_datum >= tics_per_1sec) {
 				/* Periodic bitrate calculation */
-				for (idx_port = 0;
-						idx_port < cnt_ports;
-						idx_port++)
+				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
+							     my_owner.id)
 					rte_stats_bitrate_calc(bitrate_data,
-						idx_port);
+							       idx_port);
 				tics_datum = tics_current;
 			}
 		}
@@ -1386,7 +1389,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		/* Check if there is a port which is not started */
 		if ((port->port_status != RTE_PORT_STARTED) &&
@@ -1404,7 +1407,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		if ((port->port_status != RTE_PORT_STOPPED) &&
 			(port->slave_flag == 0))
@@ -1453,7 +1456,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if(dcb_config)
 		dcb_test = 1;
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1634,7 +1637,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Stopping ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1677,7 +1680,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Closing ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1728,7 +1731,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Resetting ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1773,6 +1776,12 @@ static int eth_event_callback(portid_t port_id,
 	if (rte_eth_dev_attach(identifier, &pi))
 		return;
 
+	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
+		printf("Error: cannot own new attached port %d\n", pi);
+		return;
+	}
+	nb_ports++;
+
 	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
 	/* if socket_id is invalid, set to 0 */
 	if (check_socket_id(socket_id) < 0)
@@ -1780,8 +1789,6 @@ static int eth_event_callback(portid_t port_id,
 	reconfig(pi, socket_id);
 	rte_eth_promiscuous_enable(pi);
 
-	nb_ports = rte_eth_dev_count();
-
 	ports[pi].port_status = RTE_PORT_STOPPED;
 
 	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
@@ -1792,9 +1799,16 @@ static int eth_event_callback(portid_t port_id,
 detach_port(portid_t port_id)
 {
 	char name[RTE_ETH_NAME_MAX_LEN];
+	const struct rte_eth_dev_owner *owner = rte_eth_dev_owner_get(port_id);
 
 	printf("Detaching a port...\n");
 
+	if (owner == NULL || owner->id != my_owner.id) {
+		printf("Failed to detach invalid/not owned port id %u\n",
+		       port_id);
+		return;
+	}
+
 	if (!port_is_closed(port_id)) {
 		printf("Please close port first\n");
 		return;
@@ -1808,7 +1822,7 @@ static int eth_event_callback(portid_t port_id,
 		return;
 	}
 
-	nb_ports = rte_eth_dev_count();
+	nb_ports--;
 
 	printf("Port '%s' is detached. Now total ports is %d\n",
 			name, nb_ports);
@@ -1826,7 +1840,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if (ports != NULL) {
 		no_link_check = 1;
-		RTE_ETH_FOREACH_DEV(pt_id) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
@@ -1858,7 +1872,7 @@ struct pmd_test_command {
 	fflush(stdout);
 	for (count = 0; count <= MAX_CHECK_TIME; count++) {
 		all_ports_up = 1;
-		RTE_ETH_FOREACH_DEV(portid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
 			if ((port_mask & (1 << portid)) == 0)
 				continue;
 			memset(&link, 0, sizeof(link));
@@ -2083,7 +2097,7 @@ struct pmd_test_command {
 	portid_t pid;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		port->dev_conf.rxmode = rx_mode;
 		port->dev_conf.fdir_conf = fdir_conf;
@@ -2394,7 +2408,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, RTE_ETH_DEV_NO_OWNER)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
 
@@ -2442,7 +2461,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
 
 	/* set all ports to promiscuous mode by default */
-	RTE_ETH_FOREACH_DEV(port_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
 		rte_eth_promiscuous_enable(port_id);
 
 	/* Init metrics library */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1639d27..7db7d72 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -79,6 +79,8 @@
 #define NUMA_NO_CONFIG 0xFF
 #define UMA_NO_CONFIG  0xFF
 
+#define TESTPMD_OWNER_NAME "TestPMD"
+
 typedef uint8_t  lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
@@ -409,6 +411,7 @@ struct queue_stats_mappings {
  * nb_fwd_ports <= nb_cfg_ports <= nb_ports
  */
 extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
+extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
 extern portid_t nb_cfg_ports; /**< Number of configured ports. */
 extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
 extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-28 11:57 ` [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Matan Azrad
@ 2017-11-30 12:36   ` Neil Horman
  2017-11-30 13:24     ` Gaëtan Rivet
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-11-30 12:36 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Thomas Monjalon, Gaetan Rivet, Jingjing Wu, dev

On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> The ownership of a port is implicit in DPDK.
> Making it explicit is better from the next reasons:
> 1. It may be convenient for multi-process applications to know which
>    process is in charge of a port.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the user is not trying to use a port which is
> already managed by fail-safe.
> 
> Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> management of a device by different DPDK entities.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to owns valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>


This seems fairly racy.  What if one thread attempts to set ownership on a port
while another is checking it on another CPU in parallel?  It doesn't seem like
it will protect against that at all. It also doesn't protect against the
possibility of multiple threads attempting to poll for rx in parallel, which I
think was part of Thomas's original statement regarding port ownership (he
noted that the lockless design implied only a single thread should be allowed to
poll for receive or make configuration changes at a time).
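
To make the set/check race concrete, here is a minimal sketch. It is not code
from the patch: port_owner_id, claim_port_racy and claim_port_atomic are
hypothetical names, MAX_PORTS stands in for RTE_MAX_ETHPORTS, and C11 atomics
are only an assumption about which primitive could be used. The first function
mirrors the check-then-store shape of rte_eth_dev_owner_set() in the patch; the
second folds the check and the store into one compare-and-swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER  0	/* mirrors RTE_ETH_DEV_NO_OWNER in the patch */
#define MAX_PORTS 4	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical per-port owner table; zero-initialized, so all ownerless. */
static _Atomic uint16_t port_owner_id[MAX_PORTS];

/*
 * Check-then-store, the shape used by the patch. Two threads can both
 * observe NO_OWNER and both store, so both believe they own the port.
 */
int
claim_port_racy(uint16_t port, uint16_t owner)
{
	uint16_t cur = atomic_load(&port_owner_id[port]);

	if (cur != NO_OWNER && cur != owner)
		return -1;
	/* Race window: another thread may store its own id right here. */
	atomic_store(&port_owner_id[port], owner);
	return 0;
}

/*
 * One compare-and-swap: the port moves from NO_OWNER to owner atomically,
 * so at most one concurrent claimant can succeed.
 */
int
claim_port_atomic(uint16_t port, uint16_t owner)
{
	uint16_t expected = NO_OWNER;

	if (atomic_compare_exchange_strong(&port_owner_id[port],
					   &expected, owner))
		return 0;
	/* On failure, expected holds the current owner id. */
	return expected == owner ? 0 : -1;
}
```

Note that this only covers the ownership-claim race. It does nothing about
several threads polling the same rx queue in parallel, which would still need
to be handled separately.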

Neil

> ---
>  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
>  lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
>  lib/librte_ether/rte_ethdev_version.map |  12 ++++
>  4 files changed, 230 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> index 6a0c9f9..af639a1 100644
> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
>  
>  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
>  
> -Device Identification and Configuration
> +Device Identification, Ownership  and Configuration
>  ---------------------------------------
>  
>  Device Identification
> @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
>  *   A port name used to designate the port in console messages, for administration or debugging purposes.
>      For ease of use, the port name includes the port index.
>  
> +Port Ownership
> +~~~~~~~~~~~~~
> +The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
> +The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
> +Allowing this should prevent any multiple management of Ethernet port by different entities.
> +
> +.. note::
> +
> +    It is the DPDK entity responsibility either to check the port owner before using it or to set the port owner to prevent others from using it.
> +
>  Device Configuration
>  ~~~~~~~~~~~~~~~~~~~~
>  
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 2d754d9..836991e 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -71,6 +71,7 @@
>  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
>  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
>  static struct rte_eth_dev_data *rte_eth_dev_data;
> +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
>  static uint8_t eth_dev_last_created_port;
>  
>  /* spinlock for eth device callbacks */
> @@ -278,6 +279,7 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
>  
> +	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
>  	eth_dev->state = RTE_ETH_DEV_UNUSED;
>  	return 0;
>  }
> @@ -293,6 +295,125 @@ struct rte_eth_dev *
>  		return 1;
>  }
>  
> +static int
> +rte_eth_is_valid_owner_id(uint16_t owner_id)
> +{
> +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> +	    rte_eth_next_owner_id <= owner_id)) {
> +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +uint16_t
> +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> +{
> +	while (port_id < RTE_MAX_ETHPORTS &&
> +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> +		port_id++;
> +
> +	if (port_id >= RTE_MAX_ETHPORTS)
> +		return RTE_MAX_ETHPORTS;
> +
> +	return port_id;
> +}
> +
> +int
> +rte_eth_dev_owner_new(uint16_t *owner_id)
> +{
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> +		return -EPERM;
> +	}
> +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> +		return -EUSERS;
> +	}
> +	*owner_id = rte_eth_next_owner_id++;
> +	return 0;
> +}
> +
> +int
> +rte_eth_dev_owner_set(const uint16_t port_id,
> +		      const struct rte_eth_dev_owner *owner)
> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	int ret;
> +
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> +		return -EPERM;
> +	}
> +	if (!rte_eth_is_valid_owner_id(owner->id))
> +		return -EINVAL;
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> +	    port_owner->id != owner->id) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> +			port_id, port_owner->name, port_owner->id);
> +		return -EPERM;
> +	}
> +	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> +		       owner->name);
> +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> +		return -EINVAL;
> +	}
> +	port_owner->id = owner->id;
> +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> +			    owner->name, owner->id);
> +	return 0;
> +}
> +
> +int
> +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> +	if (!rte_eth_is_valid_owner_id(owner_id))
> +		return -EINVAL;
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != owner_id) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
> +			port_id, port_owner->name, port_owner->id, owner_id);
> +		return -EPERM;
> +	}
> +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> +			port_owner->name, port_owner->id);
> +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> +	return 0;
> +}
> +
> +void
> +rte_eth_dev_owner_delete(const uint16_t owner_id)
> +{
> +	uint16_t p;
> +
> +	if (!rte_eth_is_valid_owner_id(owner_id))
> +		return;
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> +		memset(&rte_eth_devices[p].data->owner, 0,
> +		       sizeof(struct rte_eth_dev_owner));
> +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> +			    "%05d identifier has removed.\n", owner_id);
> +}
> +
> +const struct rte_eth_dev_owner *
> +rte_eth_dev_owner_get(const uint16_t port_id)
> +{
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> +	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
> +		return NULL;
> +	return &rte_eth_devices[port_id].data->owner;
> +}
> +
>  int
>  rte_eth_dev_socket_id(uint16_t port_id)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 341c2d6..f54c26d 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
>  
>  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
>  
> +#define RTE_ETH_DEV_NO_OWNER 0
> +
> +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> +
> +struct rte_eth_dev_owner {
> +	uint16_t id; /**< The owner unique identifier. */
> +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> +};
> +
>  /**
>   * @internal
>   * The data part, with no function pointers, associated with each ethernet device.
> @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
>  	int numa_node;  /**< NUMA node connection */
>  	struct rte_vlan_filter_conf vlan_filter_conf;
>  	/**< VLAN filter configuration. */
> +	struct rte_eth_dev_owner owner; /**< The port owner. */
>  };
>  
>  /** Device supports link state interrupt */
> @@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
>  
>  
>  /**
> + * Iterates over valid ethdev ports owned by a specific owner.
> + *
> + * @param port_id
> + *   The id of the next possible valid owned port.
> + * @param	owner_id
> + *  The owner identifier.
> + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> + * @return
> + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> + */
> +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> + */
> +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> +	for (p = rte_eth_find_next_owned_by(0, o); \
> +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> +	     p = rte_eth_find_next_owned_by(p + 1, o))
> +
> +/**
> + * Get a new unique owner identifier.
> + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> + * to avoid multiple management of device by different entities.
> + *
> + * @param	owner_id
> + *   Owner identifier pointer.
> + * @return
> + *   Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_new(uint16_t *owner_id);
> +
> +/**
> + * Set an Ethernet device owner.
> + *
> + * @param	port_id
> + *  The identifier of the port to own.
> + * @param	owner
> + *  The owner pointer.
> + * @return
> + *  Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_set(const uint16_t port_id,
> +			  const struct rte_eth_dev_owner *owner);
> +
> +/**
> + * Remove Ethernet device owner to make the device ownerless.
> + *
> + * @param	port_id
> + *  The identifier of port to make ownerless.
> + * @param	owner
> + *  The owner identifier.
> + * @return
> + *  0 on success, negative errno value on error.
> + */
> +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Remove owner from all Ethernet devices owned by a specific owner.
> + *
> + * @param	owner
> + *  The owner identifier.
> + */
> +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> +
> +/**
> + * Get the owner of an Ethernet device.
> + *
> + * @param	port_id
> + *  The port identifier.
> + * @return
> + *  NULL when the device is ownerless, else the device owner pointer.
> + */
> +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
> +
> +/**
>   * Get the total number of Ethernet devices that have been successfully
>   * initialized by the matching Ethernet driver during the PCI probing phase
>   * and that are available for applications to use. These devices must be
> diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..7d07edb 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -198,6 +198,18 @@ DPDK_17.11 {
>  
>  } DPDK_17.08;
>  
> +DPDK_18.02 {
> +	global:
> +
> +	rte_eth_find_next_owned_by;
> +	rte_eth_dev_owner_new;
> +	rte_eth_dev_owner_set;
> +	rte_eth_dev_owner_remove;
> +	rte_eth_dev_owner_delete;
> +	rte_eth_dev_owner_get;
> +
> +} DPDK_17.11;
> +
>  EXPERIMENTAL {
>  	global:
>  
> -- 
> 1.8.3.1
> 
> 


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-30 12:36   ` Neil Horman
@ 2017-11-30 13:24     ` Gaëtan Rivet
  2017-11-30 14:30       ` Matan Azrad
  2017-12-01 12:09       ` Neil Horman
  0 siblings, 2 replies; 214+ messages in thread
From: Gaëtan Rivet @ 2017-11-30 13:24 UTC (permalink / raw)
  To: Neil Horman; +Cc: Matan Azrad, Thomas Monjalon, Jingjing Wu, dev

Hello Matan, Neil,

I like the port ownership concept. I think it is needed to clarify some
operations and should be useful to several subsystems.

This patch could certainly be subdivided, however, and your current 1/5
should probably come after this one.

Some comments inline.

On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > The ownership of a port is implicit in DPDK.
> > Making it explicit is better from the next reasons:
> > 1. It may be convenient for multi-process applications to know which
> >    process is in charge of a port.
> > 2. A library could work on top of a port.
> > 3. A port can work on top of another port.
> > 
> > Also in the fail-safe case, an issue has been met in testpmd.
> > We need to check that the user is not trying to use a port which is
> > already managed by fail-safe.
> > 
> > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > management of a device by different DPDK entities.
> > 
> > A port owner is built from owner id(number) and owner name(string) while
> > the owner id must be unique to distinguish between two identical entity
> > instances and the owner name can be any name.
> > The name helps to logically recognize the owner by different DPDK
> > entities and allows easy debug.
> > Each DPDK entity can allocate an owner unique identifier and can use it
> > and its preferred name to owns valid ethdev ports.
> > Each DPDK entity can get any port owner status to decide if it can
> > manage the port or not.
> > 
> > The current ethdev internal port management is not affected by this
> > feature.
> > 

The internal port management is not affected, but the external interface
is. In order to respect port ownership, applications are forced
to modify their port iterator, as shown by your testpmd patch.

I think it would be better to modify the current RTE_ETH_FOREACH_DEV to call
RTE_ETH_FOREACH_DEV_OWNED_BY, and introduce a default owner that would
represent the application itself (probably with the ID 0 and an owner
string ""). Only with specific additional configuration should this
default subset of ethdev be divided.

This would make this evolution seamless for applications, at no cost to
the complexity of the design.
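
A rough sketch of what I mean, using a mock port table rather than the real
ethdev structures. All names here (struct mock_port, mock_ports,
find_next_owned_by, FOREACH_DEV, FOREACH_DEV_OWNED_BY, DEFAULT_OWNER,
count_app_ports) are hypothetical and only illustrate the shape of the idea;
the actual macro and the default owner ID would have to be settled in the
patch.

```c
#include <stdint.h>

#define MAX_PORTS     8	/* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define DEFAULT_OWNER 0	/* assumed ID for "the application itself" */

/* Mock of the per-port state that matters to the iterator. */
struct mock_port {
	int attached;		/* non-zero when the port is usable */
	uint16_t owner_id;	/* DEFAULT_OWNER unless explicitly claimed */
};

struct mock_port mock_ports[MAX_PORTS];

/* Return the first attached port >= port_id owned by owner_id. */
uint16_t
find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (!mock_ports[port_id].attached ||
		mock_ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

#define FOREACH_DEV_OWNED_BY(p, o) \
	for (p = find_next_owned_by(0, o); \
	     (unsigned int)p < (unsigned int)MAX_PORTS; \
	     p = find_next_owned_by(p + 1, o))

/* The existing iterator becomes the default-owner case of the new one. */
#define FOREACH_DEV(p) FOREACH_DEV_OWNED_BY(p, DEFAULT_OWNER)

/* Unmodified application code: sees only ports nobody else has claimed. */
unsigned int
count_app_ports(void)
{
	uint16_t p;
	unsigned int n = 0;

	FOREACH_DEV(p)
		n++;
	return n;
}
```

With this shape, an application that never touches the ownership API keeps
iterating with the plain macro, and a port claimed by another entity (fail-safe
for instance) simply drops out of its view.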

> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> 
> 
> This seems fairly racy.  What if one thread attempts to set ownership on a port
> while another is checking it on another CPU in parallel?  It doesn't seem like
> it will protect against that at all. It also doesn't protect against the
> possibility of multiple threads attempting to poll for rx in parallel, which I
> think was part of Thomas's original statement regarding port ownership (he
> noted that the lockless design implied only a single thread should be allowed to
> poll for receive or make configuration changes at a time).
> 
> Neil
> 

Isn't this race already there for any configuration operation / polling
function? The DPDK architecture expects applications to solve it. Why should
port ownership be designed differently from other DPDK components?

Embedding checks for port ownership within operations will force
everyone to bear their costs, even those not interested in using this
API. These checks should be kept outside, within the entity claiming
ownership of the port, in the form of using the proper port iterator
IMO.
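
As an illustration of keeping the check outside the operations, a minimal
sketch under stated assumptions: owner_of is a hypothetical per-port owner
table, and collect_owned_ports is a hypothetical helper that the owning entity
would run once on its control path. The datapath then walks the resulting list
and never looks at ownership again.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical per-port owner IDs; 0 means ownerless. */
uint16_t owner_of[MAX_PORTS];

/*
 * Control path: filter the ports once and store the ones owned by my_id
 * into out[] (at most cap entries). Returns the number of ports stored.
 */
size_t
collect_owned_ports(uint16_t my_id, uint16_t *out, size_t cap)
{
	size_t n = 0;
	uint16_t p;

	for (p = 0; p < MAX_PORTS && n < cap; p++)
		if (owner_of[p] == my_id)
			out[n++] = p;
	return n;
}
```

The per-packet functions stay untouched, so entities that do not care about
ownership pay nothing for it.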

> > ---
> >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> >  lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> >  4 files changed, 230 insertions(+), 1 deletion(-)
> > 
> > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> > index 6a0c9f9..af639a1 100644
> > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
> >  
> >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
> >  
> > -Device Identification and Configuration
> > +Device Identification, Ownership  and Configuration
> >  ---------------------------------------
> >  
> >  Device Identification
> > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
> >  *   A port name used to designate the port in console messages, for administration or debugging purposes.
> >      For ease of use, the port name includes the port index.
> >  
> > +Port Ownership
> > +~~~~~~~~~~~~~
> > +The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
> > +The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
> > +Allowing this should prevent any multiple management of Ethernet port by different entities.
> > +
> > +.. note::
> > +
> > +    It is the DPDK entity responsibility either to check the port owner before using it or to set the port owner to prevent others from using it.
> > +
> >  Device Configuration
> >  ~~~~~~~~~~~~~~~~~~~~
> >  
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index 2d754d9..836991e 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -71,6 +71,7 @@
> >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> >  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> >  static uint8_t eth_dev_last_created_port;
> >  
> >  /* spinlock for eth device callbacks */
> > @@ -278,6 +279,7 @@ struct rte_eth_dev *
> >  	if (eth_dev == NULL)
> >  		return -EINVAL;
> >  
> > +	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
> >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> >  	return 0;
> >  }
> > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> >  		return 1;
> >  }
> >  
> > +static int
> > +rte_eth_is_valid_owner_id(uint16_t owner_id)
> > +{
> > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > +	    rte_eth_next_owner_id <= owner_id)) {
> > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > +		return 0;
> > +	}
> > +	return 1;
> > +}
> > +
> > +uint16_t
> > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > +{
> > +	while (port_id < RTE_MAX_ETHPORTS &&
> > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > +		port_id++;
> > +
> > +	if (port_id >= RTE_MAX_ETHPORTS)
> > +		return RTE_MAX_ETHPORTS;
> > +
> > +	return port_id;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_new(uint16_t *owner_id)
> > +{
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > +		return -EPERM;
> > +	}
> > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > +		return -EUSERS;
> > +	}
> > +	*owner_id = rte_eth_next_owner_id++;
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_set(const uint16_t port_id,
> > +		      const struct rte_eth_dev_owner *owner)
> > +{
> > +	struct rte_eth_dev_owner *port_owner;
> > +	int ret;
> > +
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > +		return -EPERM;
> > +	}
> > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > +		return -EINVAL;
> > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > +	    port_owner->id != owner->id) {
> > +		RTE_LOG(ERR, EAL,
> > +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> > +			port_id, port_owner->name, port_owner->id);
> > +		return -EPERM;
> > +	}
> > +	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > +		       owner->name);
> > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > +		return -EINVAL;
> > +	}
> > +	port_owner->id = owner->id;
> > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > +			    owner->name, owner->id);
> > +	return 0;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
> > +{
> > +	struct rte_eth_dev_owner *port_owner;
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > +		return -EINVAL;
> > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > +	if (port_owner->id != owner_id) {
> > +		RTE_LOG(ERR, EAL,
> > +			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
> > +			port_id, port_owner->name, port_owner->id, owner_id);
> > +		return -EPERM;
> > +	}
> > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> > +			port_owner->name, port_owner->id);
> > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > +	return 0;
> > +}
> > +
> > +void
> > +rte_eth_dev_owner_delete(const uint16_t owner_id)
> > +{
> > +	uint16_t p;
> > +
> > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > +		return;
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > +		memset(&rte_eth_devices[p].data->owner, 0,
> > +		       sizeof(struct rte_eth_dev_owner));
> > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > +			    "%05d identifier has removed.\n", owner_id);
> > +}
> > +
> > +const struct rte_eth_dev_owner *
> > +rte_eth_dev_owner_get(const uint16_t port_id)
> > +{
> > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > +	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
> > +		return NULL;
> > +	return &rte_eth_devices[port_id].data->owner;
> > +}
> > +
> >  int
> >  rte_eth_dev_socket_id(uint16_t port_id)
> >  {
> > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > index 341c2d6..f54c26d 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> >  
> >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> >  
> > +#define RTE_ETH_DEV_NO_OWNER 0
> > +
> > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > +
> > +struct rte_eth_dev_owner {
> > +	uint16_t id; /**< The owner unique identifier. */
> > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> > +};
> > +
> >  /**
> >   * @internal
> >   * The data part, with no function pointers, associated with each ethernet device.
> > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> >  	int numa_node;  /**< NUMA node connection */
> >  	struct rte_vlan_filter_conf vlan_filter_conf;
> >  	/**< VLAN filter configuration. */
> > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> >  };
> >  
> >  /** Device supports link state interrupt */
> > @@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
> >  
> >  
> >  /**
> > + * Iterates over valid ethdev ports owned by a specific owner.
> > + *
> > + * @param port_id
> > + *   The id of the next possible valid owned port.
> > + * @param	owner_id
> > + *  The owner identifier.
> > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> > + * @return
> > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> > + */
> > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> > +
> > +/**
> > + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> > + */
> > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > +
> > +/**
> > + * Get a new unique owner identifier.
> > + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> > + * to avoid multiple management of device by different entities.
> > + *
> > + * @param	owner_id
> > + *   Owner identifier pointer.
> > + * @return
> > + *   Negative errno value on error, 0 on success.
> > + */
> > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > +
> > +/**
> > + * Set an Ethernet device owner.
> > + *
> > + * @param	port_id
> > + *  The identifier of the port to own.
> > + * @param	owner
> > + *  The owner pointer.
> > + * @return
> > + *  Negative errno value on error, 0 on success.
> > + */
> > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > +			  const struct rte_eth_dev_owner *owner);
> > +
> > +/**
> > + * Remove Ethernet device owner to make the device ownerless.
> > + *
> > + * @param	port_id
> > + *  The identifier of port to make ownerless.
> > + * @param	owner
> > + *  The owner identifier.
> > + * @return
> > + *  0 on success, negative errno value on error.
> > + */
> > +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
> > +
> > +/**
> > + * Remove owner from all Ethernet devices owned by a specific owner.
> > + *
> > + * @param	owner
> > + *  The owner identifier.
> > + */
> > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > +
> > +/**
> > + * Get the owner of an Ethernet device.
> > + *
> > + * @param	port_id
> > + *  The port identifier.
> > + * @return
> > + *  NULL when the device is ownerless, else the device owner pointer.
> > + */
> > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
> > +
> > +/**
> >   * Get the total number of Ethernet devices that have been successfully
> >   * initialized by the matching Ethernet driver during the PCI probing phase
> >   * and that are available for applications to use. These devices must be
> > diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> > index e9681ac..7d07edb 100644
> > --- a/lib/librte_ether/rte_ethdev_version.map
> > +++ b/lib/librte_ether/rte_ethdev_version.map
> > @@ -198,6 +198,18 @@ DPDK_17.11 {
> >  
> >  } DPDK_17.08;
> >  
> > +DPDK_18.02 {
> > +	global:
> > +
> > +	rte_eth_find_next_owned_by;
> > +	rte_eth_dev_owner_new;
> > +	rte_eth_dev_owner_set;
> > +	rte_eth_dev_owner_remove;
> > +	rte_eth_dev_owner_delete;
> > +	rte_eth_dev_owner_get;
> > +
> > +} DPDK_17.11;
> > +
> >  EXPERIMENTAL {
> >  	global:
> >  
> > -- 
> > 1.8.3.1
> > 
> > 

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-30 13:24     ` Gaëtan Rivet
@ 2017-11-30 14:30       ` Matan Azrad
  2017-11-30 15:09         ` Gaëtan Rivet
  2017-12-01 12:09       ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-11-30 14:30 UTC (permalink / raw)
  To: Gaëtan Rivet, Neil Horman; +Cc: Thomas Monjalon, Jingjing Wu, dev

Hi all

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, November 30, 2017 3:25 PM
> To: Neil Horman <nhorman@tuxdriver.com>
> Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hello Matan, Neil,
> 
> I like the port ownership concept. I think it is needed to clarify some
> operations and should be useful to several subsystems.
> 
> This patch could certainly be sub-divided however, and your current 1/5
> should probably come after this one.

Can you suggest how to divide it?

1/5 could actually be outside of this series; it is just better behavior to use the right function to release a port :)

> 
> Some comments inline.
> 
> On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > The ownership of a port is implicit in DPDK.
> > > Making it explicit is better from the next reasons:
> > > 1. It may be convenient for multi-process applications to know which
> > >    process is in charge of a port.
> > > 2. A library could work on top of a port.
> > > 3. A port can work on top of another port.
> > >
> > > Also in the fail-safe case, an issue has been met in testpmd.
> > > We need to check that the user is not trying to use a port which is
> > > already managed by fail-safe.
> > >
> > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > management of a device by different DPDK entities.
> > >
> > > A port owner is built from owner id(number) and owner name(string)
> > > while the owner id must be unique to distinguish between two
> > > identical entity instances and the owner name can be any name.
> > > The name helps to logically recognize the owner by different DPDK
> > > entities and allows easy debug.
> > > Each DPDK entity can allocate an owner unique identifier and can use
> > > it and its preferred name to owns valid ethdev ports.
> > > Each DPDK entity can get any port owner status to decide if it can
> > > manage the port or not.
> > >
> > > The current ethdev internal port management is not affected by this
> > > feature.
> > >
> 
> The internal port management is not affected, but the external interface is,
> however. In order to respect port ownership, applications are forced to
> modify their port iterator, as shown by your testpmd patch.
> 
> I think it would be better to modify the current RTE_ETH_FOREACH_DEV to
> call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> would represent the application itself (probably with the ID 0 and an owner
> string ""). Only with specific additional configuration should this default
> subset of ethdev be divided.
> 
> This would make this evolution seamless for applications, at no cost to the
> complexity of the design.

As you can see in the patch code and in the testpmd example, I added an option to iterate over all valid ownerless ports by passing owner_id = 0, which should be the more relevant case.
So maybe RTE_ETH_FOREACH_DEV should be changed to do this through the new iterator.
This way current applications don't need to create their own owners, but the ABI will be broken.

Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be unexposed or just removed,
and the DEFERRED state should be removed as well.
I don't really see any use for iterating over all valid ports by DPDK entities other than ethdev itself.
I just don't want to break it now.

> 
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> >
> >
> > This seems fairly racy.  What if one thread attempts to set ownership
> > on a port, while another is checking it on another cpu in parallel.
> > It doesn't seem like it will protect against that at all. It also
> > doesn't protect against the possibility of multiple threads attempting
> > to poll for rx in parallel, which I think was part of Thomas's
> > origional statement regarding port ownership (he noted that the
> > lockless design implied only a single thread should be allowed to poll for
> receive or make configuration changes at a time.
> >
> > Neil
> >
> 
> Isn't this race already there for any configuration operation / polling
> function? The DPDK arch is expecting applications to solve it. Why should port
> ownership be designed differently from other DPDK components?
> 
> Embedding checks for port ownership within operations will force everyone
> to bear their costs, even those not interested in using this API. These checks
> should be kept outside, within the entity claiming ownership of the port, in
> the form of using the proper port iterator IMO.

As Gaetan said, there are races in many places in ethdev; think about new port creation in parallel.
If ethdev becomes race safe one day, then port ownership should be too.
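
For illustration only: the race under discussion comes from the check ("is the port ownerless?") and the write ("store my id") being two separate steps in rte_eth_dev_owner_set(). A minimal sketch of what a race-safe claim could look like, assuming C11 atomics and a single simplified owner field. The names port_owner_id, NO_OWNER, claim_port and release_port are hypothetical and are not part of this patch or of ethdev:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_OWNER 0 /* stand-in for RTE_ETH_DEV_NO_OWNER */

/* Hypothetical stand-in for one port's owner id; not the ethdev layout. */
static _Atomic uint16_t port_owner_id = NO_OWNER;

/*
 * Claim the port only if it is currently ownerless.
 * The compare-and-swap makes "check" and "set" one indivisible step,
 * so at most one of several concurrent callers can succeed.
 */
static bool
claim_port(uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	if (owner_id == NO_OWNER)
		return false;
	return atomic_compare_exchange_strong(&port_owner_id, &expected,
					      owner_id);
}

/* Release the port only if it is currently held by owner_id. */
static bool
release_port(uint16_t owner_id)
{
	uint16_t expected = owner_id;

	if (owner_id == NO_OWNER)
		return false;
	return atomic_compare_exchange_strong(&port_owner_id, &expected,
					      NO_OWNER);
}
```

Unlike the patch, this sketch rejects a second claim by the same owner and does not handle the owner name at all; it only shows the atomic check-and-set idea.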

> 
> > > ---
> > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > >  lib/librte_ether/rte_ethdev.c           | 121
> ++++++++++++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > index 6a0c9f9..af639a1 100644
> > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > lock. This PMD feature found in som
> > >
> > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> capability probing details.
> > >
> > > -Device Identification and Configuration
> > > +Device Identification, Ownership  and Configuration
> > >  ---------------------------------------
> > >
> > >  Device Identification
> > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> assigned two other identifiers:
> > >  *   A port name used to designate the port in console messages, for
> administration or debugging purposes.
> > >      For ease of use, the port name includes the port index.
> > >
> > > +Port Ownership
> > > +~~~~~~~~~~~~~
> > > +The Ethernet devices ports can be owned by a single DPDK entity
> (application, library, PMD, process, etc).
> > > +The ownership mechanism is controlled by ethdev APIs and allows to
> set/remove/get a port owner by DPDK entities.
> > > +Allowing this should prevent any multiple management of Ethernet port
> by different entities.
> > > +
> > > +.. note::
> > > +
> > > +    It is the DPDK entity responsibility either to check the port owner
> before using it or to set the port owner to prevent others from using it.
> > > +
> > >  Device Configuration
> > >  ~~~~~~~~~~~~~~~~~~~~
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -71,6 +71,7 @@
> > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER +
> 1;
> > >  static uint8_t eth_dev_last_created_port;
> > >
> > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@ struct
> > > rte_eth_dev *
> > >  	if (eth_dev == NULL)
> > >  		return -EINVAL;
> > >
> > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > +rte_eth_dev_owner));
> > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > >  	return 0;
> > >  }
> > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > >  		return 1;
> > >  }
> > >
> > > +static int
> > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > +		return 0;
> > > +	}
> > > +	return 1;
> > > +}
> > > +
> > > +uint16_t
> > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > +owner_id) {
> > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > +		port_id++;
> > > +
> > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > +		return RTE_MAX_ETHPORTS;
> > > +
> > > +	return port_id;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> Ethernet port owners.\n");
> > > +		return -EUSERS;
> > > +	}
> > > +	*owner_id = rte_eth_next_owner_id++;
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > +		      const struct rte_eth_dev_owner *owner) {
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	int ret;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    port_owner->id != owner->id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot set owner to port %d already owned by
> %s_%05d.\n",
> > > +			port_id, port_owner->name, port_owner->id);
> > > +		return -EPERM;
> > > +	}
> > > +	ret = snprintf(port_owner->name,
> RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > +		       owner->name);
> > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > +		memset(port_owner->name, 0,
> RTE_ETH_MAX_OWNER_NAME_LEN);
> > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > +		return -EINVAL;
> > > +	}
> > > +	port_owner->id = owner->id;
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > +			    owner->name, owner->id);
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > +owner_id) {
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != owner_id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot remove port %d owner %s_%05d by
> different owner id %5d.\n",
> > > +			port_id, port_owner->name, port_owner->id,
> owner_id);
> > > +		return -EPERM;
> > > +	}
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> removed.\n", port_id,
> > > +			port_owner->name, port_owner->id);
> > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > +	return 0;
> > > +}
> > > +
> > > +void
> > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > +	uint16_t p;
> > > +
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return;
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > +		       sizeof(struct rte_eth_dev_owner));
> > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > +			    "%05d identifier has removed.\n", owner_id); }
> > > +
> > > +const struct rte_eth_dev_owner *
> > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > +	if (rte_eth_devices[port_id].data->owner.id ==
> RTE_ETH_DEV_NO_OWNER)
> > > +		return NULL;
> > > +	return &rte_eth_devices[port_id].data->owner;
> > > +}
> > > +
> > >  int
> > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 341c2d6..f54c26d 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > >
> > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > >
> > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > +
> > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > +
> > > +struct rte_eth_dev_owner {
> > > +	uint16_t id; /**< The owner unique identifier. */
> > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> name. */ };
> > > +
> > >  /**
> > >   * @internal
> > >   * The data part, with no function pointers, associated with each
> ethernet device.
> > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > >  	int numa_node;  /**< NUMA node connection */
> > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > >  	/**< VLAN filter configuration. */
> > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > >  };
> > >
> > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82 @@
> > > struct rte_eth_dev_data {
> > >
> > >
> > >  /**
> > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > + *
> > > + * @param port_id
> > > + *   The id of the next possible valid owned port.
> > > + * @param	owner_id
> > > + *  The owner identifier.
> > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> ports.
> > > + * @return
> > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there
> is none.
> > > + */
> > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > +uint16_t owner_id);
> > > +
> > > +/**
> > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> owner.
> > > + */
> > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > +
> > > +/**
> > > + * Get a new unique owner identifier.
> > > + * An owner identifier is used to owns Ethernet devices by only one
> > > +DPDK entity
> > > + * to avoid multiple management of device by different entities.
> > > + *
> > > + * @param	owner_id
> > > + *   Owner identifier pointer.
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > +
> > > +/**
> > > + * Set an Ethernet device owner.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of the port to own.
> > > + * @param	owner
> > > + *  The owner pointer.
> > > + * @return
> > > + *  Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > +			  const struct rte_eth_dev_owner *owner);
> > > +
> > > +/**
> > > + * Remove Ethernet device owner to make the device ownerless.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of port to make ownerless.
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + * @return
> > > + *  0 on success, negative errno value on error.
> > > + */
> > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > +owner_id);
> > > +
> > > +/**
> > > + * Remove owner from all Ethernet devices owned by a specific owner.
> > > + *
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + */
> > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Get the owner of an Ethernet device.
> > > + *
> > > + * @param	port_id
> > > + *  The port identifier.
> > > + * @return
> > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > + */
> > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > +uint16_t port_id);
> > > +
> > > +/**
> > >   * Get the total number of Ethernet devices that have been successfully
> > >   * initialized by the matching Ethernet driver during the PCI probing
> phase
> > >   * and that are available for applications to use. These devices
> > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > b/lib/librte_ether/rte_ethdev_version.map
> > > index e9681ac..7d07edb 100644
> > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > >
> > >  } DPDK_17.08;
> > >
> > > +DPDK_18.02 {
> > > +	global:
> > > +
> > > +	rte_eth_find_next_owned_by;
> > > +	rte_eth_dev_owner_new;
> > > +	rte_eth_dev_owner_set;
> > > +	rte_eth_dev_owner_remove;
> > > +	rte_eth_dev_owner_delete;
> > > +	rte_eth_dev_owner_get;
> > > +
> > > +} DPDK_17.11;
> > > +
> > >  EXPERIMENTAL {
> > >  	global:
> > >
> > > --
> > > 1.8.3.1
> > >
> > >
> 
> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-30 14:30       ` Matan Azrad
@ 2017-11-30 15:09         ` Gaëtan Rivet
  2017-11-30 15:43           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2017-11-30 15:09 UTC (permalink / raw)
  To: Matan Azrad; +Cc: Neil Horman, Thomas Monjalon, Jingjing Wu, dev

On Thu, Nov 30, 2017 at 02:30:20PM +0000, Matan Azrad wrote:
> Hi all
> 
> > -----Original Message-----
> > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > Sent: Thursday, November 30, 2017 3:25 PM
> > To: Neil Horman <nhorman@tuxdriver.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > Hello Matan, Neil,
> > 
> > I like the port ownership concept. I think it is needed to clarify some
> > operations and should be useful to several subsystems.
> > 
> > This patch could certainly be sub-divided however, and your current 1/5
> > should probably come after this one.
> 
> Can you suggest how to divide it?
> 

Adding first the API to add / remove owners, then in a second patch
set / get / unset. (by the way, remove / delete is pretty confusing, I'd
suggest renaming those.) You can also separate the introduction of the
new owner-wise iterator.

Ultimately, you are the author, it's your job to help us review your
work.

> 1/5 could be actually outside of this series, it is just better behavior to use the right function to do release port :)
> 
> > 
> > Some comments inline.
> > 
> > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > The ownership of a port is implicit in DPDK.
> > > > Making it explicit is better from the next reasons:
> > > > 1. It may be convenient for multi-process applications to know which
> > > >    process is in charge of a port.
> > > > 2. A library could work on top of a port.
> > > > 3. A port can work on top of another port.
> > > >
> > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > We need to check that the user is not trying to use a port which is
> > > > already managed by fail-safe.
> > > >
> > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > management of a device by different DPDK entities.
> > > >
> > > > A port owner is built from owner id(number) and owner name(string)
> > > > while the owner id must be unique to distinguish between two
> > > > identical entity instances and the owner name can be any name.
> > > > The name helps to logically recognize the owner by different DPDK
> > > > entities and allows easy debug.
> > > > Each DPDK entity can allocate an owner unique identifier and can use
> > > > it and its preferred name to owns valid ethdev ports.
> > > > Each DPDK entity can get any port owner status to decide if it can
> > > > manage the port or not.
> > > >
> > > > The current ethdev internal port management is not affected by this
> > > > feature.
> > > >
> > 
> > The internal port management is not affected, but the external interface is,
> > however. In order to respect port ownership, applications are forced to
> > modify their port iterator, as shown by your testpmd patch.
> > 
> > I think it would be better to modify the current RTE_ETH_FOREACH_DEV to
> > call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > would represent the application itself (probably with the ID 0 and an owner
> > string ""). Only with specific additional configuration should this default
> > subset of ethdev be divided.
> > 
> > This would make this evolution seamless for applications, at no cost to the
> > complexity of the design.
> 
> As you can see in patch code and in testpmd example I added option to iterate
> over all valid ownerless ports which should be more relevant by owner_id = 0.
> So maybe the RTE_ETH_FOREACH_DEV should be changed to run this by the new iterator.

That is precisely what I am suggesting.
Ideally, you should not have to change anything in testpmd, beside some
bug fixing regarding port iteration to avoid those with a specific
owner.

RTE_ETH_FOREACH_DEV must stay valid, and should iterate over ownerless
ports (read: ports owned by the default owner).
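
A minimal sketch of that idea, assuming a simplified stand-in for the port table. The names ports, MAX_PORTS, NO_OWNER, find_next_owned_by, FOREACH_DEV_OWNED_BY and FOREACH_DEV are hypothetical shortened versions used only for illustration; the search loop mirrors rte_eth_find_next_owned_by() from the patch:

```c
#include <stdint.h>

#define MAX_PORTS 8 /* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER 0  /* stand-in for RTE_ETH_DEV_NO_OWNER */

/* Simplified stand-in for the ethdev port table; not the real layout. */
struct port {
	int attached;      /* nonzero when the port is valid */
	uint16_t owner_id; /* NO_OWNER when nobody claimed the port */
};

static struct port ports[MAX_PORTS];

/* Same search logic as rte_eth_find_next_owned_by() in the patch. */
static uint16_t
find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (!ports[port_id].attached ||
		ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

#define FOREACH_DEV_OWNED_BY(p, o) \
	for (p = find_next_owned_by(0, o); \
	     (unsigned int)p < (unsigned int)MAX_PORTS; \
	     p = find_next_owned_by(p + 1, o))

/* The legacy iterator expressed as "ports owned by the default owner". */
#define FOREACH_DEV(p) FOREACH_DEV_OWNED_BY(p, NO_OWNER)

/* Count the ports an unmodified application would still see. */
static unsigned int
count_default_ports(void)
{
	unsigned int n = 0;
	uint16_t p;

	FOREACH_DEV(p)
		n++;
	return n;
}
```

With this shape, an application that never touches the ownership API keeps its loop unchanged and simply stops seeing ports that some other entity has claimed.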

> By this way current applications don't need to build their owners but the ABI will be broken.
> 

ABI is broken anyway as you will add the owner to rte_eth_dev_data.

> Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be unexposed or just removed,

I don't think so. Using RTE_ETH_FOREACH_DEV should allow keeping the
current behavior of iterating over ownerless ports. Applications that do
not care for this API should not have to change anything in their code.

> also the DEFFERED state should be removed,

Of course.

> I don't really see any usage to iterate over all valid ports by DPDK entities different from ethdev itself.
> I just don't want to break it now.
> 

[snip]

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-30 15:09         ` Gaëtan Rivet
@ 2017-11-30 15:43           ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2017-11-30 15:43 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Neil Horman, Thomas Monjalon, Jingjing Wu, dev

Hi Gaetan

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> Sent: Thursday, November 30, 2017 5:10 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Neil Horman <nhorman@tuxdriver.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Nov 30, 2017 at 02:30:20PM +0000, Matan Azrad wrote:
> > Hi all
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
> > > Sent: Thursday, November 30, 2017 3:25 PM
> > > To: Neil Horman <nhorman@tuxdriver.com>
> > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hello Matan, Neil,
> > >
> > > I like the port ownership concept. I think it is needed to clarify
> > > some operations and should be useful to several subsystems.
> > >
> > > This patch could certainly be sub-divided however, and your current
> > > 1/5 should probably come after this one.
> >
> > Can you suggest how to divide it?
> >
> 
> Adding first the API to add / remove owners, then in a second patch set / get
> / unset. (by the way, remove / delete is pretty confusing, I'd suggest
> renaming those.) You can also separate the introduction of the new owner-
> wise iterator.
>
> Ultimately, you are the author, it's your job to help us review your work.
> 

When you suggest an improvement, I think you need to propose another method or idea.
The author has probably thought about it and arrived at his own conclusions.
Exactly as you are doing now with the naming.
If you have a specific question, I'm here to answer :)

I agree with the name unset instead of remove; I will change it in V2.
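
A minimal sketch of the owner lifecycle with that rename applied, assuming a simplified in-memory port table. The short names (owner_new, owner_set, owner_get, owner_unset, owner_delete) and the table are hypothetical stand-ins for illustration, not the V2 API:

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4           /* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER 0            /* stand-in for RTE_ETH_DEV_NO_OWNER */
#define MAX_OWNER_NAME_LEN 64 /* stand-in for RTE_ETH_MAX_OWNER_NAME_LEN */

struct owner {
	uint16_t id;
	char name[MAX_OWNER_NAME_LEN];
};

/* Hypothetical table holding only each port's owner; not the ethdev layout. */
static struct owner port_owner[MAX_PORTS];
static uint16_t next_owner_id = NO_OWNER + 1;

/* Same rule as rte_eth_is_valid_owner_id() in the patch. */
static int
is_valid_owner_id(uint16_t id)
{
	return id != NO_OWNER &&
	       (next_owner_id == NO_OWNER || id < next_owner_id);
}

/* Allocate a new owner identifier. */
static int
owner_new(uint16_t *owner_id)
{
	if (next_owner_id == NO_OWNER)
		return -EUSERS; /* counter wrapped: identifiers exhausted */
	*owner_id = next_owner_id++;
	return 0;
}

/* Take ownership of a port; fails if another owner already holds it. */
static int
owner_set(uint16_t port_id, const struct owner *o)
{
	struct owner *cur;
	int ret;

	if (port_id >= MAX_PORTS)
		return -ENODEV;
	if (!is_valid_owner_id(o->id))
		return -EINVAL;
	cur = &port_owner[port_id];
	if (cur->id != NO_OWNER && cur->id != o->id)
		return -EPERM;
	ret = snprintf(cur->name, MAX_OWNER_NAME_LEN, "%s", o->name);
	if (ret < 0 || ret >= MAX_OWNER_NAME_LEN) {
		memset(cur->name, 0, MAX_OWNER_NAME_LEN);
		return -EINVAL;
	}
	cur->id = o->id;
	return 0;
}

/* Return the owner of a port, or NULL when the port is ownerless. */
static const struct owner *
owner_get(uint16_t port_id)
{
	if (port_id >= MAX_PORTS || port_owner[port_id].id == NO_OWNER)
		return NULL;
	return &port_owner[port_id];
}

/* Give up one port (the call named "remove" in v1). */
static int
owner_unset(uint16_t port_id, uint16_t owner_id)
{
	if (port_id >= MAX_PORTS)
		return -ENODEV;
	if (!is_valid_owner_id(owner_id))
		return -EINVAL;
	if (port_owner[port_id].id != owner_id)
		return -EPERM;
	memset(&port_owner[port_id], 0, sizeof(port_owner[port_id]));
	return 0;
}

/* Drop an owner from every port it holds. */
static void
owner_delete(uint16_t owner_id)
{
	uint16_t p;

	if (!is_valid_owner_id(owner_id))
		return;
	for (p = 0; p < MAX_PORTS; p++)
		if (port_owner[p].id == owner_id)
			memset(&port_owner[p], 0, sizeof(port_owner[p]));
}
```

Here unset acts on one port and requires the caller's id to match, while delete acts on every port held by an owner, which is the distinction the v1 names remove and delete made hard to see.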
  
> > 1/5 could be actually outside of this series, it is just better
> > behavior to use the right function to do release port :)
> >
> > >
> > > Some comments inline.
> > >
> > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better from the next reasons:
> > > > > 1. It may be convenient for multi-process applications to know which
> > > > >    process is in charge of a port.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the user is not trying to use a port which
> > > > > is already managed by fail-safe.
> > > > >
> > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > multiple management of a device by different DPDK entities.
> > > > >
> > > > > A port owner is built from owner id(number) and owner
> > > > > name(string) while the owner id must be unique to distinguish
> > > > > between two identical entity instances and the owner name can be
> any name.
> > > > > The name helps to logically recognize the owner by different
> > > > > DPDK entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to owns valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > can manage the port or not.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > >
> > > The internal port management is not affected, but the external
> > > interface is, however. In order to respect port ownership,
> > > applications are forced to modify their port iterator, as shown by your
> testpmd patch.
> > >
> > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner
> that
> > > would represent the application itself (probably with the ID 0 and
> > > an owner string ""). Only with specific additional configuration
> > > should this default subset of ethdev be divided.
> > >
> > > This would make this evolution seamless for applications, at no cost
> > > to the complexity of the design.
> >
> > As you can see in patch code and in testpmd example I added option to
> > iterate over all valid ownerless ports which should be more relevant by
> owner_id = 0.
> > So maybe the RTE_ETH_FOREACH_DEV should be changed to run this by
> the new iterator.
> 
> That is precisely what I am suggesting.
> Ideally, you should not have to change anything in testpmd, beside some bug
> fixing regarding port iteration to avoid those with a specific owner.
> 
> RTE_ETH_FOREACH_DEV must stay valid, and should iterate over ownerless
> ports (read: port owned by the default owner).
> 
> > By this way current applications don't need to build their owners but the
> ABI will be broken.
> >
> 
> ABI is broken anyway as you will add the owner to rte_eth_dev_data.
> 

It is not; rte_eth_dev_data is internal.

> > Actually, I think the old iterator RTE_ETH_FOREACH_DEV should be
> > unexposed or just removed,
> 
> I don't think so. Using RTE_ETH_FOREACH_DEV should allow keeping the
> current behavior of iterating over ownerless ports. Applications that do not
> care for this API should not have to change anything to their code.
> 

If we break the ABI later, users can use RTE_ETH_FOREACH_DEV_OWNED_BY(p, 0)
to do it. RTE_ETH_FOREACH_DEV will then no longer be necessary, but maybe it is
too much to ask applications to change the API as well.
I can agree with it.
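
For illustration only, a minimal sketch of what is being agreed on here: the
legacy iterator defined on top of the owned-by iterator with owner id 0, so
that unmodified applications see only ownerless ports. All sketch_* and
SKETCH_* names are hypothetical stand-ins for the ethdev internals, not DPDK
API; only the search loop is modeled on rte_eth_find_next_owned_by() from
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev internals (not the real DPDK types). */
#define SKETCH_MAX_ETHPORTS 8
#define SKETCH_NO_OWNER 0

enum sketch_state { SKETCH_UNUSED = 0, SKETCH_ATTACHED };

struct sketch_port {
	enum sketch_state state;
	uint16_t owner_id;
};

static struct sketch_port sketch_ports[SKETCH_MAX_ETHPORTS];

/* Same search as in the patch: skip ports that are not attached
 * or that belong to a different owner. */
uint16_t
sketch_find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < SKETCH_MAX_ETHPORTS &&
	       (sketch_ports[port_id].state != SKETCH_ATTACHED ||
	        sketch_ports[port_id].owner_id != owner_id))
		port_id++;

	return port_id < SKETCH_MAX_ETHPORTS ? port_id : SKETCH_MAX_ETHPORTS;
}

#define SKETCH_FOREACH_DEV_OWNED_BY(p, o) \
	for (p = sketch_find_next_owned_by(0, o); \
	     (unsigned int)p < (unsigned int)SKETCH_MAX_ETHPORTS; \
	     p = sketch_find_next_owned_by(p + 1, o))

/* The legacy iterator becomes a thin wrapper over the ownerless set. */
#define SKETCH_FOREACH_DEV(p) \
	SKETCH_FOREACH_DEV_OWNED_BY(p, SKETCH_NO_OWNER)

/* Count the ports an unmodified application would still iterate over. */
unsigned int
sketch_count_visible_ports(void)
{
	uint16_t p;
	unsigned int n = 0;

	SKETCH_FOREACH_DEV(p)
		n++;
	assert(n <= SKETCH_MAX_ETHPORTS); /* sanity check on the walk */
	return n;
}
```

With this shape, a port claimed by another entity (for example fail-safe)
simply drops out of the default iteration, and the application code does
not change.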

> > also the DEFERRED state should be removed,
> 
> Of course.
> 
> > I don't really see any usage to iterate over all valid ports by DPDK entities
> different from ethdev itself.
> > I just don't want to break it now.
> >
> 
> [snip]
> 
> --
> Gaëtan Rivet
> 6WIND


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-11-30 13:24     ` Gaëtan Rivet
  2017-11-30 14:30       ` Matan Azrad
@ 2017-12-01 12:09       ` Neil Horman
  2017-12-03  8:04         ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-01 12:09 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: Matan Azrad, Thomas Monjalon, Jingjing Wu, dev

On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> Hello Matan, Neil,
> 
> I like the port ownership concept. I think it is needed to clarify some
> operations and should be useful to several subsystems.
> 
> This patch could certainly be sub-divided however, and your current 1/5
> should probably come after this one.
> 
> Some comments inline.
> 
> On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > The ownership of a port is implicit in DPDK.
> > > > Making it explicit is better for the following reasons:
> > > 1. It may be convenient for multi-process applications to know which
> > >    process is in charge of a port.
> > > 2. A library could work on top of a port.
> > > 3. A port can work on top of another port.
> > > 
> > > Also in the fail-safe case, an issue has been met in testpmd.
> > > We need to check that the user is not trying to use a port which is
> > > already managed by fail-safe.
> > > 
> > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > management of a device by different DPDK entities.
> > > 
> > > A port owner is built from owner id(number) and owner name(string) while
> > > the owner id must be unique to distinguish between two identical entity
> > > instances and the owner name can be any name.
> > > The name helps to logically recognize the owner by different DPDK
> > > entities and allows easy debug.
> > > Each DPDK entity can allocate an owner unique identifier and can use it
> > > > and its preferred name to own valid ethdev ports.
> > > Each DPDK entity can get any port owner status to decide if it can
> > > manage the port or not.
> > > 
> > > The current ethdev internal port management is not affected by this
> > > feature.
> > > 
> 
> The internal port management is not affected, but the external interface
> is, however. In order to respect port ownership, applications are forced
> to modify their port iterator, as shown by your testpmd patch.
> 
> I think it would be better to modify the current RTE_ETH_FOREACH_DEV to call
> RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that would
> represent the application itself (probably with the ID 0 and an owner
> string ""). Only with specific additional configuration should this
> default subset of ethdev be divided.
> 
> This would make this evolution seamless for applications, at no cost to
> the complexity of the design.
> 
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > 
> > 
> > This seems fairly racy.  What if one thread attempts to set ownership on a port,
> > while another is checking it on another cpu in parallel.  It doesn't seem like
> > it will protect against that at all. It also doesn't protect against the
> > possibility of multiple threads attempting to poll for rx in parallel, which I
> > > think was part of Thomas's original statement regarding port ownership (he
> > > noted that the lockless design implied only a single thread should be allowed to
> > > poll for receive or make configuration changes at a time).
> > 
> > Neil
> > 
> 
> Isn't this race already there for any configuration operation / polling
> function? The DPDK arch is expecting applications to solve it. Why should
> port ownership be designed differently from other DPDK components?
> 
Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
your new API should contain it as well.

> Embedding checks for port ownership within operations will force
> everyone to bear their costs, even those not interested in using this
> API. These checks should be kept outside, within the entity claiming
> ownership of the port, in the form of using the proper port iterator
> IMO.
> 
No.  At the very least, you need to make the API itself exclusive.  That is to
say, you should at least ensure that a port ownership get call doesn't race with
a port ownership set call.  It seems ridiculous to just leave that sort of
locking as an exercise to the user.
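
To make the point concrete, here is a rough sketch of an exclusive get/set
pair, under stated assumptions: a single port, a C11 atomic_flag used as a
stand-in for a spinlock, and hypothetical sketch_* names that are not part of
the patch or of DPDK. Note that the getter copies the owner out rather than
returning a pointer into shared data, since a returned pointer could not be
protected once the lock is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0
#define SKETCH_MAX_OWNER_NAME_LEN 64

struct sketch_owner {
	uint16_t id;
	char name[SKETCH_MAX_OWNER_NAME_LEN];
};

/* Owner record of one hypothetical port, plus the lock guarding it. */
static struct sketch_owner sketch_port_owner;
static atomic_flag sketch_owner_lock = ATOMIC_FLAG_INIT;

static void
sketch_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_owner_lock,
						 memory_order_acquire))
		; /* spin until the holder releases the flag */
}

static void
sketch_unlock(void)
{
	atomic_flag_clear_explicit(&sketch_owner_lock, memory_order_release);
}

/* Claim the port. Returns 0 on success, -1 if the id is invalid or
 * the port is held by a different owner. */
int
sketch_owner_set(const struct sketch_owner *owner)
{
	int ret = -1;

	assert(owner != NULL);
	if (owner->id == SKETCH_NO_OWNER)
		return -1;

	sketch_lock();
	if (sketch_port_owner.id == SKETCH_NO_OWNER ||
	    sketch_port_owner.id == owner->id) {
		sketch_port_owner = *owner; /* id and name change together */
		ret = 0;
	}
	sketch_unlock();
	return ret;
}

/* Copy the owner out under the lock so the caller never observes a
 * half-written record. Returns 1 if the port is owned, 0 if ownerless. */
int
sketch_owner_get(struct sketch_owner *out)
{
	int owned;

	assert(out != NULL);
	sketch_lock();
	*out = sketch_port_owner;
	owned = out->id != SKETCH_NO_OWNER;
	sketch_unlock();
	return owned;
}
```

The check and the write in sketch_owner_set() happen inside one critical
section, so two callers cannot both see the port as ownerless and both
claim it.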

Neil

> > > ---
> > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > >  lib/librte_ether/rte_ethdev.c           | 121 ++++++++++++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
> > > index 6a0c9f9..af639a1 100644
> > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
> > >  
> > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
> > >  
> > > -Device Identification and Configuration
> > > +Device Identification, Ownership  and Configuration
> > >  ---------------------------------------
> > >  
> > >  Device Identification
> > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
> > >  *   A port name used to designate the port in console messages, for administration or debugging purposes.
> > >      For ease of use, the port name includes the port index.
> > >  
> > > +Port Ownership
> > > +~~~~~~~~~~~~~
> > > +The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
> > > +The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
> > > +Allowing this should prevent any multiple management of Ethernet port by different entities.
> > > +
> > > +.. note::
> > > +
> > > +    It is the DPDK entity responsibility either to check the port owner before using it or to set the port owner to prevent others from using it.
> > > +
> > >  Device Configuration
> > >  ~~~~~~~~~~~~~~~~~~~~
> > >  
> > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > > index 2d754d9..836991e 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -71,6 +71,7 @@
> > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > >  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > >  static uint8_t eth_dev_last_created_port;
> > >  
> > >  /* spinlock for eth device callbacks */
> > > @@ -278,6 +279,7 @@ struct rte_eth_dev *
> > >  	if (eth_dev == NULL)
> > >  		return -EINVAL;
> > >  
> > > +	memset(&eth_dev->data->owner, 0, sizeof(struct rte_eth_dev_owner));
> > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > >  	return 0;
> > >  }
> > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > >  		return 1;
> > >  }
> > >  
> > > +static int
> > > +rte_eth_is_valid_owner_id(uint16_t owner_id)
> > > +{
> > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > +		return 0;
> > > +	}
> > > +	return 1;
> > > +}
> > > +
> > > +uint16_t
> > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > > +{
> > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > +		port_id++;
> > > +
> > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > +		return RTE_MAX_ETHPORTS;
> > > +
> > > +	return port_id;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_new(uint16_t *owner_id)
> > > +{
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > > +		return -EUSERS;
> > > +	}
> > > +	*owner_id = rte_eth_next_owner_id++;
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > +		      const struct rte_eth_dev_owner *owner)
> > > +{
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	int ret;
> > > +
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own ports.\n");
> > > +		return -EPERM;
> > > +	}
> > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    port_owner->id != owner->id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> > > +			port_id, port_owner->name, port_owner->id);
> > > +		return -EPERM;
> > > +	}
> > > +	ret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > +		       owner->name);
> > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > +		return -EINVAL;
> > > +	}
> > > +	port_owner->id = owner->id;
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > +			    owner->name, owner->id);
> > > +	return 0;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id)
> > > +{
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return -EINVAL;
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != owner_id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot remove port %d owner %s_%05d by different owner id %5d.\n",
> > > +			port_id, port_owner->name, port_owner->id, owner_id);
> > > +		return -EPERM;
> > > +	}
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> > > +			port_owner->name, port_owner->id);
> > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > +	return 0;
> > > +}
> > > +
> > > +void
> > > +rte_eth_dev_owner_delete(const uint16_t owner_id)
> > > +{
> > > +	uint16_t p;
> > > +
> > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > +		return;
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > +		       sizeof(struct rte_eth_dev_owner));
> > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > +			    "%05d identifier has removed.\n", owner_id);
> > > +}
> > > +
> > > +const struct rte_eth_dev_owner *
> > > +rte_eth_dev_owner_get(const uint16_t port_id)
> > > +{
> > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > +	if (rte_eth_devices[port_id].data->owner.id == RTE_ETH_DEV_NO_OWNER)
> > > +		return NULL;
> > > +	return &rte_eth_devices[port_id].data->owner;
> > > +}
> > > +
> > >  int
> > >  rte_eth_dev_socket_id(uint16_t port_id)
> > >  {
> > > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > index 341c2d6..f54c26d 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > >  
> > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > >  
> > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > +
> > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > +
> > > +struct rte_eth_dev_owner {
> > > +	uint16_t id; /**< The owner unique identifier. */
> > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> > > +};
> > > +
> > >  /**
> > >   * @internal
> > >   * The data part, with no function pointers, associated with each ethernet device.
> > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > >  	int numa_node;  /**< NUMA node connection */
> > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > >  	/**< VLAN filter configuration. */
> > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > >  };
> > >  
> > >  /** Device supports link state interrupt */
> > > @@ -1846,6 +1856,82 @@ struct rte_eth_dev_data {
> > >  
> > >  
> > >  /**
> > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > + *
> > > + * @param port_id
> > > + *   The id of the next possible valid owned port.
> > > + * @param	owner_id
> > > + *  The owner identifier.
> > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> > > + * @return
> > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> > > + */
> > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> > > + */
> > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > +
> > > +/**
> > > + * Get a new unique owner identifier.
> > > + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> > > + * to avoid multiple management of device by different entities.
> > > + *
> > > + * @param	owner_id
> > > + *   Owner identifier pointer.
> > > + * @return
> > > + *   Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > +
> > > +/**
> > > + * Set an Ethernet device owner.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of the port to own.
> > > + * @param	owner
> > > + *  The owner pointer.
> > > + * @return
> > > + *  Negative errno value on error, 0 on success.
> > > + */
> > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > +			  const struct rte_eth_dev_owner *owner);
> > > +
> > > +/**
> > > + * Remove Ethernet device owner to make the device ownerless.
> > > + *
> > > + * @param	port_id
> > > + *  The identifier of port to make ownerless.
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + * @return
> > > + *  0 on success, negative errno value on error.
> > > + */
> > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Remove owner from all Ethernet devices owned by a specific owner.
> > > + *
> > > + * @param	owner
> > > + *  The owner identifier.
> > > + */
> > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > +
> > > +/**
> > > + * Get the owner of an Ethernet device.
> > > + *
> > > + * @param	port_id
> > > + *  The port identifier.
> > > + * @return
> > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > + */
> > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const uint16_t port_id);
> > > +
> > > +/**
> > >   * Get the total number of Ethernet devices that have been successfully
> > >   * initialized by the matching Ethernet driver during the PCI probing phase
> > >   * and that are available for applications to use. These devices must be
> > > diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> > > index e9681ac..7d07edb 100644
> > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > >  
> > >  } DPDK_17.08;
> > >  
> > > +DPDK_18.02 {
> > > +	global:
> > > +
> > > +	rte_eth_find_next_owned_by;
> > > +	rte_eth_dev_owner_new;
> > > +	rte_eth_dev_owner_set;
> > > +	rte_eth_dev_owner_remove;
> > > +	rte_eth_dev_owner_delete;
> > > +	rte_eth_dev_owner_get;
> > > +
> > > +} DPDK_17.11;
> > > +
> > >  EXPERIMENTAL {
> > >  	global:
> > >  
> > > -- 
> > > 1.8.3.1
> > > 
> > > 
> 
> -- 
> Gaëtan Rivet
> 6WIND
> 


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-01 12:09       ` Neil Horman
@ 2017-12-03  8:04         ` Matan Azrad
  2017-12-03 11:10           ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-12-03  8:04 UTC (permalink / raw)
  To: Neil Horman, Gaëtan Rivet; +Cc: Thomas Monjalon, Jingjing Wu, dev

Hi

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, December 1, 2017 2:10 PM
> To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > Hello Matan, Neil,
> >
> > I like the port ownership concept. I think it is needed to clarify
> > some operations and should be useful to several subsystems.
> >
> > This patch could certainly be sub-divided however, and your current
> > 1/5 should probably come after this one.
> >
> > Some comments inline.
> >
> > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better for the following reasons:
> > > > 1. It may be convenient for multi-process applications to know which
> > > >    process is in charge of a port.
> > > > 2. A library could work on top of a port.
> > > > 3. A port can work on top of another port.
> > > >
> > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > We need to check that the user is not trying to use a port which
> > > > is already managed by fail-safe.
> > > >
> > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > management of a device by different DPDK entities.
> > > >
> > > > A port owner is built from owner id(number) and owner name(string)
> > > > while the owner id must be unique to distinguish between two
> > > > identical entity instances and the owner name can be any name.
> > > > The name helps to logically recognize the owner by different DPDK
> > > > entities and allows easy debug.
> > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to own valid ethdev ports.
> > > > Each DPDK entity can get any port owner status to decide if it can
> > > > manage the port or not.
> > > >
> > > > The current ethdev internal port management is not affected by
> > > > this feature.
> > > >
> >
> > The internal port management is not affected, but the external
> > interface is, however. In order to respect port ownership,
> > applications are forced to modify their port iterator, as shown by your
> testpmd patch.
> >
> > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > would represent the application itself (probably with the ID 0 and an
> > owner string ""). Only with specific additional configuration should
> > this default subset of ethdev be divided.
> >
> > This would make this evolution seamless for applications, at no cost
> > to the complexity of the design.
> >
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > >
> > >
> > > This seems fairly racy.  What if one thread attempts to set
> > > ownership on a port, while another is checking it on another cpu in
> > > parallel.  It doesn't seem like it will protect against that at all.
> > > It also doesn't protect against the possibility of multiple threads
> > > attempting to poll for rx in parallel, which I think was part of
> > > > Thomas's original statement regarding port ownership (he noted that
> > > > the lockless design implied only a single thread should be allowed to poll
> > > > for receive or make configuration changes at a time).
> > >
> > > Neil
> > >
> >
> > Isn't this race already there for any configuration operation /
> > polling function? The DPDK arch is expecting applications to solve it.
> > Why should port ownership be designed differently from other DPDK
> components?
> >
> Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
> your new API should contain it as well.
> 
> > Embedding checks for port ownership within operations will force
> > everyone to bear their costs, even those not interested in using this
> > API. These checks should be kept outside, within the entity claiming
> > ownership of the port, in the form of using the proper port iterator
> > IMO.
> >
> No.  At the very least, you need to make the API itself exclusive.  That is to
> say, you should at least ensure that a port ownership get call doesn't race
> with a port ownership set call.  It seems ridiculous to just leave that sort of
> locking as an exercise to the user.
> 
> Neil
> 
Neil,
As Thomas mentioned, a DPDK port is designed to be managed by only one
thread (or one synchronized DPDK entity).
So port management, including port ownership, does not need to be synchronized,
i.e. locks are not needed.
If an application wants to manage a port from two threads, the responsibility
to synchronize port ownership, or any other port management, lies with that
application.
Port ownership is not meant to allow synchronized management of a port by
two DPDK entities in parallel; it is just a nice way to answer the following questions:
	1. Is the port already owned by some DPDK entity (in the early control path)?
	2. If yes, who is the owner?
If the answer to the first question is no, the current entity can take ownership
without any lock (one thread).
If the answer to the first question is yes, you can identify the owner and decide
accordingly: sometimes you can decide to use the port because you
logically know what the current owner does and can be logically synchronized
with it; sometimes you can just leave the port alone because you have no dealings
with this owner.
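
The two questions above can be sketched roughly as follows. This is only an
illustration under the single-thread assumption described here, with one
hypothetical port; the sketch_* names and the "trusted owner name" parameter
are made up for the example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NO_OWNER 0
#define SKETCH_MAX_OWNER_NAME_LEN 64

struct sketch_owner {
	uint16_t id;
	char name[SKETCH_MAX_OWNER_NAME_LEN];
};

enum sketch_decision {
	SKETCH_TOOK_OWNERSHIP,   /* port was ownerless, now ours */
	SKETCH_SHARE_WITH_OWNER, /* owned by an entity we can cooperate with */
	SKETCH_LEAVE_PORT        /* owned by an unrelated entity */
};

/* Owner record of a single hypothetical port. */
static struct sketch_owner sketch_port_owner;

/* Called from the early control path by the one thread managing ports.
 * trusted_name is an owner this entity knows how to cooperate with. */
enum sketch_decision
sketch_claim_or_skip(const struct sketch_owner *me, const char *trusted_name)
{
	assert(me != NULL && trusted_name != NULL);

	/* Question 1: is the port already owned? */
	if (sketch_port_owner.id == SKETCH_NO_OWNER) {
		sketch_port_owner = *me; /* no lock: one managing thread assumed */
		return SKETCH_TOOK_OWNERSHIP;
	}

	/* Question 2: who is the owner? */
	if (strcmp(sketch_port_owner.name, trusted_name) == 0)
		return SKETCH_SHARE_WITH_OWNER;

	return SKETCH_LEAVE_PORT;
}
```

For example, an application that knows how fail-safe handles its sub-devices
could pass "failsafe" as the trusted name, and skip ports held by any other
owner.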

> > > > ---
> > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > >  lib/librte_ether/rte_ethdev.c           | 121
> ++++++++++++++++++++++++++++++++
> > > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > index 6a0c9f9..af639a1 100644
> > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > > lock. This PMD feature found in som
> > > >
> > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> capability probing details.
> > > >
> > > > -Device Identification and Configuration
> > > > +Device Identification, Ownership  and Configuration
> > > >  ---------------------------------------
> > > >
> > > >  Device Identification
> > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> assigned two other identifiers:
> > > >  *   A port name used to designate the port in console messages, for
> administration or debugging purposes.
> > > >      For ease of use, the port name includes the port index.
> > > >
> > > > +Port Ownership
> > > > +~~~~~~~~~~~~~
> > > > +The Ethernet devices ports can be owned by a single DPDK entity
> (application, library, PMD, process, etc).
> > > > +The ownership mechanism is controlled by ethdev APIs and allows to
> set/remove/get a port owner by DPDK entities.
> > > > +Allowing this should prevent any multiple management of Ethernet
> port by different entities.
> > > > +
> > > > +.. note::
> > > > +
> > > > +    It is the DPDK entity responsibility either to check the port owner
> before using it or to set the port owner to prevent others from using it.
> > > > +
> > > >  Device Configuration
> > > >  ~~~~~~~~~~~~~~~~~~~~
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -71,6 +71,7 @@
> > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER
> + 1;
> > > >  static uint8_t eth_dev_last_created_port;
> > > >
> > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > struct rte_eth_dev *
> > > >  	if (eth_dev == NULL)
> > > >  		return -EINVAL;
> > > >
> > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > +rte_eth_dev_owner));
> > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > >  	return 0;
> > > >  }
> > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > >  		return 1;
> > > >  }
> > > >
> > > > +static int
> > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > +		return 0;
> > > > +	}
> > > > +	return 1;
> > > > +}
> > > > +
> > > > +uint16_t
> > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > +owner_id) {
> > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > +		port_id++;
> > > > +
> > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > +		return RTE_MAX_ETHPORTS;
> > > > +
> > > > +	return port_id;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > > +		return -EPERM;
> > > > +	}
> > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> Ethernet port owners.\n");
> > > > +		return -EUSERS;
> > > > +	}
> > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > +	struct rte_eth_dev_owner *port_owner;
> > > > +	int ret;
> > > > +
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> ports.\n");
> > > > +		return -EPERM;
> > > > +	}
> > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > +		return -EINVAL;
> > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > +	    port_owner->id != owner->id) {
> > > > +		RTE_LOG(ERR, EAL,
> > > > +			"Cannot set owner to port %d already owned by
> %s_%05d.\n",
> > > > +			port_id, port_owner->name, port_owner->id);
> > > > +		return -EPERM;
> > > > +	}
> > > > +	ret = snprintf(port_owner->name,
> RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > +		       owner->name);
> > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > +		memset(port_owner->name, 0,
> RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +	port_owner->id = owner->id;
> > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > +			    owner->name, owner->id);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > > +owner_id) {
> > > > +	struct rte_eth_dev_owner *port_owner;
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > +		return -EINVAL;
> > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > +	if (port_owner->id != owner_id) {
> > > > +		RTE_LOG(ERR, EAL,
> > > > +			"Cannot remove port %d owner %s_%05d by
> different owner id %5d.\n",
> > > > +			port_id, port_owner->name, port_owner->id,
> owner_id);
> > > > +		return -EPERM;
> > > > +	}
> > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> removed.\n", port_id,
> > > > +			port_owner->name, port_owner->id);
> > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +void
> > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > +	uint16_t p;
> > > > +
> > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > +		return;
> > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > +			    "%05d identifier has removed.\n", owner_id); }
> > > > +
> > > > +const struct rte_eth_dev_owner *
> > > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> RTE_ETH_DEV_NO_OWNER)
> > > > +		return NULL;
> > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > +}
> > > > +
> > > >  int
> > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > index 341c2d6..f54c26d 100644
> > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > >
> > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > >
> > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > +
> > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > +
> > > > +struct rte_eth_dev_owner {
> > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> name. */
> > > > +};
> > > > +
> > > >  /**
> > > >   * @internal
> > > >   * The data part, with no function pointers, associated with each
> ethernet device.
> > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > >  	int numa_node;  /**< NUMA node connection */
> > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > >  	/**< VLAN filter configuration. */
> > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > >  };
> > > >
> > > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82
> > > > @@ struct rte_eth_dev_data {
> > > >
> > > >
> > > >  /**
> > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > + *
> > > > + * @param port_id
> > > > + *   The id of the next possible valid owned port.
> > > > + * @param	owner_id
> > > > + *  The owner identifier.
> > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> ports.
> > > > + * @return
> > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> there is none.
> > > > + */
> > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > +uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> owner.
> > > > + */
> > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > +
> > > > +/**
> > > > + * Get a new unique owner identifier.
> > > > + * An owner identifier is used to owns Ethernet devices by only
> > > > +one DPDK entity
> > > > + * to avoid multiple management of device by different entities.
> > > > + *
> > > > + * @param	owner_id
> > > > + *   Owner identifier pointer.
> > > > + * @return
> > > > + *   Negative errno value on error, 0 on success.
> > > > + */
> > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > +
> > > > +/**
> > > > + * Set an Ethernet device owner.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The identifier of the port to own.
> > > > + * @param	owner
> > > > + *  The owner pointer.
> > > > + * @return
> > > > + *  Negative errno value on error, 0 on success.
> > > > + */
> > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > +			  const struct rte_eth_dev_owner *owner);
> > > > +
> > > > +/**
> > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The identifier of port to make ownerless.
> > > > + * @param	owner
> > > > + *  The owner identifier.
> > > > + * @return
> > > > + *  0 on success, negative errno value on error.
> > > > + */
> > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > +uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Remove owner from all Ethernet devices owned by a specific
> owner.
> > > > + *
> > > > + * @param	owner
> > > > + *  The owner identifier.
> > > > + */
> > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > +
> > > > +/**
> > > > + * Get the owner of an Ethernet device.
> > > > + *
> > > > + * @param	port_id
> > > > + *  The port identifier.
> > > > + * @return
> > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > + */
> > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > +uint16_t port_id);
> > > > +
> > > > +/**
> > > >   * Get the total number of Ethernet devices that have been
> successfully
> > > >   * initialized by the matching Ethernet driver during the PCI probing
> phase
> > > >   * and that are available for applications to use. These devices
> > > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > index e9681ac..7d07edb 100644
> > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > >
> > > >  } DPDK_17.08;
> > > >
> > > > +DPDK_18.02 {
> > > > +	global:
> > > > +
> > > > +	rte_eth_find_next_owned_by;
> > > > +	rte_eth_dev_owner_new;
> > > > +	rte_eth_dev_owner_set;
> > > > +	rte_eth_dev_owner_remove;
> > > > +	rte_eth_dev_owner_delete;
> > > > +	rte_eth_dev_owner_get;
> > > > +
> > > > +} DPDK_17.11;
> > > > +
> > > >  EXPERIMENTAL {
> > > >  	global:
> > > >
> > > > --
> > > > 1.8.3.1
> > > >
> > > >
> >
> > --
> > Gaëtan Rivet
> > 6WIND
> >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-03  8:04         ` Matan Azrad
@ 2017-12-03 11:10           ` Ananyev, Konstantin
  2017-12-03 13:46             ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-03 11:10 UTC (permalink / raw)
  To: Matan Azrad, Neil Horman, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev



Hi Matan,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Sunday, December 3, 2017 8:05 AM
> To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, December 1, 2017 2:10 PM
> > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > Hello Matan, Neil,
> > >
> > > I like the port ownership concept. I think it is needed to clarify
> > > some operations and should be useful to several subsystems.
> > >
> > > This patch could certainly be sub-divided however, and your current
> > > 1/5 should probably come after this one.
> > >
> > > Some comments inline.
> > >
> > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better from the next reasons:
> > > > > 1. It may be convenient for multi-process applications to know which
> > > > >    process is in charge of a port.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the user is not trying to use a port which
> > > > > is already managed by fail-safe.
> > > > >
> > > > > Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> > > > > management of a device by different DPDK entities.
> > > > >
> > > > > A port owner is built from owner id(number) and owner name(string)
> > > > > while the owner id must be unique to distinguish between two
> > > > > identical entity instances and the owner name can be any name.
> > > > > The name helps to logically recognize the owner by different DPDK
> > > > > entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to owns valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it can
> > > > > manage the port or not.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > >
> > > The internal port management is not affected, but the external
> > > interface is, however. In order to respect port ownership,
> > > applications are forced to modify their port iterator, as shown by your
> > testpmd patch.
> > >
> > > I think it would be better to modify the current RTE_ETH_FOREACH_DEV
> > > to call RTE_FOREACH_DEV_OWNED_BY, and introduce a default owner that
> > > would represent the application itself (probably with the ID 0 and an
> > > owner string ""). Only with specific additional configuration should
> > > this default subset of ethdev be divided.
> > >
> > > This would make this evolution seamless for applications, at no cost
> > > to the complexity of the design.
> > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > >
> > > >
> > > > This seems fairly racy.  What if one thread attempts to set
> > > > ownership on a port, while another is checking it on another cpu in
> > > > parallel.  It doesn't seem like it will protect against that at all.
> > > > It also doesn't protect against the possibility of multiple threads
> > > > attempting to poll for rx in parallel, which I think was part of
> > > > Thomas's original statement regarding port ownership (he noted that
> > > > the lockless design implied only a single thread should be allowed to poll
> > for receive or make configuration changes at a time.
> > > >
> > > > Neil
> > > >
> > >
> > > Isn't this race already there for any configuration operation /
> > > polling function? The DPDK arch is expecting applications to solve it.
> > > Why should port ownership be designed differently from other DPDK
> > components?
> > >
> > Yes, but that doesn't mean it should exist in perpetuity, nor does it mean that
> > your new api should contain it as well.
> >
> > > Embedding checks for port ownership within operations will force
> > > everyone to bear their costs, even those not interested in using this
> > > API. These checks should be kept outside, within the entity claiming
> > > ownership of the port, in the form of using the proper port iterator
> > > IMO.
> > >
> > No.  At the very least, you need to make the API itself exclusive.  That is to
> > say, you should at least ensure that a port ownership get call doesn't race
> > with a port ownership set call.  It seems ridiculous to just leave that sort of
> > locking as an exercise to the user.
> >
> > Neil
> >
> Neil,
> As Thomas mentioned, a DPDK port is designed to be managed by only one
> thread (or synchronized DPDK entity).
> So all the port management includes port ownership shouldn't be synchronized,
> i.e. locks are not needed.
> If some application want to do dual thread port management, the responsibility
> to synchronize the port ownership or any other port management is on this
> application.
> Port ownership doesn't come to allow synchronized management of the port by
> two DPDK entities in parallel, it is just nice way to answer next questions:
> 	1. Is the port already owned by some DPDK entity(in early control path)?
> 	2. If yes, Who is the owner?
> If the answer to the first question is no, the current entity can take the ownership
> without any lock(1 thread).
> If the answer to the first question is yes, you can recognize the owner and take
> decisions accordingly, sometimes you can decide to use the port because you
> logically know what the current owner does and you can be logically synchronized
> with it, sometimes you can just leave this port because you have not any deal with
>  this owner.

If the goal is just to be able to recognize whether a device is managed by another device
(failsafe, bonding, etc.), then I think all we need is a pointer to the rte_eth_dev_data of the owner
(NULL would mean no owner).
Also, if we'd like to introduce that mechanism, then I think it needs to be:
- mandatory (the control API simply doesn't allow changes to the device configuration if the caller is not the owner).
- transparent to the user (no API changes).
- atomic: set/get owner ops need to be atomic if we want this mechanism to be usable for MP.
Konstantin
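
To make the atomicity point above concrete, here is a minimal sketch of an
ownership claim that cannot race with a concurrent claim or read. It is not
part of the patch: it uses plain C11 atomics instead of any DPDK primitive,
tracks only the owner id, and every sketch_* name is hypothetical. Only the
"0 means no owner" convention is taken from RTE_ETH_DEV_NO_OWNER in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0   /* mirrors RTE_ETH_DEV_NO_OWNER in the patch */
#define SKETCH_MAX_PORTS 4  /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical per-port owner id slots; static storage starts at 0 (no owner). */
static _Atomic uint16_t sketch_port_owner_id[SKETCH_MAX_PORTS];

/*
 * Claim a port for owner_id. The compare-and-swap succeeds only while the
 * port is ownerless, so two concurrent claimers cannot both win.
 */
static int
sketch_owner_claim(uint16_t port_id, uint16_t owner_id)
{
	uint16_t expected = SKETCH_NO_OWNER;

	if (port_id >= SKETCH_MAX_PORTS || owner_id == SKETCH_NO_OWNER)
		return -EINVAL;
	if (atomic_compare_exchange_strong(&sketch_port_owner_id[port_id],
					   &expected, owner_id))
		return 0;
	/* On failure, expected holds the id that currently owns the port. */
	return expected == owner_id ? 0 : -EPERM;
}

/* Read the current owner id of a port with a single atomic load. */
static uint16_t
sketch_owner_of(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	return atomic_load(&sketch_port_owner_id[port_id]);
}
```

With this shape, a second entity calling sketch_owner_claim() on an already
owned port gets -EPERM, which matches the error the patch returns from
rte_eth_dev_owner_set() in the same situation. Copying an owner name next to
the id would need extra care (for example a lock), since a string cannot be
updated with one compare-and-swap.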





> 
> > > > > ---
> > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > ++++++++++++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev.h           |  86 +++++++++++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > index 6a0c9f9..af639a1 100644
> > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without SW
> > > > > lock. This PMD feature found in som
> > > > >
> > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > capability probing details.
> > > > >
> > > > > -Device Identification and Configuration
> > > > > +Device Identification, Ownership  and Configuration
> > > > >  ---------------------------------------
> > > > >
> > > > >  Device Identification
> > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are
> > assigned two other identifiers:
> > > > >  *   A port name used to designate the port in console messages, for
> > administration or debugging purposes.
> > > > >      For ease of use, the port name includes the port index.
> > > > >
> > > > > +Port Ownership
> > > > > +~~~~~~~~~~~~~
> > > > > +The Ethernet devices ports can be owned by a single DPDK entity
> > (application, library, PMD, process, etc).
> > > > > +The ownership mechanism is controlled by ethdev APIs and allows to
> > set/remove/get a port owner by DPDK entities.
> > > > > +Allowing this should prevent any multiple management of Ethernet
> > port by different entities.
> > > > > +
> > > > > +.. note::
> > > > > +
> > > > > +    It is the DPDK entity responsibility either to check the port owner
> > before using it or to set the port owner to prevent others from using it.
> > > > > +
> > > > >  Device Configuration
> > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -71,6 +71,7 @@
> > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > +static uint16_t rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER
> > + 1;
> > > > >  static uint8_t eth_dev_last_created_port;
> > > > >
> > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > struct rte_eth_dev *
> > > > >  	if (eth_dev == NULL)
> > > > >  		return -EINVAL;
> > > > >
> > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > +rte_eth_dev_owner));
> > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > >  	return 0;
> > > > >  }
> > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > >  		return 1;
> > > > >  }
> > > > >
> > > > > +static int
> > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > +		return 0;
> > > > > +	}
> > > > > +	return 1;
> > > > > +}
> > > > > +
> > > > > +uint16_t
> > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > +		port_id++;
> > > > > +
> > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > +		return RTE_MAX_ETHPORTS;
> > > > > +
> > > > > +	return port_id;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> > Ethernet port owners.\n");
> > > > > +		return -EUSERS;
> > > > > +	}
> > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	int ret;
> > > > > +
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process cannot own
> > ports.\n");
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    port_owner->id != owner->id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot set owner to port %d already owned by
> > %s_%05d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	ret = snprintf(port_owner->name,
> > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > +		       owner->name);
> > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > +		memset(port_owner->name, 0,
> > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > +		return -EINVAL;
> > > > > +	}
> > > > > +	port_owner->id = owner->id;
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > > +			    owner->name, owner->id);
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const uint16_t
> > > > > +owner_id) {
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return -EINVAL;
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != owner_id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot remove port %d owner %s_%05d by
> > different owner id %5d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id,
> > owner_id);
> > > > > +		return -EPERM;
> > > > > +	}
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > removed.\n", port_id,
> > > > > +			port_owner->name, port_owner->id);
> > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +void
> > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > +	uint16_t p;
> > > > > +
> > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > +		return;
> > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > +			    "%05d identifier has removed.\n", owner_id); }
> > > > > +
> > > > > +const struct rte_eth_dev_owner *
> > > > > +rte_eth_dev_owner_get(const uint16_t port_id) {
> > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > RTE_ETH_DEV_NO_OWNER)
> > > > > +		return NULL;
> > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > +}
> > > > > +
> > > > >  int
> > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> > > > > index 341c2d6..f54c26d 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > >
> > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > >
> > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > +
> > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > +
> > > > > +struct rte_eth_dev_owner {
> > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> > name. */
> > > > > +};
> > > > > +
> > > > >  /**
> > > > >   * @internal
> > > > >   * The data part, with no function pointers, associated with each
> > ethernet device.
> > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > >  	int numa_node;  /**< NUMA node connection */
> > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > >  	/**< VLAN filter configuration. */
> > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > >  };
> > > > >
> > > > >  /** Device supports link state interrupt */ @@ -1846,6 +1856,82
> > > > > @@ struct rte_eth_dev_data {
> > > > >
> > > > >
> > > > >  /**
> > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > + *
> > > > > + * @param port_id
> > > > > + *   The id of the next possible valid owned port.
> > > > > + * @param	owner_id
> > > > > + *  The owner identifier.
> > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> > ports.
> > > > > + * @return
> > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > there is none.
> > > > > + */
> > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Macro to iterate over all enabled ethdev ports owned by a specific
> > owner.
> > > > > + */
> > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > +
> > > > > +/**
> > > > > + * Get a new unique owner identifier.
> > > > > + * An owner identifier is used to owns Ethernet devices by only
> > > > > +one DPDK entity
> > > > > + * to avoid multiple management of device by different entities.
> > > > > + *
> > > > > + * @param	owner_id
> > > > > + *   Owner identifier pointer.
> > > > > + * @return
> > > > > + *   Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Set an Ethernet device owner.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of the port to own.
> > > > > + * @param	owner
> > > > > + *  The owner pointer.
> > > > > + * @return
> > > > > + *  Negative errno value on error, 0 on success.
> > > > > + */
> > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > +
> > > > > +/**
> > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The identifier of port to make ownerless.
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + * @return
> > > > > + *  0 on success, negative errno value on error.
> > > > > + */
> > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > +uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Remove owner from all Ethernet devices owned by a specific
> > owner.
> > > > > + *
> > > > > + * @param	owner
> > > > > + *  The owner identifier.
> > > > > + */
> > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > +
> > > > > +/**
> > > > > + * Get the owner of an Ethernet device.
> > > > > + *
> > > > > + * @param	port_id
> > > > > + *  The port identifier.
> > > > > + * @return
> > > > > + *  NULL when the device is ownerless, else the device owner pointer.
> > > > > + */
> > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > +uint16_t port_id);
> > > > > +
> > > > > +/**
> > > > >   * Get the total number of Ethernet devices that have been
> > successfully
> > > > >   * initialized by the matching Ethernet driver during the PCI probing
> > phase
> > > > >   * and that are available for applications to use. These devices
> > > > > must be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > index e9681ac..7d07edb 100644
> > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > >
> > > > >  } DPDK_17.08;
> > > > >
> > > > > +DPDK_18.02 {
> > > > > +	global:
> > > > > +
> > > > > +	rte_eth_find_next_owned_by;
> > > > > +	rte_eth_dev_owner_new;
> > > > > +	rte_eth_dev_owner_set;
> > > > > +	rte_eth_dev_owner_remove;
> > > > > +	rte_eth_dev_owner_delete;
> > > > > +	rte_eth_dev_owner_get;
> > > > > +
> > > > > +} DPDK_17.11;
> > > > > +
> > > > >  EXPERIMENTAL {
> > > > >  	global:
> > > > >
> > > > > --
> > > > > 1.8.3.1
> > > > >
> > > > >
> > >
> > > --
> > > Gaëtan Rivet
> > > 6WIND
> > >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-03 11:10           ` Ananyev, Konstantin
@ 2017-12-03 13:46             ` Matan Azrad
  2017-12-04 16:01               ` Neil Horman
  2017-12-05 11:12               ` Ananyev, Konstantin
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2017-12-03 13:46 UTC (permalink / raw)
  To: Ananyev, Konstantin, Neil Horman, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev

Hi Konstantin

> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Sent: Sunday, December 3, 2017 1:10 PM
> To: Matan Azrad <matan@mellanox.com>; Neil Horman
> <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> 
> 
> Hi Matan,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > Sent: Sunday, December 3, 2017 8:05 AM
> > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > Hi
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Friday, December 1, 2017 2:10 PM
> > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > Hello Matan, Neil,
> > > >
> > > > I like the port ownership concept. I think it is needed to clarify
> > > > some operations and should be useful to several subsystems.
> > > >
> > > > This patch could certainly be sub-divided however, and your
> > > > current
> > > > 1/5 should probably come after this one.
> > > >
> > > > Some comments inline.
> > > >
> > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > The ownership of a port is implicit in DPDK.
> > > > > > Making it explicit is better from the next reasons:
> > > > > > 1. It may be convenient for multi-process applications to know
> which
> > > > > >    process is in charge of a port.
> > > > > > 2. A library could work on top of a port.
> > > > > > 3. A port can work on top of another port.
> > > > > >
> > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > We need to check that the user is not trying to use a port
> > > > > > which is already managed by fail-safe.
> > > > > >
> > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > multiple management of a device by different DPDK entities.
> > > > > >
> > > > > > A port owner is built from owner id(number) and owner
> > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > between two identical entity instances and the owner name can be
> any name.
> > > > > > The name helps to logically recognize the owner by different
> > > > > > DPDK entities and allows easy debug.
> > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > can manage the port or not.
> > > > > >
> > > > > > The current ethdev internal port management is not affected by
> > > > > > this feature.
> > > > > >
> > > >
> > > > The internal port management is not affected, but the external
> > > > interface is, however. In order to respect port ownership,
> > > > applications are forced to modify their port iterator, as shown by
> > > > your
> > > testpmd patch.
> > > >
> > > > I think it would be better to modify the current
> > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > introduce a default owner that would represent the application
> > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > specific additional configuration should this default subset of ethdev be
> divided.
> > > >
> > > > This would make this evolution seamless for applications, at no
> > > > cost to the complexity of the design.
> > > >
> > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > >
> > > > >
> > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > ownership on a port, while another is checking it on another cpu
> > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > It also doesn't protect against the possibility of multiple
> > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > part of Thomas's original statement regarding port ownership
> > > > > (he noted that the lockless design implied only a single thread
> > > > > should be allowed to poll
> > > for receive or make configuration changes at a time.
> > > > >
> > > > > Neil
> > > > >
> > > >
> > > > Isn't this race already there for any configuration operation /
> > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > Why should port ownership be designed differently from other DPDK
> > > components?
> > > >
> > > Yes, but that doesn't mean it should exist in perpetuity, nor does
> > > it mean that your new api should contain it as well.
> > >
> > > > Embedding checks for port ownership within operations will force
> > > > everyone to bear their costs, even those not interested in using
> > > > this API. These checks should be kept outside, within the entity
> > > > claiming ownership of the port, in the form of using the proper
> > > > port iterator IMO.
> > > >
> > > No.  At the very least, you need to make the API itself exclusive.
> > > That is to say, you should at least ensure that a port ownership get
> > > call doesn't race with a port ownership set call.  It seems
> > > > ridiculous to just leave that sort of locking as an exercise to the user.
> > >
> > > Neil
> > >
> > Neil,
> > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > thread (or synchronized DPDK entity).
> > So all the port management includes port ownership shouldn't be
> > synchronized, i.e. locks are not needed.
> > If some application want to do dual thread port management, the
> > responsibility to synchronize the port ownership or any other port
> > management is on this application.
> > Port ownership doesn't come to allow synchronized management of the
> > port by two DPDK entities in parallel, it is just nice way to answer next
> questions:
> > 	1. Is the port already owned by some DPDK entity(in early control
> path)?
> > 	2. If yes, Who is the owner?
> > If the answer to the first question is no, the current entity can take
> > the ownership without any lock(1 thread).
> > If the answer to the first question is yes, you can recognize the
> > owner and take decisions accordingly, sometimes you can decide to use
> > the port because you logically know what the current owner does and
> > you can be logically synchronized with it, sometimes you can just
> > leave this port because you have not any deal with  this owner.
> 
> If the goal is just to have an ability to recognize is that device is managed by
> another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> to rte_eth_dev_data of the owner (NULL would mean no owner).

I think a string is better than a pointer, for the following reasons:
1. It is more human-friendly than a pointer for debugging and printing.
2. It is flexible and allows a logical owner message to be forwarded to other DPDK entities.

> Also I think if we'd like to introduce that mechanism, then it needs to be
> - mandatory (control API just don't allow changes to the device configuration
> if caller is not an owner).

But what if two DPDK entities need to manage or use the same port, and they are synchronized with each other?

> - transparent to the user (no API changes).

For now, there is no change to the existing API, only a new suggested API to use for port iteration.

>  - set/get owner ops need to be atomic if we want this mechanism to be
> usable for MP.

But this mechanism is usable in MP even without atomic operations.
For example:
The PRIMARY application can set its owner with the string "primary A".
A SECONDARY process (which attaches to the ports only after the primary has created them) is not allowed to set an owner (as you can see in the code), but it can read the owner string and see that the port owner is the primary application.
The "A" can simply signal to the SECONDARY that this port works with logic A, which means, for example, that the primary should send the packets and the secondary should receive them.
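
The primary/secondary example above can be sketched as follows. This is a standalone model, not the ethdev code: enum proc_type, struct owner, owner_set() and owner_has_role() are hypothetical names chosen for illustration, and the convention that the role tag is the suffix of the owner name is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OWNER_NAME_LEN 64 /* mirrors RTE_ETH_MAX_OWNER_NAME_LEN */
#define NO_OWNER 0        /* mirrors RTE_ETH_DEV_NO_OWNER */

/* Hypothetical stand-in for the EAL process type. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Hypothetical stand-in for struct rte_eth_dev_owner. */
struct owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

/* Only the primary may write the owner; returns 0 on success, -1 on error. */
static int
owner_set(enum proc_type who, struct owner *port, uint16_t id,
	  const char *name)
{
	int ret;

	assert(port != NULL && name != NULL);
	if (who != PROC_PRIMARY || id == NO_OWNER)
		return -1;
	ret = snprintf(port->name, sizeof(port->name), "%s", name);
	if (ret < 0 || (size_t)ret >= sizeof(port->name)) {
		/* Name did not fit: leave the port without a name. */
		memset(port->name, 0, sizeof(port->name));
		return -1;
	}
	port->id = id;
	return 0;
}

/* Any process may read; returns 1 if the owner name ends with the role tag. */
static int
owner_has_role(const struct owner *port, const char *role)
{
	size_t n, r;

	assert(port != NULL && role != NULL);
	if (port->id == NO_OWNER)
		return 0;
	n = strlen(port->name);
	r = strlen(role);
	return n >= r && strcmp(port->name + n - r, role) == 0;
}
```

Here the secondary never writes: it only calls owner_has_role(port, "A") and picks its behaviour (receive only, in the example) from the answer.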

> Konstantin
> 
> 
> 
> 
> 
> >
> > > > > > ---
> > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > ++++++++++++++++++++++++++++++++
> > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> +++++++++++++++++++++++
> > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > index 6a0c9f9..af639a1 100644
> > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > SW lock. This PMD feature found in som
> > > > > >
> > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > capability probing details.
> > > > > >
> > > > > > -Device Identification and Configuration
> > > > > > +Device Identification, Ownership  and Configuration
> > > > > >  ---------------------------------------
> > > > > >
> > > > > >  Device Identification
> > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > are
> > > assigned two other identifiers:
> > > > > >  *   A port name used to designate the port in console messages, for
> > > administration or debugging purposes.
> > > > > >      For ease of use, the port name includes the port index.
> > > > > >
> > > > > > +Port Ownership
> > > > > > +~~~~~~~~~~~~~
> > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > +entity
> > > (application, library, PMD, process, etc).
> > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > +allows to
> > > set/remove/get a port owner by DPDK entities.
> > > > > > +Allowing this should prevent any multiple management of
> > > > > > +Ethernet
> > > port by different entities.
> > > > > > +
> > > > > > +.. note::
> > > > > > +
> > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > + port owner
> > > before using it or to set the port owner to prevent others from using it.
> > > > > > +
> > > > > >  Device Configuration
> > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > >
> > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > @@ -71,6 +71,7 @@
> > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > +static uint16_t rte_eth_next_owner_id =
> RTE_ETH_DEV_NO_OWNER
> > > + 1;
> > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > >
> > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > struct rte_eth_dev *
> > > > > >  	if (eth_dev == NULL)
> > > > > >  		return -EINVAL;
> > > > > >
> > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > +rte_eth_dev_owner));
> > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > >  	return 0;
> > > > > >  }
> > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > >  		return 1;
> > > > > >  }
> > > > > >
> > > > > > +static int
> > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> &&
> > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> owner_id);
> > > > > > +		return 0;
> > > > > > +	}
> > > > > > +	return 1;
> > > > > > +}
> > > > > > +
> > > > > > +uint16_t
> > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > +owner_id) {
> > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > +	       (rte_eth_devices[port_id].state !=
> RTE_ETH_DEV_ATTACHED ||
> > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > +		port_id++;
> > > > > > +
> > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > +
> > > > > > +	return port_id;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> cannot own
> > > ports.\n");
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> number of
> > > Ethernet port owners.\n");
> > > > > > +		return -EUSERS;
> > > > > > +	}
> > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > +	int ret;
> > > > > > +
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> cannot own
> > > ports.\n");
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > +		return -EINVAL;
> > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > +	    port_owner->id != owner->id) {
> > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > +			"Cannot set owner to port %d already owned
> by
> > > %s_%05d.\n",
> > > > > > +			port_id, port_owner->name, port_owner-
> >id);
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	ret = snprintf(port_owner->name,
> > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > +		       owner->name);
> > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > +		memset(port_owner->name, 0,
> > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > +		return -EINVAL;
> > > > > > +	}
> > > > > > +	port_owner->id = owner->id;
> > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> port_id,
> > > > > > +			    owner->name, owner->id);
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +int
> > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > +uint16_t
> > > > > > +owner_id) {
> > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > +		return -EINVAL;
> > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > +	if (port_owner->id != owner_id) {
> > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > different owner id %5d.\n",
> > > > > > +			port_id, port_owner->name, port_owner-
> >id,
> > > owner_id);
> > > > > > +		return -EPERM;
> > > > > > +	}
> > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > removed.\n", port_id,
> > > > > > +			port_owner->name, port_owner->id);
> > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > +	return 0;
> > > > > > +}
> > > > > > +
> > > > > > +void
> > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > +	uint16_t p;
> > > > > > +
> > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > +		return;
> > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > +			    "%05d identifier has removed.\n",
> owner_id); }
> > > > > > +
> > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > +uint16_t port_id) {
> > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > RTE_ETH_DEV_NO_OWNER)
> > > > > > +		return NULL;
> > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > +}
> > > > > > +
> > > > > >  int
> > > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > > a/lib/librte_ether/rte_ethdev.h
> > > > > > b/lib/librte_ether/rte_ethdev.h index 341c2d6..f54c26d 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > >
> > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > >
> > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > +
> > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > +
> > > > > > +struct rte_eth_dev_owner {
> > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> owner
> > > name. */
> > > > > > +};
> > > > > > +
> > > > > >  /**
> > > > > >   * @internal
> > > > > >   * The data part, with no function pointers, associated with
> > > > > > each
> > > ethernet device.
> > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > >  	/**< VLAN filter configuration. */
> > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > >  };
> > > > > >
> > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > >
> > > > > >
> > > > > >  /**
> > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > + *
> > > > > > + * @param port_id
> > > > > > + *   The id of the next possible valid owned port.
> > > > > > + * @param	owner_id
> > > > > > + *  The owner identifier.
> > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > + ownerless
> > > ports.
> > > > > > + * @return
> > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > there is none.
> > > > > > + */
> > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > +uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > +specific
> > > owner.
> > > > > > + */
> > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > +
> > > > > > +/**
> > > > > > + * Get a new unique owner identifier.
> > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > +only one DPDK entity
> > > > > > + * to avoid multiple management of device by different entities.
> > > > > > + *
> > > > > > + * @param	owner_id
> > > > > > + *   Owner identifier pointer.
> > > > > > + * @return
> > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Set an Ethernet device owner.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The identifier of the port to own.
> > > > > > + * @param	owner
> > > > > > + *  The owner pointer.
> > > > > > + * @return
> > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > +
> > > > > > +/**
> > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The identifier of port to make ownerless.
> > > > > > + * @param	owner
> > > > > > + *  The owner identifier.
> > > > > > + * @return
> > > > > > + *  0 on success, negative errno value on error.
> > > > > > + */
> > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > +uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > owner.
> > > > > > + *
> > > > > > + * @param	owner
> > > > > > + *  The owner identifier.
> > > > > > + */
> > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > +
> > > > > > +/**
> > > > > > + * Get the owner of an Ethernet device.
> > > > > > + *
> > > > > > + * @param	port_id
> > > > > > + *  The port identifier.
> > > > > > + * @return
> > > > > > + *  NULL when the device is ownerless, else the device owner
> pointer.
> > > > > > + */
> > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > +uint16_t port_id);
> > > > > > +
> > > > > > +/**
> > > > > >   * Get the total number of Ethernet devices that have been
> > > successfully
> > > > > >   * initialized by the matching Ethernet driver during the PCI
> > > > > > probing
> > > phase
> > > > > >   * and that are available for applications to use. These
> > > > > > devices must be diff --git
> > > > > > a/lib/librte_ether/rte_ethdev_version.map
> > > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > > index e9681ac..7d07edb 100644
> > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > >
> > > > > >  } DPDK_17.08;
> > > > > >
> > > > > > +DPDK_18.02 {
> > > > > > +	global:
> > > > > > +
> > > > > > +	rte_eth_find_next_owned_by;
> > > > > > +	rte_eth_dev_owner_new;
> > > > > > +	rte_eth_dev_owner_set;
> > > > > > +	rte_eth_dev_owner_remove;
> > > > > > +	rte_eth_dev_owner_delete;
> > > > > > +	rte_eth_dev_owner_get;
> > > > > > +
> > > > > > +} DPDK_17.11;
> > > > > > +
> > > > > >  EXPERIMENTAL {
> > > > > >  	global:
> > > > > >
> > > > > > --
> > > > > > 1.8.3.1
> > > > > >
> > > > > >
> > > >
> > > > --
> > > > Gaëtan Rivet
> > > > 6WIND
> > > >


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-03 13:46             ` Matan Azrad
@ 2017-12-04 16:01               ` Neil Horman
  2017-12-04 18:10                 ` Matan Azrad
  2017-12-05 11:12               ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-04 16:01 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Gaëtan Rivet, Thomas Monjalon, Wu,
	Jingjing, dev

On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> Hi Konstantine
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Sunday, December 3, 2017 1:10 PM
> > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > 
> > 
> > Hi Matan,
> > 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Sunday, December 3, 2017 8:05 AM
> > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hi
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > Hello Matan, Neil,
> > > > >
> > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > some operations and should be useful to several subsystems.
> > > > >
> > > > > This patch could certainly be sub-divided however, and your
> > > > > current
> > > > > 1/5 should probably come after this one.
> > > > >
> > > > > Some comments inline.
> > > > >
> > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > Making it explicit is better from the next reasons:
> > > > > > > 1. It may be convenient for multi-process applications to know
> > which
> > > > > > >    process is in charge of a port.
> > > > > > > 2. A library could work on top of a port.
> > > > > > > 3. A port can work on top of another port.
> > > > > > >
> > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > We need to check that the user is not trying to use a port
> > > > > > > which is already managed by fail-safe.
> > > > > > >
> > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > multiple management of a device by different DPDK entities.
> > > > > > >
> > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > > between two identical entity instances and the owner name can be
> > any name.
> > > > > > > The name helps to logically recognize the owner by different
> > > > > > > DPDK entities and allows easy debug.
> > > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > can manage the port or not.
> > > > > > >
> > > > > > > The current ethdev internal port management is not affected by
> > > > > > > this feature.
> > > > > > >
> > > > >
> > > > > The internal port management is not affected, but the external
> > > > > interface is, however. In order to respect port ownership,
> > > > > applications are forced to modify their port iterator, as shown by
> > > > > your
> > > > testpmd patch.
> > > > >
> > > > > I think it would be better to modify the current
> > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > > introduce a default owner that would represent the application
> > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > specific additional configuration should this default subset of ethdev be
> > divided.
> > > > >
> > > > > This would make this evolution seamless for applications, at no
> > > > > cost to the complexity of the design.
> > > > >
> > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > >
> > > > > >
> > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > ownership on a port, while another is checking it on another cpu
> > > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > It also doesn't protect against the possibility of multiple
> > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > > part of Thomas's original statement regarding port ownership
> > > > > > > (he noted that the lockless design implied only a single thread
> > > > > > > should be allowed to poll for receive or make configuration
> > > > > > > changes at a time).
> > > > > >
> > > > > > Neil
> > > > > >
> > > > >
> > > > > Isn't this race already there for any configuration operation /
> > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > Why should port ownership be designed differently from other DPDK
> > > > components?
> > > > >
> > > > > Yes, but that doesn't mean it should exist in perpetuity, nor does
> > > > it mean that your new api should contain it as well.
> > > >
> > > > > Embedding checks for port ownership within operations will force
> > > > > everyone to bear their costs, even those not interested in using
> > > > > this API. These checks should be kept outside, within the entity
> > > > > claiming ownership of the port, in the form of using the proper
> > > > > port iterator IMO.
> > > > >
> > > > No.  At the very least, you need to make the API itself exclusive.
> > > > That is to say, you should at least ensure that a port ownership get
> > > > call doesn't race with a port ownership set call.  It seems
> > > > > ridiculous to just leave that sort of locking as an exercise to the user.
> > > >
> > > > Neil
> > > >
> > > Neil,
> > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > thread (or synchronized DPDK entity).
> > > So all the port management includes port ownership shouldn't be
> > > synchronized, i.e. locks are not needed.
> > > If some application want to do dual thread port management, the
> > > responsibility to synchronize the port ownership or any other port
> > > management is on this application.
> > > Port ownership doesn't come to allow synchronized management of the
> > > port by two DPDK entities in parallel, it is just nice way to answer next
> > questions:
> > > 	1. Is the port already owned by some DPDK entity(in early control
> > path)?
> > > 	2. If yes, Who is the owner?
> > > If the answer to the first question is no, the current entity can take
> > > the ownership without any lock(1 thread).
> > > If the answer to the first question is yes, you can recognize the
> > > owner and take decisions accordingly, sometimes you can decide to use
> > > the port because you logically know what the current owner does and
> > > you can be logically synchronized with it, sometimes you can just
> > > leave this port because you have not any deal with  this owner.
> > 
> > If the goal is just to have an ability to recognize is that device is managed by
> > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > to rte_eth_dev_data of the owner (NULL would mean no owner).
> 
> I think string is better than a pointer from the next reasons:
> 1. It is more human friendly than pointers for debug and printing.
> 2. it is flexible and allows to forward logical owner message to other DPDK entities. 
> 
> > Also I think if we'd like to introduce that mechanism, then it needs to be
> > - mandatory (control API just don't allow changes to the device configuration
> > if caller is not an owner).
> 
> But what if 2 DPDK entities should manage the same port \ using it and they are synchronized?
> 
> > - transparent to the user (no API changes).
> 
> For now, there is not API change but new suggested API to use for port iteration.
> 
> >  - set/get owner ops need to be atomic if we want this mechanism to be
> > usable for MP.
> 
> But also without atomic this mechanism is usable in MP.
> For example:
> PRIMARY application can set its owner with string "primary A".
> SECONDARY process (which attach to the ports only after the primary created them )is not allowed to set owner(As you can see in the code) but it can read the owner string and see that the port owner is the primary application.
> The "A" can just sign specific port type to the SECONDARY that this port works with logic A which means, for example, primary should send the packets and secondary should receive the packets.
> 
But that's just the point: the operations that you are describing are not atomic
at all.  If the primary process is interrupted while setting a port's
ownership, after it has read the current owner field, it's entirely possible for the
secondary process to set the owner as itself, and then have the primary process
set it back.  Without locking, it's just broken.  I know that DPDK likes to
say it's lockless because it has no locks, but here we're just kicking the can
down the road by making the application add the locks for the library.
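
To make the race concrete, here is a minimal sketch (not code from the patch) of an ownership claim done as a single compare-and-swap, so that the check and the set cannot be separated by another writer. struct port_owner_slot and the owner_claim()/owner_release() helpers are hypothetical names; NO_OWNER stands in for RTE_ETH_DEV_NO_OWNER. It assumes C11 atomics that are lock-free for uint16_t on the target, and it covers only the numeric id, not the owner name string.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NO_OWNER 0 /* stands in for RTE_ETH_DEV_NO_OWNER */

/* Hypothetical stand-in for the owner id field of one port. */
struct port_owner_slot {
	_Atomic uint16_t id;
};

/*
 * Claim the port only if it is currently ownerless.
 * The read of the current owner and the write of the new one
 * happen as one atomic step.
 * Returns 0 on success, -1 if another owner already holds the port.
 */
static int
owner_claim(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = NO_OWNER;

	assert(slot != NULL);
	if (owner_id == NO_OWNER)
		return -1;
	if (atomic_compare_exchange_strong(&slot->id, &expected, owner_id))
		return 0;
	/* On failure, expected holds the current owner id. */
	return expected == owner_id ? 0 : -1;
}

/*
 * Release the port only if it is still held by owner_id.
 * Returns 0 on success, -1 otherwise.
 */
static int
owner_release(struct port_owner_slot *slot, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	assert(slot != NULL);
	if (owner_id == NO_OWNER)
		return -1;
	if (atomic_compare_exchange_strong(&slot->id, &expected, NO_OWNER))
		return 0;
	return -1;
}
```

With this shape, two entities calling owner_claim() on the same ownerless slot cannot both succeed, which is the property the unsynchronized check-then-set sequence lacks.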

Neil

> > Konstantin
> 


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-04 16:01               ` Neil Horman
@ 2017-12-04 18:10                 ` Matan Azrad
  2017-12-04 22:30                   ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-12-04 18:10 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Gaëtan Rivet, Thomas Monjalon, Wu,
	Jingjing, dev

Hi Neil

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Monday, December 4, 2017 6:01 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > Hi Konstantine
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Sunday, December 3, 2017 1:10 PM
> > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > >
> > >
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > Hi
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > Hello Matan, Neil,
> > > > > >
> > > > > > I like the port ownership concept. I think it is needed to
> > > > > > clarify some operations and should be useful to several subsystems.
> > > > > >
> > > > > > This patch could certainly be sub-divided however, and your
> > > > > > current
> > > > > > 1/5 should probably come after this one.
> > > > > >
> > > > > > Some comments inline.
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > 1. It may be convenient for multi-process applications to
> > > > > > > > know
> > > which
> > > > > > > >    process is in charge of a port.
> > > > > > > > 2. A library could work on top of a port.
> > > > > > > > 3. A port can work on top of another port.
> > > > > > > >
> > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > which is already managed by fail-safe.
> > > > > > > >
> > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > >
> > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > distinguish between two identical entity instances and the
> > > > > > > > owner name can be
> > > any name.
> > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > Each DPDK entity can allocate an owner unique identifier
> > > > > > > > and can use it and its preferred name to owns valid ethdev
> ports.
> > > > > > > > Each DPDK entity can get any port owner status to decide
> > > > > > > > if it can manage the port or not.
> > > > > > > >
> > > > > > > > The current ethdev internal port management is not
> > > > > > > > affected by this feature.
> > > > > > > >
> > > > > >
> > > > > > The internal port management is not affected, but the external
> > > > > > interface is, however. In order to respect port ownership,
> > > > > > applications are forced to modify their port iterator, as
> > > > > > shown by your
> > > > > testpmd patch.
> > > > > >
> > > > > > I think it would be better to modify the current
> > > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY,
> and
> > > > > > introduce a default owner that would represent the application
> > > > > > itself (probably with the ID 0 and an owner string ""). Only
> > > > > > with specific additional configuration should this default
> > > > > > subset of ethdev be
> > > divided.
> > > > > >
> > > > > > This would make this evolution seamless for applications, at
> > > > > > no cost to the complexity of the design.
> > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > >
> > > > > > >
> > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > ownership on a port, while another is checking it on another
> > > > > > > cpu in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > threads attempting to poll for rx in parallel, which I think
> > > > > > > was part of Thomas's origional statement regarding port
> > > > > > > ownership (he noted that the lockless design implied only a
> > > > > > > single thread should be allowed to poll
> > > > > for receive or make configuration changes at a time.
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > >
> > > > > > Isn't this race already there for any configuration operation
> > > > > > / polling function? The DPDK arch is expecting applications to solve
> it.
> > > > > > Why should port ownership be designed differently from other
> > > > > > DPDK
> > > > > components?
> > > > > >
> > > > > Yes, but that doesn't mean it should exist in purpituity, nor
> > > > > does it mean that your new api should contain it as well.
> > > > >
> > > > > > Embedding checks for port ownership within operations will
> > > > > > force everyone to bear their costs, even those not interested
> > > > > > in using this API. These checks should be kept outside, within
> > > > > > the entity claiming ownership of the port, in the form of
> > > > > > using the proper port iterator IMO.
> > > > > >
> > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > That is to say, you should at least ensure that a port ownership
> > > > > get call doesn't race with a port ownership set call.  It seems
> > > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > > >
> > > > > Neil
> > > > >
> > > > Neil,
> > > > As Thomas mentioned, a DPDK port is designed to be managed by only
> > > > one thread (or synchronized DPDK entity).
> > > > So all the port management includes port ownership shouldn't be
> > > > synchronized, i.e. locks are not needed.
> > > > If some application want to do dual thread port management, the
> > > > responsibility to synchronize the port ownership or any other port
> > > > management is on this application.
> > > > Port ownership doesn't come to allow synchronized management of
> > > > the port by two DPDK entities in parallel, it is just nice way to
> > > > answer next
> > > questions:
> > > > 	1. Is the port already owned by some DPDK entity(in early control
> > > path)?
> > > > 	2. If yes, Who is the owner?
> > > > If the answer to the first question is no, the current entity can
> > > > take the ownership without any lock(1 thread).
> > > > If the answer to the first question is yes, you can recognize the
> > > > owner and take decisions accordingly, sometimes you can decide to
> > > > use the port because you logically know what the current owner
> > > > does and you can be logically synchronized with it, sometimes you
> > > > can just leave this port because you have not any deal with  this owner.
> > >
> > > If the goal is just to have an ability to recognize is that device
> > > is managed by another device (failsafe, bonding, etc.),  then I
> > > think all we need is a pointer to rte_eth_dev_data of the owner (NULL
> would mean no owner).
> >
> > I think string is better than a pointer from the next reasons:
> > 1. It is more human friendly than pointers for debug and printing.
> > 2. it is flexible and allows to forward logical owner message to other DPDK
> entities.
> >
> > > Also I think if we'd like to introduce that mechanism, then it needs
> > > to be
> > > - mandatory (control API just don't allow changes to the device
> > > configuration if caller is not an owner).
> >
> > But what if 2 DPDK entities should manage the same port \ using it and they
> are synchronized?
> >
> > > - transparent to the user (no API changes).
> >
> > For now, there is not API change but new suggested API to use for port
> iteration.
> >
> > >  - set/get owner ops need to be atomic if we want this mechanism to
> > > be usable for MP.
> >
> > But also without atomic this mechanism is usable in MP.
> > For example:
> > PRIMARY application can set its owner with string "primary A".
> > SECONDARY process (which attach to the ports only after the primary
> created them )is not allowed to set owner(As you can see in the code) but it
> can read the owner string and see that the port owner is the primary
> application.
> > The "A" can just sign specific port type to the SECONDARY that this port
> works with logic A which means, for example, primary should send the
> packets and secondary should receive the packets.
> >
> But thats just the point, the operations that you are describing are not atomic
> at all.  If the primary process is interrupted during its setting of a ports owner
> ship after its read the current owner field, its entirely possible for the
> secondary proces to set the owner as itself, and then have the primary
> process set it back.  Without locking, its just broken.  I know that the dpdk
> likes to say its lockless, because it has no locks, but here we're just kicking the
> can down the road, by making the application add the locks for the library.
> 
> Neil
> 
As I wrote before, and as the code shows, a secondary process should not take ownership of ports.
No port configuration (for example, port creation and release) is internally synchronized between the primary and secondary processes, so I don't see any reason to synchronize port ownership.
If the primary and secondary processes want to manage (configure) the same port at the same time, the applications must synchronize them, and the same applies to port ownership (actually, I don't think such synchronization is realistic, because many port configurations are not in shared memory).
So the secondary process should start its activity on ports only after the primary process is done with all configuration, including port ownership; this part must already be synchronized.
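
To make the rule above concrete, here is a minimal standalone sketch. It is
not the patch code: the types, names and error codes (port_owner,
port_owner_set, NO_OWNER, enum proc_type) are made up for illustration, and
the process type is passed in explicitly where a real implementation would
presumably ask the EAL via rte_eal_process_type(). It only shows the
intended behaviour: the primary sets the owner, any process may read it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OWNER_NAME_LEN 64 /* hypothetical name size */
#define NO_OWNER 0        /* hypothetical "unowned" id */

/* Stand-in for the EAL process type (assumption for this sketch). */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

struct port_owner {
	uint16_t id;               /* unique per owning entity */
	char name[OWNER_NAME_LEN]; /* free-form, for debug and logic hints */
};

/* Set the owner of a port slot. Only the primary process may do so. */
static int
port_owner_set(struct port_owner *slot, enum proc_type who,
	       uint16_t id, const char *name)
{
	assert(slot != NULL && name != NULL);
	if (who != PROC_PRIMARY)
		return -EPERM;  /* secondary may only read */
	if (id == NO_OWNER)
		return -EINVAL; /* 0 is reserved for "unowned" */
	if (slot->id != NO_OWNER && slot->id != id)
		return -EBUSY;  /* already owned by another entity */
	slot->id = id;
	snprintf(slot->name, sizeof(slot->name), "%s", name);
	return 0;
}

/* Release a port slot; only the primary, and only the current owner. */
static int
port_owner_unset(struct port_owner *slot, enum proc_type who, uint16_t id)
{
	assert(slot != NULL);
	if (who != PROC_PRIMARY || slot->id != id)
		return -EPERM;
	memset(slot, 0, sizeof(*slot));
	return 0;
}

/* Any process may read the owner name; NULL means the port is unowned. */
static const char *
port_owner_name(const struct port_owner *slot)
{
	return slot->id == NO_OWNER ? NULL : slot->name;
}
```

With this shape, a secondary process that attaches after the primary has
finished its setup can read "primary A" from the slot and act on it, while
any attempt by the secondary to call port_owner_set() is rejected. Note that
nothing here is atomic; the sketch relies on the ordering described above
(primary configures first, secondary starts afterwards).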
  
> > > Konstantin
> >


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-04 18:10                 ` Matan Azrad
@ 2017-12-04 22:30                   ` Neil Horman
  2017-12-05  6:08                     ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-04 22:30 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Gaëtan Rivet, Thomas Monjalon, Wu,
	Jingjing, dev

On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> Hi Neil
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Monday, December 4, 2017 6:01 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> > Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > Hi Konstantine
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > Sent: Sunday, December 3, 2017 1:10 PM
> > > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > >
> > > >
> > > > Hi Matan,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > <gaetan.rivet@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > Hi
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > > dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > > Hello Matan, Neil,
> > > > > > >
> > > > > > > I like the port ownership concept. I think it is needed to
> > > > > > > clarify some operations and should be useful to several subsystems.
> > > > > > >
> > > > > > > This patch could certainly be sub-divided however, and your
> > > > > > > current
> > > > > > > 1/5 should probably come after this one.
> > > > > > >
> > > > > > > Some comments inline.
> > > > > > >
> > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > > 1. It may be convenient for multi-process applications to
> > > > > > > > > know
> > > > which
> > > > > > > > >    process is in charge of a port.
> > > > > > > > > 2. A library could work on top of a port.
> > > > > > > > > 3. A port can work on top of another port.
> > > > > > > > >
> > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > > which is already managed by fail-safe.
> > > > > > > > >
> > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > > >
> > > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > > distinguish between two identical entity instances and the
> > > > > > > > > owner name can be
> > > > any name.
> > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > Each DPDK entity can allocate an owner unique identifier
> > > > > > > > > and can use it and its preferred name to owns valid ethdev
> > ports.
> > > > > > > > > Each DPDK entity can get any port owner status to decide
> > > > > > > > > if it can manage the port or not.
> > > > > > > > >
> > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > affected by this feature.
> > > > > > > > >
> > > > > > >
> > > > > > > The internal port management is not affected, but the external
> > > > > > > interface is, however. In order to respect port ownership,
> > > > > > > applications are forced to modify their port iterator, as
> > > > > > > shown by your
> > > > > > testpmd patch.
> > > > > > >
> > > > > > > I think it would be better to modify the current
> > > > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY,
> > and
> > > > > > > introduce a default owner that would represent the application
> > > > > > > itself (probably with the ID 0 and an owner string ""). Only
> > > > > > > with specific additional configuration should this default
> > > > > > > subset of ethdev be
> > > > divided.
> > > > > > >
> > > > > > > This would make this evolution seamless for applications, at
> > > > > > > no cost to the complexity of the design.
> > > > > > >
> > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > >
> > > > > > > >
> > > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > > ownership on a port, while another is checking it on another
> > > > > > > > cpu in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > > threads attempting to poll for rx in parallel, which I think
> > > > > > > > was part of Thomas's origional statement regarding port
> > > > > > > > ownership (he noted that the lockless design implied only a
> > > > > > > > single thread should be allowed to poll
> > > > > > for receive or make configuration changes at a time.
> > > > > > > >
> > > > > > > > Neil
> > > > > > > >
> > > > > > >
> > > > > > > Isn't this race already there for any configuration operation
> > > > > > > / polling function? The DPDK arch is expecting applications to solve
> > it.
> > > > > > > Why should port ownership be designed differently from other
> > > > > > > DPDK
> > > > > > components?
> > > > > > >
> > > > > > Yes, but that doesn't mean it should exist in purpituity, nor
> > > > > > does it mean that your new api should contain it as well.
> > > > > >
> > > > > > > Embedding checks for port ownership within operations will
> > > > > > > force everyone to bear their costs, even those not interested
> > > > > > > in using this API. These checks should be kept outside, within
> > > > > > > the entity claiming ownership of the port, in the form of
> > > > > > > using the proper port iterator IMO.
> > > > > > >
> > > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > > That is to say, you should at least ensure that a port ownership
> > > > > > get call doesn't race with a port ownership set call.  It seems
> > > > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > > > >
> > > > > > Neil
> > > > > >
> > > > > Neil,
> > > > > As Thomas mentioned, a DPDK port is designed to be managed by only
> > > > > one thread (or synchronized DPDK entity).
> > > > > So all the port management includes port ownership shouldn't be
> > > > > synchronized, i.e. locks are not needed.
> > > > > If some application want to do dual thread port management, the
> > > > > responsibility to synchronize the port ownership or any other port
> > > > > management is on this application.
> > > > > Port ownership doesn't come to allow synchronized management of
> > > > > the port by two DPDK entities in parallel, it is just nice way to
> > > > > answer next
> > > > questions:
> > > > > 	1. Is the port already owned by some DPDK entity(in early control
> > > > path)?
> > > > > 	2. If yes, Who is the owner?
> > > > > If the answer to the first question is no, the current entity can
> > > > > take the ownership without any lock(1 thread).
> > > > > If the answer to the first question is yes, you can recognize the
> > > > > owner and take decisions accordingly, sometimes you can decide to
> > > > > use the port because you logically know what the current owner
> > > > > does and you can be logically synchronized with it, sometimes you
> > > > > can just leave this port because you have not any deal with  this owner.
> > > >
> > > > If the goal is just to have an ability to recognize is that device
> > > > is managed by another device (failsafe, bonding, etc.),  then I
> > > > think all we need is a pointer to rte_eth_dev_data of the owner (NULL
> > would mean no owner).
> > >
> > > I think string is better than a pointer from the next reasons:
> > > 1. It is more human friendly than pointers for debug and printing.
> > > 2. it is flexible and allows to forward logical owner message to other DPDK
> > entities.
> > >
> > > > Also I think if we'd like to introduce that mechanism, then it needs
> > > > to be
> > > > - mandatory (control API just don't allow changes to the device
> > > > configuration if caller is not an owner).
> > >
> > > But what if 2 DPDK entities should manage the same port \ using it and they
> > are synchronized?
> > >
> > > > - transparent to the user (no API changes).
> > >
> > > For now, there is not API change but new suggested API to use for port
> > iteration.
> > >
> > > >  - set/get owner ops need to be atomic if we want this mechanism to
> > > > be usable for MP.
> > >
> > > But also without atomic this mechanism is usable in MP.
> > > For example:
> > > PRIMARY application can set its owner with string "primary A".
> > > SECONDARY process (which attach to the ports only after the primary
> > created them )is not allowed to set owner(As you can see in the code) but it
> > can read the owner string and see that the port owner is the primary
> > application.
> > > The "A" can just sign specific port type to the SECONDARY that this port
> > works with logic A which means, for example, primary should send the
> > packets and secondary should receive the packets.
> > >
> > But thats just the point, the operations that you are describing are not atomic
> > at all.  If the primary process is interrupted during its setting of a ports owner
> > ship after its read the current owner field, its entirely possible for the
> > secondary proces to set the owner as itself, and then have the primary
> > process set it back.  Without locking, its just broken.  I know that the dpdk
> > likes to say its lockless, because it has no locks, but here we're just kicking the
> > can down the road, by making the application add the locks for the library.
> > 
> > Neil
> > 
> As I wrote before and as is in the code you can understand that secondary process should not take ownership of ports.
But you allow for it, and if you do, you should write your API to be safe for
such operations.
> Any port configuration (for example port creation and release) is not internally synchronized between the primary to secondary processes so I don't see any reason to synchronize port ownership.
Yes, and I've never agreed with that design point either, because the fact of
the matter is that one of two things must be true in relation to port
configuration:

1) Either multiple processes will attempt to read/change port
configuration/ownership

or 

2) port ownership is implicitly given to a single task (be it a primary or
secondary task), and said ownership is therefore implicitly known by the
application

Either situation may be true depending on the details of the application being
built, but regardless, if (2) is true, then this api isn't really needed at all,
because the application implicitly has been designed to have a port be owned by
a given task.  If (1) is true, then all the locking required to access the data
of port ownership needs to be adhered to.

I understand that you are taking Thomas' words to mean that all paths are
lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
doesn't synchronize access to internal data).  What it does do is leave that
locking as an exercise for the downstream consumer of the library.  While I
don't agree with it, I can see that such an argument is valid for hot paths such
as transmission and reception where you may perhaps want to minimize your
locking by assigning a single task to do that work, but port configuration and
ownership isn't a hot path.  If you mean to allow potentially multiple tasks to
access configuration (including port ownership), be it frequently or just
occasionally, why are you making a downstream developer recognize the need for
locking here outside the library?  It just seems like very bad general practice
to me.
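
A minimal sketch of what keeping that locking inside the library could look
like, under stated assumptions: all names here (owned_port, port_owner_claim,
port_owner_get) are hypothetical and not taken from the patch, and a C11
atomic_flag spinlock stands in for whatever primitive the library would
really use (rte_spinlock_t would be the likely candidate in DPDK). The point
is only that the check and the set happen under one lock, so a get cannot
observe a half-written owner and two claimers cannot both succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OWNER_NAME_LEN 64 /* hypothetical name size */
#define NO_OWNER 0        /* hypothetical "unowned" id */

struct port_owner {
	uint16_t id;
	char name[OWNER_NAME_LEN];
};

struct owned_port {
	atomic_flag lock;        /* tiny spinlock guarding 'owner' */
	struct port_owner owner;
};

static void
port_lock(struct owned_port *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
}

static void
port_unlock(struct owned_port *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Claim the port if it is unowned; check and set are one critical section. */
static int
port_owner_claim(struct owned_port *p, uint16_t id, const char *name)
{
	int ret = 0;

	assert(p != NULL && name != NULL);
	if (id == NO_OWNER)
		return -EINVAL;
	port_lock(p);
	if (p->owner.id != NO_OWNER) {
		ret = -EBUSY; /* someone else got there first */
	} else {
		p->owner.id = id;
		snprintf(p->owner.name, sizeof(p->owner.name), "%s", name);
	}
	port_unlock(p);
	return ret;
}

/* Copy out a consistent snapshot of the current owner. */
static void
port_owner_get(struct owned_port *p, struct port_owner *out)
{
	assert(p != NULL && out != NULL);
	port_lock(p);
	*out = p->owner;
	port_unlock(p);
}
```

Since this sits on the control path only, the cost of the lock is paid on
ownership changes and queries, not on rx/tx. For the multi-process case the
structure would additionally have to live in memory shared by all processes,
which this sketch does not attempt to show.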

> If the primary-secondary process want to manage(configure) same port in same time, they must be synchronized by the applications, so this is the case in port ownership too (actually I don't think this synchronization is realistic because many configurations of the port are not in shared memory).
Yes, it is the case, my question is, why?  We're not talking about a time
sensitive execution path here.  And by avoiding it you're just making work that
has to be repeated by every downstream consumer.

Neil


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-04 22:30                   ` Neil Horman
@ 2017-12-05  6:08                     ` Matan Azrad
  2017-12-05 10:05                       ` Bruce Richardson
  2017-12-05 19:26                       ` Neil Horman
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2017-12-05  6:08 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Gaëtan Rivet, Thomas Monjalon, Wu,
	Jingjing, dev

Hi Neil

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Tuesday, December 5, 2017 12:31 AM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>;
> Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> > Hi Neil
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Monday, December 4, 2017 6:01 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>; Thomas Monjalon
> <thomas@monjalon.net>; Wu,
> > > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > > Hi Konstantine
> > > >
> > > > > -----Original Message-----
> > > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > > Sent: Sunday, December 3, 2017 1:10 PM
> > > > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > > > <nhorman@tuxdriver.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>
> > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > >
> > > > >
> > > > > Hi Matan,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan
> > > > > > Azrad
> > > > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > > <gaetan.rivet@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > > Hi
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > > > dev@dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port
> > > > > > > ownership
> > > > > > >
> > > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > > > Hello Matan, Neil,
> > > > > > > >
> > > > > > > > I like the port ownership concept. I think it is needed to
> > > > > > > > clarify some operations and should be useful to several
> subsystems.
> > > > > > > >
> > > > > > > > This patch could certainly be sub-divided however, and
> > > > > > > > your current
> > > > > > > > 1/5 should probably come after this one.
> > > > > > > >
> > > > > > > > Some comments inline.
> > > > > > > >
> > > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad
> wrote:
> > > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > > > 1. It may be convenient for multi-process applications
> > > > > > > > > > to know
> > > > > which
> > > > > > > > > >    process is in charge of a port.
> > > > > > > > > > 2. A library could work on top of a port.
> > > > > > > > > > 3. A port can work on top of another port.
> > > > > > > > > >
> > > > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > > > We need to check that the user is not trying to use a
> > > > > > > > > > port which is already managed by fail-safe.
> > > > > > > > > >
> > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices to
> > > > > > > > > > avoid multiple management of a device by different DPDK
> entities.
> > > > > > > > > >
> > > > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > > > name(string) while the owner id must be unique to
> > > > > > > > > > distinguish between two identical entity instances and
> > > > > > > > > > the owner name can be
> > > > > any name.
> > > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > > Each DPDK entity can allocate an owner unique
> > > > > > > > > > identifier and can use it and its preferred name to
> > > > > > > > > > owns valid ethdev
> > > ports.
> > > > > > > > > > Each DPDK entity can get any port owner status to
> > > > > > > > > > decide if it can manage the port or not.
> > > > > > > > > >
> > > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > > affected by this feature.
> > > > > > > > > >
> > > > > > > >
> > > > > > > > The internal port management is not affected, but the
> > > > > > > > external interface is, however. In order to respect port
> > > > > > > > ownership, applications are forced to modify their port
> > > > > > > > iterator, as shown by your
> > > > > > > testpmd patch.
> > > > > > > >
> > > > > > > > I think it would be better to modify the current
> > > > > > > > RTE_ETH_FOREACH_DEV to call
> RTE_FOREACH_DEV_OWNED_BY,
> > > and
> > > > > > > > introduce a default owner that would represent the
> > > > > > > > application itself (probably with the ID 0 and an owner
> > > > > > > > string ""). Only with specific additional configuration
> > > > > > > > should this default subset of ethdev be
> > > > > divided.
> > > > > > > >
> > > > > > > > This would make this evolution seamless for applications,
> > > > > > > > at no cost to the complexity of the design.
> > > > > > > >
> > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > This seems fairly racy.  What if one thread attempts to
> > > > > > > > > set ownership on a port, while another is checking it on
> > > > > > > > > another cpu in parallel.  It doesn't seem like it will protect
> against that at all.
> > > > > > > > > It also doesn't protect against the possibility of
> > > > > > > > > multiple threads attempting to poll for rx in parallel,
> > > > > > > > > which I think was part of Thomas's origional statement
> > > > > > > > > regarding port ownership (he noted that the lockless
> > > > > > > > > design implied only a single thread should be allowed to
> > > > > > > > > poll
> > > > > > > for receive or make configuration changes at a time.
> > > > > > > > >
> > > > > > > > > Neil
> > > > > > > > >
> > > > > > > >
> > > > > > > > Isn't this race already there for any configuration
> > > > > > > > operation / polling function? The DPDK arch is expecting
> > > > > > > > applications to solve
> > > it.
> > > > > > > > Why should port ownership be designed differently from
> > > > > > > > other DPDK
> > > > > > > components?
> > > > > > > >
> > > > > > > Yes, but that doesn't mean it should exist in purpituity,
> > > > > > > nor does it mean that your new api should contain it as well.
> > > > > > >
> > > > > > > > Embedding checks for port ownership within operations will
> > > > > > > > force everyone to bear their costs, even those not
> > > > > > > > interested in using this API. These checks should be kept
> > > > > > > > outside, within the entity claiming ownership of the port,
> > > > > > > > in the form of using the proper port iterator IMO.
> > > > > > > >
> > > > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > > > That is to say, you should at least ensure that a port
> > > > > > > ownership get call doesn't race with a port ownership set
> > > > > > > call.  It seems rediculous to just leave that sort of locking as an
> exercize to the user.
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > > Neil,
> > > > > > As Thomas mentioned, a DPDK port is designed to be managed by
> > > > > > only one thread (or synchronized DPDK entity).
> > > > > > So all the port management includes port ownership shouldn't
> > > > > > be synchronized, i.e. locks are not needed.
> > > > > > If some application want to do dual thread port management,
> > > > > > the responsibility to synchronize the port ownership or any
> > > > > > other port management is on this application.
> > > > > > Port ownership doesn't come to allow synchronized management
> > > > > > of the port by two DPDK entities in parallel, it is just nice
> > > > > > way to answer next
> > > > > questions:
> > > > > > 	1. Is the port already owned by some DPDK entity(in early
> > > > > > control
> > > > > path)?
> > > > > > 	2. If yes, Who is the owner?
> > > > > > If the answer to the first question is no, the current entity
> > > > > > can take the ownership without any lock(1 thread).
> > > > > > If the answer to the first question is yes, you can recognize
> > > > > > the owner and take decisions accordingly, sometimes you can
> > > > > > decide to use the port because you logically know what the
> > > > > > current owner does and you can be logically synchronized with
> > > > > > it, sometimes you can just leave this port because you have not any
> deal with  this owner.
> > > > >
> > > > > If the goal is just to have an ability to recognize is that
> > > > > device is managed by another device (failsafe, bonding, etc.),
> > > > > then I think all we need is a pointer to rte_eth_dev_data of the
> > > > > owner (NULL
> > > would mean no owner).
> > > >
> > > > I think string is better than a pointer from the next reasons:
> > > > 1. It is more human friendly than pointers for debug and printing.
> > > > 2. it is flexible and allows to forward logical owner message to
> > > > other DPDK
> > > entities.
> > > >
> > > > > Also I think if we'd like to introduce that mechanism, then it
> > > > > needs to be
> > > > > - mandatory (control API just don't allow changes to the device
> > > > > configuration if caller is not an owner).
> > > >
> > > > But what if 2 DPDK entities should manage the same port \ using it
> > > > and they
> > > are synchronized?
> > > >
> > > > > - transparent to the user (no API changes).
> > > >
> > > > For now, there is not API change but new suggested API to use for
> > > > port
> > > iteration.
> > > >
> > > > >  - set/get owner ops need to be atomic if we want this mechanism
> > > > > to be usable for MP.
> > > >
> > > > But also without atomic this mechanism is usable in MP.
> > > > For example:
> > > > PRIMARY application can set its owner with string "primary A".
> > > > SECONDARY process (which attach to the ports only after the
> > > > primary
> > > created them )is not allowed to set owner(As you can see in the
> > > code) but it can read the owner string and see that the port owner
> > > is the primary application.
> > > > The "A" can just sign specific port type to the SECONDARY that
> > > > this port
> > > works with logic A which means, for example, primary should send the
> > > packets and secondary should receive the packets.
> > > >
> > > But thats just the point, the operations that you are describing are
> > > not atomic at all.  If the primary process is interrupted during its
> > > setting of a ports owner ship after its read the current owner
> > > field, its entirely possible for the secondary proces to set the
> > > owner as itself, and then have the primary process set it back.
> > > Without locking, its just broken.  I know that the dpdk likes to say
> > > > it's lockless, because it has no locks, but here we're just kicking the can
> > > > down the road, by making the application add the locks for the library.
> > >
> > > Neil
> > >
> > As I wrote before, and as you can see in the code, a secondary
> > process should not take ownership of ports.
> But you allow for it, and if you do, you should write your API to be safe for
> such operations.

Please look at the code again: a secondary process cannot take ownership, I don't allow it!
Actually, I think it should not be restricted like that, since no other ethdev configuration has such a limitation.

> > Any port configuration (for example port creation and release) is not
> > internally synchronized between the primary and secondary processes, so I
> > don't see any reason to synchronize port ownership.
> Yes, and I've never agreed with that design point either, because the fact of
> the matter is that one of two things must be true in relation to port
> configuration:
> 
> 1) Either multiple processes will attempt to read/change port
> configuration/ownership
> 
> or
> 
> 2) port ownership is implicitly given to a single task (be it a primary or
> secondary task), and said ownership is therefore implicitly known by the
> application
> 
> Either situation may be true depending on the details of the application being
> built, but regardless, if (2) is true, then this api isn't really needed at all,
> because the application implicitly has been designed to have a port be
> owned by a given task. 

Here I think you are missing something: port ownership is not mainly for processes as DPDK entities.
The entity can also be a PMD, a library, or an application in the same process.
For MP it is nice to have: the secondary can just see the primary's owners and decide accordingly (please read my answer to Konstantin above).

> If (1) is true, then all the locking required to access
> the data of port ownership needs to be adhered to.
> 

And so do all the port configurations!
I think that is beyond the scope of this thread.


> I understand that you are taking Thomas' words to mean that all paths are
> lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
> doesn't synchronize access to internal data).  What it does do, is leave that
> locking as an exercize for the downstream consumer of the library.  While I
> don't agree with it, I can see that such an argument is valid for hot paths such
> as transmission and reception where you may perhaps want to minimize
> your locking by assigning a single task to do that work, but port configuration
> and ownership isn't a hot path.  If you mean to allow potentially multiple
> tasks to access configuration (including port ownership), be it frequently or
> just occasionaly, why are you making a downstream developer recognize the
> need for locking here outside the library?  It just seems like very bad general
> practice to me.
> 
> > If the primary and secondary processes want to manage (configure) the same port at
> > the same time, they must be synchronized by the applications, so this is the case
> > for port ownership too (actually I don't think this synchronization is realistic
> > because many configurations of the port are not in shared memory).
> Yes, it is the case, my question is, why?  We're not talking about a time
> sensitive execution path here.  And by avoiding it you're just making work
> that has to be repeated by every downstream consumer.

I think you are suggesting making all ethdev configuration race-safe, which is beyond the scope of this thread.
The current ethdev implementation leaves race management to applications, so port ownership, like any other port configuration, should be designed the same way.

> 
> Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05  6:08                     ` Matan Azrad
@ 2017-12-05 10:05                       ` Bruce Richardson
  2017-12-08 11:35                         ` Thomas Monjalon
  2017-12-05 19:26                       ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Bruce Richardson @ 2017-12-05 10:05 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Neil Horman, Ananyev, Konstantin, Gaëtan Rivet,
	Thomas Monjalon, Wu, Jingjing, dev

On Tue, Dec 05, 2017 at 06:08:35AM +0000, Matan Azrad wrote:
> Hi Neil
> 
> > -----Original Message----- From: Neil Horman
> > [mailto:nhorman@tuxdriver.com] Sent: Tuesday, December 5, 2017 12:31
> > AM To: Matan Azrad <matan@mellanox.com> Cc: Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>; Thomas Monjalon <thomas@monjalon.net>; Wu,
> > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Mon, Dec 04, 2017 at 06:10:56PM +0000, Matan Azrad wrote:
> > > Hi Neil
> > >
> > > > -----Original Message----- From: Neil Horman
> > > > [mailto:nhorman@tuxdriver.com] Sent: Monday, December 4, 2017
> > > > 6:01 PM To: Matan Azrad <matan@mellanox.com> Cc: Ananyev,
> > > > Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Wu,
> > > > Jingjing <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Sun, Dec 03, 2017 at 01:46:49PM +0000, Matan Azrad wrote:
> > > > > Hi Konstantine
> > > > >
> > > > > > -----Original Message----- From: Ananyev, Konstantin
> > > > > > [mailto:konstantin.ananyev@intel.com] Sent: Sunday, December
> > > > > > 3, 2017 1:10 PM To: Matan Azrad <matan@mellanox.com>; Neil
> > > > > > Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > > > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: RE:
> > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi Matan,
> > > > > >
> > > > > > > -----Original Message----- From: dev
> > > > > > > [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > > > > Sent: Sunday, December 3, 2017 8:05 AM To: Neil Horman
> > > > > > > <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > > > > <gaetan.rivet@6wind.com> Cc: Thomas Monjalon
> > > > > > > <thomas@monjalon.net>; Wu, Jingjing
> > > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > > -----Original Message----- From: Neil Horman
> > > > > > > > [mailto:nhorman@tuxdriver.com] Sent: Friday, December 1,
> > > > > > > > 2017 2:10 PM To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > > <thomas@monjalon.net>; Jingjing Wu
> > > > > > > > <jingjing.wu@intel.com>; dev@dpdk.org Subject: Re:
> > > > > > > > [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > > > > >
> > > > > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet
> > > > > > > > wrote:
> > > > > > > > > Hello Matan, Neil,
> > > > > > > > >
> > > > > > > > > I like the port ownership concept. I think it is
> > > > > > > > > needed to clarify some operations and should be useful
> > > > > > > > > to several
> > subsystems.
> > > > > > > > >
> > > > > > > > > This patch could certainly be sub-divided however, and
> > > > > > > > > your current 1/5 should probably come after this one.
> > > > > > > > >
> > > > > > > > > Some comments inline.
> > > > > > > > >
> > > > > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman
> > > > > > > > > wrote:
> > > > > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan
> > > > > > > > > > Azrad
> > wrote:
> > > > > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > > > > Making it explicit is better from the next
> > > > > > > > > > > reasons: 1. It may be convenient for multi-process
> > > > > > > > > > > applications to know
> > > > > > which
> > > > > > > > > > >    process is in charge of a port.  2. A library
> > > > > > > > > > >    could work on top of a port.  3. A port can
> > > > > > > > > > >    work on top of another port.
> > > > > > > > > > >
> > > > > > > > > > > Also in the fail-safe case, an issue has been met
> > > > > > > > > > > in testpmd.  We need to check that the user is not
> > > > > > > > > > > trying to use a port which is already managed by
> > > > > > > > > > > fail-safe.
> > > > > > > > > > >
> > > > > > > > > > > Add ownership mechanism to DPDK Ethernet devices
> > > > > > > > > > > to avoid multiple management of a device by
> > > > > > > > > > > different DPDK
> > entities.
> > > > > > > > > > >
> > > > > > > > > > > A port owner is built from owner id(number) and
> > > > > > > > > > > owner name(string) while the owner id must be
> > > > > > > > > > > unique to distinguish between two identical entity
> > > > > > > > > > > instances and the owner name can be
> > > > > > any name.
> > > > > > > > > > > The name helps to logically recognize the owner by
> > > > > > > > > > > different DPDK entities and allows easy debug.
> > > > > > > > > > > Each DPDK entity can allocate an owner unique
> > > > > > > > > > > identifier and can use it and its preferred name
> > > > > > > > > > > to owns valid ethdev
> > > > ports.
> > > > > > > > > > > Each DPDK entity can get any port owner status to
> > > > > > > > > > > decide if it can manage the port or not.
> > > > > > > > > > >
> > > > > > > > > > > The current ethdev internal port management is not
> > > > > > > > > > > affected by this feature.
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The internal port management is not affected, but the
> > > > > > > > > external interface is, however. In order to respect
> > > > > > > > > port ownership, applications are forced to modify
> > > > > > > > > their port iterator, as shown by your
> > > > > > > > testpmd patch.
> > > > > > > > >
> > > > > > > > > I think it would be better to modify the current
> > > > > > > > > RTE_ETH_FOREACH_DEV to call
> > RTE_FOREACH_DEV_OWNED_BY,
> > > > and
> > > > > > > > > introduce a default owner that would represent the
> > > > > > > > > application itself (probably with the ID 0 and an
> > > > > > > > > owner string ""). Only with specific additional
> > > > > > > > > configuration should this default subset of ethdev be
> > > > > > divided.
> > > > > > > > >
> > > > > > > > > This would make this evolution seamless for
> > > > > > > > > applications, at no cost to the complexity of the
> > > > > > > > > design.
> > > > > > > > >
> > > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This seems fairly racy.  What if one thread attempts
> > > > > > > > > > to set ownership on a port, while another is
> > > > > > > > > > checking it on another cpu in parallel.  It doesn't
> > > > > > > > > > seem like it will protect
> > against that at all.
> > > > > > > > > > It also doesn't protect against the possibility of
> > > > > > > > > > multiple threads attempting to poll for rx in
> > > > > > > > > > parallel, which I think was part of Thomas's
> > > > > > > > > > origional statement regarding port ownership (he
> > > > > > > > > > noted that the lockless design implied only a single
> > > > > > > > > > thread should be allowed to poll
> > > > > > > > for receive or make configuration changes at a time.
> > > > > > > > > >
> > > > > > > > > > Neil
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Isn't this race already there for any configuration
> > > > > > > > > operation / polling function? The DPDK arch is
> > > > > > > > > expecting applications to solve
> > > > it.
> > > > > > > > > Why should port ownership be designed differently from
> > > > > > > > > other DPDK
> > > > > > > > components?
> > > > > > > > >
> > > > > > > > Yes, but that doesn't mean it should exist in
> > > > > > > > purpituity, nor does it mean that your new api should
> > > > > > > > contain it as well.
> > > > > > > >
> > > > > > > > > Embedding checks for port ownership within operations
> > > > > > > > > will force everyone to bear their costs, even those
> > > > > > > > > not interested in using this API. These checks should
> > > > > > > > > be kept outside, within the entity claiming ownership
> > > > > > > > > of the port, in the form of using the proper port
> > > > > > > > > iterator IMO.
> > > > > > > > >
> > > > > > > > No.  At the very least, you need to make the API itself
> > > > > > > > exclusive.  That is to say, you should at least ensure
> > > > > > > > that a port ownership get call doesn't race with a port
> > > > > > > > ownership set call.  It seems rediculous to just leave
> > > > > > > > that sort of locking as an
> > exercize to the user.
> > > > > > > >
> > > > > > > > Neil
> > > > > > > >
> > > > > > > Neil, As Thomas mentioned, a DPDK port is designed to be
> > > > > > > managed by only one thread (or synchronized DPDK entity).
> > > > > > > So all the port management includes port ownership
> > > > > > > shouldn't be synchronized, i.e. locks are not needed.  If
> > > > > > > some application want to do dual thread port management,
> > > > > > > the responsibility to synchronize the port ownership or
> > > > > > > any other port management is on this application.  Port
> > > > > > > ownership doesn't come to allow synchronized management of
> > > > > > > the port by two DPDK entities in parallel, it is just nice
> > > > > > > way to answer next
> > > > > > questions:
> > > > > > > 	1. Is the port already owned by some DPDK entity(in
> > > > > > > 	early control
> > > > > > path)?
> > > > > > > 	2. If yes, Who is the owner?  If the answer to the first
> > > > > > > 	question is no, the current entity can take the
> > > > > > > 	ownership without any lock(1 thread).  If the answer to
> > > > > > > 	the first question is yes, you can recognize the owner
> > > > > > > 	and take decisions accordingly, sometimes you can decide
> > > > > > > 	to use the port because you logically know what the
> > > > > > > 	current owner does and you can be logically synchronized
> > > > > > > 	with it, sometimes you can just leave this port because
> > > > > > > 	you have not any
> > deal with  this owner.
> > > > > >
> > > > > > If the goal is just to have an ability to recognize is that
> > > > > > device is managed by another device (failsafe, bonding,
> > > > > > etc.), then I think all we need is a pointer to
> > > > > > rte_eth_dev_data of the owner (NULL
> > > > would mean no owner).
> > > > >
> > > > > I think string is better than a pointer from the next reasons:
> > > > > 1. It is more human friendly than pointers for debug and
> > > > > printing.  2. it is flexible and allows to forward logical
> > > > > owner message to other DPDK
> > > > entities.
> > > > >
> > > > > > Also I think if we'd like to introduce that mechanism, then
> > > > > > it needs to be - mandatory (control API just don't allow
> > > > > > changes to the device configuration if caller is not an
> > > > > > owner).
> > > > >
> > > > > But what if 2 DPDK entities should manage the same port \
> > > > > using it and they
> > > > are synchronized?
> > > > >
> > > > > > - transparent to the user (no API changes).
> > > > >
> > > > > For now, there is not API change but new suggested API to use
> > > > > for port
> > > > iteration.
> > > > >
> > > > > >  - set/get owner ops need to be atomic if we want this
> > > > > >  mechanism to be usable for MP.
> > > > >
> > > > > But also without atomic this mechanism is usable in MP.  For
> > > > > example: PRIMARY application can set its owner with string
> > > > > "primary A".  SECONDARY process (which attach to the ports
> > > > > only after the primary
> > > > created them )is not allowed to set owner(As you can see in the
> > > > code) but it can read the owner string and see that the port
> > > > owner is the primary application.
> > > > > The "A" can just sign specific port type to the SECONDARY that
> > > > > this port
> > > > works with logic A which means, for example, primary should send
> > > > the packets and secondary should receive the packets.
> > > > >
> > > > But thats just the point, the operations that you are describing
> > > > are not atomic at all.  If the primary process is interrupted
> > > > during its setting of a ports owner ship after its read the
> > > > current owner field, its entirely possible for the secondary
> > > > proces to set the owner as itself, and then have the primary
> > > > process set it back.  Without locking, its just broken.  I know
> > > > that the dpdk likes to say its lockless, because it has no
> > > > locks, but here we're just kicking the can
> > down the road, by making the application add the locks for the
> > library.
> > > >
> > > > Neil
> > > >
> > > As I wrote before, and as you can see in the code, a secondary
> > > process should not take ownership of ports.
> > But you allow for it, and if you do, you should write your API to be
> > safe for such operations.
> 
> Please look at the code again: a secondary process cannot take
> ownership, I don't allow it!  Actually, I think it should not be
> restricted like that, since no other ethdev configuration has such a
> limitation.
> 
> > > Any port configuration (for example port creation and release) is
> > > not internally synchronized between the primary and secondary
> > > processes, so I don't see any reason to synchronize port ownership.
> > Yes, and I've never agreed with that design point either, because
> > the fact of the matter is that one of two things must be true in
> > relation to port configuration:
> > 
> > 1) Either multiple processes will attempt to read/change port
> > configuration/ownership
> > 
> > or
> > 
> > 2) port ownership is implicitly given to a single task (be it a
> > primary or secondary task), and said ownership is therefore
> > implicitly known by the application
> > 
> > Either situation may be true depending on the details of the
> > application being built, but regardless, if (2) is true, then this
> > api isn't really needed at all, because the application implicitly
> > has been designed to have a port be owned by a given task. 
> 
> Here I think you are missing something: port ownership is not mainly
> for processes as DPDK entities.  The entity can also be a PMD, a
> library, or an application in the same process.  For MP it is nice to
> have: the secondary can just see the primary's owners and decide
> accordingly (please read my answer to Konstantin above).
> 
> > If (1) is true, then all the locking required to access
> > the data of port ownership needs to be adhered to.
> > 
> 
> And so do all the port configurations!  I think that is beyond the
> scope of this thread.
> 
> 
> > I understand that you are taking Thomas' words to mean that all
> > paths are lockless in the DPDK, and that is true as a statement of
> > fact (in that the DPDK doesn't synchronize access to internal data).
> > What it does do, is leave that locking as an exercize for the
> > downstream consumer of the library.  While I don't agree with it, I
> > can see that such an argument is valid for hot paths such as
> > transmission and reception where you may perhaps want to minimize
> > your locking by assigning a single task to do that work, but port
> > configuration and ownership isn't a hot path.  If you mean to allow
> > potentially multiple tasks to access configuration (including port
> > ownership), be it frequently or just occasionaly, why are you making
> > a downstream developer recognize the need for locking here outside
> > the library?  It just seems like very bad general practice to me.
> > 
> > > If the primary and secondary processes want to manage (configure)
> > > the same port at the same time, they must be synchronized by the
> > > applications, so this is the case for port ownership too (actually
> > > I don't think this synchronization is realistic because many
> > > configurations of the port are not in shared memory).
> > Yes, it is the case, my question is, why?  We're not talking about a
> > time sensitive execution path here.  And by avoiding it you're just
> > making work that has to be repeated by every downstream consumer.
> 
> I think you are suggesting making all ethdev configuration race-safe,
> which is beyond the scope of this thread.  The current ethdev
> implementation leaves race management to applications, so port
> ownership, like any other port configuration, should be designed the
> same way.
> 
> > 
One key difference, though, being that port ownership itself could be
used to manage the thread-safety of the ethdev configuration. It's also
a little different from other APIs in that I find it hard to come up
with a scenario where it would be very useful to an application without
also having some form of locking present in it. For other config/control
APIs we can have the control plane APIs work without locks e.g. by
having a single designated thread/process manage all configuration
updates. However, as Neil points out, in such a scenario, the ownership
concept doesn't provide any additional benefit so can be skipped
completely. I'd view it a bit like the reference counting of mbufs -
we can provide a lockless/non-atomic version, but for just about every
real use-case, you want the atomic version.
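
To make the "atomic version" concrete, here is a minimal sketch of an
ownership claim done with a compare-and-swap, so that a check and a set
cannot interleave. It is only an illustration under assumptions: the
struct port_owner_slot type, the NO_OWNER value and the port_* helper
names are hypothetical and are not taken from the patch; it uses plain
C11 atomics rather than any DPDK primitive.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_OWNER 0u  /* hypothetical: owner id 0 means "unowned" */

/* Hypothetical per-port record; not the layout used by the patch. */
struct port_owner_slot {
	_Atomic uint16_t owner_id;
};

/* Claim the port only if it is currently unowned (single atomic step). */
static bool port_try_own(struct port_owner_slot *slot, uint16_t id)
{
	uint16_t expected = NO_OWNER;

	return atomic_compare_exchange_strong(&slot->owner_id, &expected, id);
}

/* Release the port only if the caller is the current owner. */
static bool port_release(struct port_owner_slot *slot, uint16_t id)
{
	uint16_t expected = id;

	return atomic_compare_exchange_strong(&slot->owner_id, &expected,
					      NO_OWNER);
}

/* Read the current owner id; never observes a half-written value. */
static uint16_t port_get_owner(struct port_owner_slot *slot)
{
	return atomic_load(&slot->owner_id);
}
```

With something like this, two entities racing on port_try_own() cannot
both succeed, which is the property the non-atomic get-then-set sequence
lacks. It only covers a numeric id; an owner name string would need
additional protection.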

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-03 13:46             ` Matan Azrad
  2017-12-04 16:01               ` Neil Horman
@ 2017-12-05 11:12               ` Ananyev, Konstantin
  2017-12-05 11:44                 ` Ananyev, Konstantin
  2017-12-05 11:47                 ` Thomas Monjalon
  1 sibling, 2 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-05 11:12 UTC (permalink / raw)
  To: Matan Azrad, Neil Horman, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev

Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Sunday, December 3, 2017 1:47 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi Konstantine
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Sent: Sunday, December 3, 2017 1:10 PM
> > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> >
> >
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Sunday, December 3, 2017 8:05 AM
> > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > Hi
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > Hello Matan, Neil,
> > > > >
> > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > some operations and should be useful to several subsystems.
> > > > >
> > > > > This patch could certainly be sub-divided however, and your
> > > > > current
> > > > > 1/5 should probably come after this one.
> > > > >
> > > > > Some comments inline.
> > > > >
> > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > Making it explicit is better from the next reasons:
> > > > > > > 1. It may be convenient for multi-process applications to know
> > which
> > > > > > >    process is in charge of a port.
> > > > > > > 2. A library could work on top of a port.
> > > > > > > 3. A port can work on top of another port.
> > > > > > >
> > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > We need to check that the user is not trying to use a port
> > > > > > > which is already managed by fail-safe.
> > > > > > >
> > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > multiple management of a device by different DPDK entities.
> > > > > > >
> > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > > between two identical entity instances and the owner name can be
> > any name.
> > > > > > > The name helps to logically recognize the owner by different
> > > > > > > DPDK entities and allows easy debug.
> > > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > can manage the port or not.
> > > > > > >
> > > > > > > The current ethdev internal port management is not affected by
> > > > > > > this feature.
> > > > > > >
> > > > >
> > > > > The internal port management is not affected, but the external
> > > > > interface is, however. In order to respect port ownership,
> > > > > applications are forced to modify their port iterator, as shown by
> > > > > your
> > > > testpmd patch.
> > > > >
> > > > > I think it would be better to modify the current
> > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > > introduce a default owner that would represent the application
> > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > specific additional configuration should this default subset of ethdev be
> > divided.
> > > > >
> > > > > This would make this evolution seamless for applications, at no
> > > > > cost to the complexity of the design.
> > > > >
> > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > >
> > > > > >
> > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > ownership on a port, while another is checking it on another cpu
> > > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > It also doesn't protect against the possibility of multiple
> > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > part of Thomas's origional statement regarding port ownership
> > > > > > (he noted that the lockless design implied only a single thread
> > > > > > should be allowed to poll
> > > > for receive or make configuration changes at a time.
> > > > > >
> > > > > > Neil
> > > > > >
> > > > >
> > > > > Isn't this race already there for any configuration operation /
> > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > Why should port ownership be designed differently from other DPDK
> > > > components?
> > > > >
> > > > Yes, but that doesn't mean it should exist in purpituity, nor does
> > > > it mean that your new api should contain it as well.
> > > >
> > > > > Embedding checks for port ownership within operations will force
> > > > > everyone to bear their costs, even those not interested in using
> > > > > this API. These checks should be kept outside, within the entity
> > > > > claiming ownership of the port, in the form of using the proper
> > > > > port iterator IMO.
> > > > >
> > > > No.  At the very least, you need to make the API itself exclusive.
> > > > That is to say, you should at least ensure that a port ownership get
> > > > call doesn't race with a port ownership set call.  It seems
> > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > >
> > > > Neil
> > > >
> > > Neil,
> > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > thread (or synchronized DPDK entity).
> > > So all the port management includes port ownership shouldn't be
> > > synchronized, i.e. locks are not needed.
> > > If some application want to do dual thread port management, the
> > > responsibility to synchronize the port ownership or any other port
> > > management is on this application.
> > > Port ownership doesn't come to allow synchronized management of the
> > > port by two DPDK entities in parallel, it is just nice way to answer next
> > questions:
> > > 	1. Is the port already owned by some DPDK entity(in early control
> > path)?
> > > 	2. If yes, Who is the owner?
> > > If the answer to the first question is no, the current entity can take
> > > the ownership without any lock(1 thread).
> > > If the answer to the first question is yes, you can recognize the
> > > owner and take decisions accordingly, sometimes you can decide to use
> > > the port because you logically know what the current owner does and
> > > you can be logically synchronized with it, sometimes you can just
> > > leave this port because you have not any deal with  this owner.
> >
> > If the goal is just to have an ability to recognize is that device is managed by
> > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > to rte_eth_dev_data of the owner (NULL would mean no owner).
> 
> I think string is better than a pointer from the next reasons:
> 1. It is more human friendly than pointers for debug and printing.

We can have a function that takes an owner pointer and produces a nicely
formatted text explanation: "owned by fail-safe device at port X" or so.
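
A rough sketch of such a helper, just to show the idea. The struct
dev_data type here is a hypothetical stand-in with two fields (the real
rte_eth_dev_data is much larger), and owner_describe() is an invented
name, not an existing ethdev function.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the owner's device data. */
struct dev_data {
	uint16_t port_id;
	const char *drv_name;	/* e.g. "fail-safe" */
};

/*
 * Write a human-readable owner description into buf.
 * owner == NULL means the port has no owner.
 * Returns the value returned by snprintf().
 */
static int owner_describe(const struct dev_data *owner, char *buf, size_t len)
{
	if (owner == NULL)
		return snprintf(buf, len, "not owned");
	return snprintf(buf, len, "owned by %s device at port %u",
			owner->drv_name, (unsigned int)owner->port_id);
}
```

The debug/printing use case is then covered without storing any string
in the port data itself.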

> 2. it is flexible and allows to forward logical owner message to other DPDK entities.

Hmm, and why do you want to do that?
There are a dozen well-defined IPC mechanisms in the POSIX world, so why do we need to create
a new one?
Especially considering how limited and error-prone the new one is.

> 
> > Also I think if we'd like to introduce that mechanism, then it needs to be
> > - mandatory (control API just don't allow changes to the device configuration
> > if caller is not an owner).
> 
> But what if 2 DPDK entities should manage the same port \ using it and they are synchronized?

You mean 2 DPDK processes (primary/secondary), right?
As you mention below, ownership can be set only by the primary.
So from the perspective of synchronizing access to the device between multiple processes,
it seems useless anyway.
What I am talking about is synchronizing access to the low-level device from
different high-level entities.
Let's say we have 2 failsafe devices (or 2 bonded devices):
that mechanism will help to ensure that only one of them can own the device.
Again, if a user by mistake tries to manage a device that is owned by a failsafe device,
they won't be able to do that.

> 
> > - transparent to the user (no API changes).
> 
> For now, there is not API change but new suggested API to use for port iteration.

Sorry, I probably wasn't clear here.
What I meant: this API to set/get ownership should be sort of internal to the ethdev layer.
Let's say it would be used by a failsafe/bonding (or any other compound) device that needs
to own/manage several low-level devices.
So in a normal situation the user wouldn't need to use that API directly at all.

> 
> >  - set/get owner ops need to be atomic if we want this mechanism to be
> > usable for MP.
> 
> But also without atomic this mechanism is usable in MP.
> For example:
> PRIMARY application can set its owner with string "primary A".
> SECONDARY process (which attach to the ports only after the primary created them )is not allowed to set owner(As you can see in the code)
> but it can read the owner string and see that the port owner is the primary application.
> The "A" can just sign specific port type to the SECONDARY that this port works with logic A which means, for example, primary should send
> the packets and secondary should receive the packets.

Even if the secondary process is not allowed to modify that string, it might decide to read it at the moment
when the primary one decides to change it again (clear/set owner).
In that situation the secondary will end up either reading junk or just crashing.
But anyway, as I said above, I don't think it is a good idea to have strings here and
use them as an IPC mechanism.

Konstantin



> >
> >
> >
> >
> >
> > >
> > > > > > > ---
> > > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > > ++++++++++++++++++++++++++++++++
> > > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> > +++++++++++++++++++++++
> > > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > index 6a0c9f9..af639a1 100644
> > > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > > SW lock. This PMD feature found in som
> > > > > > >
> > > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > > capability probing details.
> > > > > > >
> > > > > > > -Device Identification and Configuration
> > > > > > > +Device Identification, Ownership  and Configuration
> > > > > > >  ---------------------------------------
> > > > > > >
> > > > > > >  Device Identification
> > > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > > are
> > > > assigned two other identifiers:
> > > > > > >  *   A port name used to designate the port in console messages, for
> > > > administration or debugging purposes.
> > > > > > >      For ease of use, the port name includes the port index.
> > > > > > >
> > > > > > > +Port Ownership
> > > > > > > +~~~~~~~~~~~~~
> > > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > > +entity
> > > > (application, library, PMD, process, etc).
> > > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > > +allows to
> > > > set/remove/get a port owner by DPDK entities.
> > > > > > > +Allowing this should prevent any multiple management of
> > > > > > > +Ethernet
> > > > port by different entities.
> > > > > > > +
> > > > > > > +.. note::
> > > > > > > +
> > > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > > + port owner
> > > > before using it or to set the port owner to prevent others from using it.
> > > > > > > +
> > > > > > >  Device Configuration
> > > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > > >
> > > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > > @@ -71,6 +71,7 @@
> > > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > > +static uint16_t rte_eth_next_owner_id =
> > RTE_ETH_DEV_NO_OWNER
> > > > + 1;
> > > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > > >
> > > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > > struct rte_eth_dev *
> > > > > > >  	if (eth_dev == NULL)
> > > > > > >  		return -EINVAL;
> > > > > > >
> > > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > > +rte_eth_dev_owner));
> > > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > > >  	return 0;
> > > > > > >  }
> > > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > > >  		return 1;
> > > > > > >  }
> > > > > > >
> > > > > > > +static int
> > > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> > &&
> > > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> > owner_id);
> > > > > > > +		return 0;
> > > > > > > +	}
> > > > > > > +	return 1;
> > > > > > > +}
> > > > > > > +
> > > > > > > +uint16_t
> > > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > > +owner_id) {
> > > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > > +	       (rte_eth_devices[port_id].state !=
> > RTE_ETH_DEV_ATTACHED ||
> > > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > > +		port_id++;
> > > > > > > +
> > > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > > +
> > > > > > > +	return port_id;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > cannot own
> > > > ports.\n");
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> > number of
> > > > Ethernet port owners.\n");
> > > > > > > +		return -EUSERS;
> > > > > > > +	}
> > > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > +	int ret;
> > > > > > > +
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > cannot own
> > > > ports.\n");
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > > +		return -EINVAL;
> > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > > +	    port_owner->id != owner->id) {
> > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > +			"Cannot set owner to port %d already owned
> > by
> > > > %s_%05d.\n",
> > > > > > > +			port_id, port_owner->name, port_owner-
> > >id);
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	ret = snprintf(port_owner->name,
> > > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > > +		       owner->name);
> > > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > > +		memset(port_owner->name, 0,
> > > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > > +		return -EINVAL;
> > > > > > > +	}
> > > > > > > +	port_owner->id = owner->id;
> > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> > port_id,
> > > > > > > +			    owner->name, owner->id);
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +int
> > > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > +uint16_t
> > > > > > > +owner_id) {
> > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > +		return -EINVAL;
> > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > +	if (port_owner->id != owner_id) {
> > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > > different owner id %5d.\n",
> > > > > > > +			port_id, port_owner->name, port_owner-
> > >id,
> > > > owner_id);
> > > > > > > +		return -EPERM;
> > > > > > > +	}
> > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > > removed.\n", port_id,
> > > > > > > +			port_owner->name, port_owner->id);
> > > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > > +	return 0;
> > > > > > > +}
> > > > > > > +
> > > > > > > +void
> > > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > > +	uint16_t p;
> > > > > > > +
> > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > +		return;
> > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > > +			    "%05d identifier has removed.\n",
> > owner_id); }
> > > > > > > +
> > > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > > +uint16_t port_id) {
> > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > > RTE_ETH_DEV_NO_OWNER)
> > > > > > > +		return NULL;
> > > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > > +}
> > > > > > > +
> > > > > > >  int
> > > > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > > > a/lib/librte_ether/rte_ethdev.h
> > > > > > > b/lib/librte_ether/rte_ethdev.h index 341c2d6..f54c26d 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > > >
> > > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > > >
> > > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > > +
> > > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > > +
> > > > > > > +struct rte_eth_dev_owner {
> > > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> > owner
> > > > name. */
> > > > > > > +};
> > > > > > > +
> > > > > > >  /**
> > > > > > >   * @internal
> > > > > > >   * The data part, with no function pointers, associated with
> > > > > > > each
> > > > ethernet device.
> > > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > > >  	/**< VLAN filter configuration. */
> > > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > > >  };
> > > > > > >
> > > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > > >
> > > > > > >
> > > > > > >  /**
> > > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > > + *
> > > > > > > + * @param port_id
> > > > > > > + *   The id of the next possible valid owned port.
> > > > > > > + * @param	owner_id
> > > > > > > + *  The owner identifier.
> > > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > > + ownerless
> > > > ports.
> > > > > > > + * @return
> > > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > > there is none.
> > > > > > > + */
> > > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > > +uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > > +specific
> > > > owner.
> > > > > > > + */
> > > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Get a new unique owner identifier.
> > > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > > +only one DPDK entity
> > > > > > > + * to avoid multiple management of device by different entities.
> > > > > > > + *
> > > > > > > + * @param	owner_id
> > > > > > > + *   Owner identifier pointer.
> > > > > > > + * @return
> > > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Set an Ethernet device owner.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The identifier of the port to own.
> > > > > > > + * @param	owner
> > > > > > > + *  The owner pointer.
> > > > > > > + * @return
> > > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The identifier of port to make ownerless.
> > > > > > > + * @param	owner
> > > > > > > + *  The owner identifier.
> > > > > > > + * @return
> > > > > > > + *  0 on success, negative errno value on error.
> > > > > > > + */
> > > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > +uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > > owner.
> > > > > > > + *
> > > > > > > + * @param	owner
> > > > > > > + *  The owner identifier.
> > > > > > > + */
> > > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > > + * Get the owner of an Ethernet device.
> > > > > > > + *
> > > > > > > + * @param	port_id
> > > > > > > + *  The port identifier.
> > > > > > > + * @return
> > > > > > > + *  NULL when the device is ownerless, else the device owner
> > pointer.
> > > > > > > + */
> > > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > > +uint16_t port_id);
> > > > > > > +
> > > > > > > +/**
> > > > > > >   * Get the total number of Ethernet devices that have been
> > > > successfully
> > > > > > >   * initialized by the matching Ethernet driver during the PCI
> > > > > > > probing
> > > > phase
> > > > > > >   * and that are available for applications to use. These
> > > > > > > devices must be diff --git
> > > > > > > a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > index e9681ac..7d07edb 100644
> > > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > > >
> > > > > > >  } DPDK_17.08;
> > > > > > >
> > > > > > > +DPDK_18.02 {
> > > > > > > +	global:
> > > > > > > +
> > > > > > > +	rte_eth_find_next_owned_by;
> > > > > > > +	rte_eth_dev_owner_new;
> > > > > > > +	rte_eth_dev_owner_set;
> > > > > > > +	rte_eth_dev_owner_remove;
> > > > > > > +	rte_eth_dev_owner_delete;
> > > > > > > +	rte_eth_dev_owner_get;
> > > > > > > +
> > > > > > > +} DPDK_17.11;
> > > > > > > +
> > > > > > >  EXPERIMENTAL {
> > > > > > >  	global:
> > > > > > >
> > > > > > > --
> > > > > > > 1.8.3.1
> > > > > > >
> > > > > > >
> > > > >
> > > > > --
> > > > > Gaëtan Rivet
> > > > > 6WIND
> > > > >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:12               ` Ananyev, Konstantin
@ 2017-12-05 11:44                 ` Ananyev, Konstantin
  2017-12-05 11:53                   ` Thomas Monjalon
  2017-12-05 11:47                 ` Thomas Monjalon
  1 sibling, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-05 11:44 UTC (permalink / raw)
  To: Ananyev, Konstantin, Matan Azrad, Neil Horman, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Tuesday, December 5, 2017 11:12 AM
> To: Matan Azrad <matan@mellanox.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Sunday, December 3, 2017 1:47 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> >
> > Hi Konstantine
> >
> > > -----Original Message-----
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Sent: Sunday, December 3, 2017 1:10 PM
> > > To: Matan Azrad <matan@mellanox.com>; Neil Horman
> > > <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > Subject: RE: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > >
> > >
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > Sent: Sunday, December 3, 2017 8:05 AM
> > > > To: Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > Hi
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > Sent: Friday, December 1, 2017 2:10 PM
> > > > > To: Gaëtan Rivet <gaetan.rivet@6wind.com>
> > > > > Cc: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Jingjing Wu <jingjing.wu@intel.com>;
> > > > > dev@dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > > >
> > > > > On Thu, Nov 30, 2017 at 02:24:43PM +0100, Gaëtan Rivet wrote:
> > > > > > Hello Matan, Neil,
> > > > > >
> > > > > > I like the port ownership concept. I think it is needed to clarify
> > > > > > some operations and should be useful to several subsystems.
> > > > > >
> > > > > > This patch could certainly be sub-divided however, and your
> > > > > > current
> > > > > > 1/5 should probably come after this one.
> > > > > >
> > > > > > Some comments inline.
> > > > > >
> > > > > > On Thu, Nov 30, 2017 at 07:36:11AM -0500, Neil Horman wrote:
> > > > > > > On Tue, Nov 28, 2017 at 11:57:58AM +0000, Matan Azrad wrote:
> > > > > > > > The ownership of a port is implicit in DPDK.
> > > > > > > > Making it explicit is better from the next reasons:
> > > > > > > > 1. It may be convenient for multi-process applications to know
> > > which
> > > > > > > >    process is in charge of a port.
> > > > > > > > 2. A library could work on top of a port.
> > > > > > > > 3. A port can work on top of another port.
> > > > > > > >
> > > > > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > > > > We need to check that the user is not trying to use a port
> > > > > > > > which is already managed by fail-safe.
> > > > > > > >
> > > > > > > > Add ownership mechanism to DPDK Ethernet devices to avoid
> > > > > > > > multiple management of a device by different DPDK entities.
> > > > > > > >
> > > > > > > > A port owner is built from owner id(number) and owner
> > > > > > > > name(string) while the owner id must be unique to distinguish
> > > > > > > > between two identical entity instances and the owner name can be
> > > any name.
> > > > > > > > The name helps to logically recognize the owner by different
> > > > > > > > DPDK entities and allows easy debug.
> > > > > > > > Each DPDK entity can allocate an owner unique identifier and
> > > > > > > > can use it and its preferred name to owns valid ethdev ports.
> > > > > > > > Each DPDK entity can get any port owner status to decide if it
> > > > > > > > can manage the port or not.
> > > > > > > >
> > > > > > > > The current ethdev internal port management is not affected by
> > > > > > > > this feature.
> > > > > > > >
> > > > > >
> > > > > > The internal port management is not affected, but the external
> > > > > > interface is, however. In order to respect port ownership,
> > > > > > applications are forced to modify their port iterator, as shown by
> > > > > > your
> > > > > testpmd patch.
> > > > > >
> > > > > > I think it would be better to modify the current
> > > > > > RTE_ETH_FOREACH_DEV to call RTE_FOREACH_DEV_OWNED_BY, and
> > > > > > introduce a default owner that would represent the application
> > > > > > itself (probably with the ID 0 and an owner string ""). Only with
> > > > > > specific additional configuration should this default subset of ethdev be
> > > divided.
> > > > > >
> > > > > > This would make this evolution seamless for applications, at no
> > > > > > cost to the complexity of the design.
> > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > >
> > > > > > >
> > > > > > > This seems fairly racy.  What if one thread attempts to set
> > > > > > > ownership on a port, while another is checking it on another cpu
> > > > > > > in parallel.  It doesn't seem like it will protect against that at all.
> > > > > > > It also doesn't protect against the possibility of multiple
> > > > > > > threads attempting to poll for rx in parallel, which I think was
> > > > > > > part of Thomas's origional statement regarding port ownership
> > > > > > > (he noted that the lockless design implied only a single thread
> > > > > > > should be allowed to poll
> > > > > for receive or make configuration changes at a time.
> > > > > > >
> > > > > > > Neil
> > > > > > >
> > > > > >
> > > > > > Isn't this race already there for any configuration operation /
> > > > > > polling function? The DPDK arch is expecting applications to solve it.
> > > > > > Why should port ownership be designed differently from other DPDK
> > > > > components?
> > > > > >
> > > > > Yes, but that doesn't mean it should exist in purpituity, nor does
> > > > > it mean that your new api should contain it as well.
> > > > >
> > > > > > Embedding checks for port ownership within operations will force
> > > > > > everyone to bear their costs, even those not interested in using
> > > > > > this API. These checks should be kept outside, within the entity
> > > > > > claiming ownership of the port, in the form of using the proper
> > > > > > port iterator IMO.
> > > > > >
> > > > > No.  At the very least, you need to make the API itself exclusive.
> > > > > That is to say, you should at least ensure that a port ownership get
> > > > > call doesn't race with a port ownership set call.  It seems
> > > > > rediculous to just leave that sort of locking as an exercize to the user.
> > > > >
> > > > > Neil
> > > > >
> > > > Neil,
> > > > As Thomas mentioned, a DPDK port is designed to be managed by only one
> > > > thread (or synchronized DPDK entity).
> > > > So all the port management includes port ownership shouldn't be
> > > > synchronized, i.e. locks are not needed.
> > > > If some application want to do dual thread port management, the
> > > > responsibility to synchronize the port ownership or any other port
> > > > management is on this application.
> > > > Port ownership doesn't come to allow synchronized management of the
> > > > port by two DPDK entities in parallel, it is just nice way to answer next
> > > questions:
> > > > 	1. Is the port already owned by some DPDK entity(in early control
> > > path)?
> > > > 	2. If yes, Who is the owner?
> > > > If the answer to the first question is no, the current entity can take
> > > > the ownership without any lock(1 thread).
> > > > If the answer to the first question is yes, you can recognize the
> > > > owner and take decisions accordingly, sometimes you can decide to use
> > > > the port because you logically know what the current owner does and
> > > > you can be logically synchronized with it, sometimes you can just
> > > > leave this port because you have not any deal with  this owner.
> > >
> > > If the goal is just to have an ability to recognize is that device is managed by
> > > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> >
> > I think string is better than a pointer from the next reasons:
> > 1. It is more human friendly than pointers for debug and printing.
> 
> We can have a function that would take an owner pointer and produce nice
> pretty formatted text explanation: "owned by fail-safe device at port X" or so.
> 
> > 2. it is flexible and allows to forward logical owner message to other DPDK entities.
> 
> Hmm and why do you want to do that?
> There are dozen well defined IPC mechanisms in POSIX world, why do we need to create
> a new one?
> Especially considering how limited and error prone then new one is.
> 
> >
> > > Also I think if we'd like to introduce that mechanism, then it needs to be
> > > - mandatory (control API just don't allow changes to the device configuration
> > > if caller is not an owner).
> >
> > But what if 2 DPDK entities should manage the same port \ using it and they are synchronized?
> 
> You mean 2 DPDK processes (primary/secondary) right?
> As you mentioned below - ownership could be set only by primary.
> So from the perspective of synchronizing access to the device between multiple processes -
> it seems useless anyway.
> What I am talking about is about synchronizing access to the low level device from
> different high-level entities.
> Let say if we have 2 failsafe devices (or 2 bonded devices) -
> that mechanism will help to ensure that only one of them can own the device.
> Again if user by mistake will try to manage device that is owned by failsafe device -
> he wouldn't be able to do that.
> 
> >
> > > - transparent to the user (no API changes).
> >
> > For now, there is not API change but new suggested API to use for port iteration.
> 
> Sorry, I probably wasn't clear here.
> What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> Let say it would be used for failsafe/bonding (any other compound) device that needs
> to own/manage several low-level devices.
> So in normal situation user wouldn't need to use that API directly at all.
> 
> >
> > >  - set/get owner ops need to be atomic if we want this mechanism to be
> > > usable for MP.
> >
> > But also without atomic this mechanism is usable in MP.
> > For example:
> > PRIMARY application can set its owner with string "primary A".
> > SECONDARY process (which attach to the ports only after the primary created them )is not allowed to set owner(As you can see in the
> code)
> > but it can read the owner string and see that the port owner is the primary application.
> > The "A" can just sign specific port type to the SECONDARY that this port works with logic A which means, for example, primary should
> send
> > the packets and secondary should receive the packets.
> 
> Even if secondary process is not allowed to modify that string, it might decide to read it at the moment
> when primary one will decide to change it again (clear/set owner).
> In that situation secondary will end-up either reading a junk or just crash.
> But anyway as I said above - I don't think it is a good idea to have a strings here and
> use them as IPC mechanism.

Just forgot to mention: I don't think it is a good idea to disallow a secondary process from setting the owner.
Let's say in a secondary process I have a few tap/ring/pcap devices.
Why shouldn't it be allowed to unite them under a bonding device and make that device own them?
That's why I think get/set owner had better be atomic.
If the owner is just a pointer, the get operation will be atomic by nature, and
set could be implemented just by CAS.
Konstantin 
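
A minimal sketch of the pointer-plus-CAS idea, for illustration only. It uses
plain C11 atomics rather than DPDK's own atomic helpers, and struct port_data
and the port_owner_*() names are hypothetical stand-ins, not ethdev API:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_eth_dev_data; only the owner field matters. */
struct port_data {
	_Atomic(struct port_data *) owner; /* NULL means "no owner" */
};

/* Claim 'port' for 'new_owner'; succeeds only if the port is ownerless. */
static bool
port_owner_set(struct port_data *port, struct port_data *new_owner)
{
	struct port_data *expected = NULL;

	assert(port != NULL && new_owner != NULL);
	/* Single CAS: NULL -> new_owner, fails if somebody else got there first. */
	return atomic_compare_exchange_strong(&port->owner, &expected, new_owner);
}

/* Release 'port'; succeeds only if 'owner' is the current owner. */
static bool
port_owner_unset(struct port_data *port, struct port_data *owner)
{
	struct port_data *expected = owner;

	assert(port != NULL && owner != NULL);
	return atomic_compare_exchange_strong(&port->owner, &expected,
					      (struct port_data *)NULL);
}

/* Read the current owner; a pointer-sized atomic load cannot be torn. */
static struct port_data *
port_owner_get(struct port_data *port)
{
	assert(port != NULL);
	return atomic_load(&port->owner);
}
```

With this shape, two compound devices racing to claim the same low-level port
cannot both succeed, and a reader sees either the old owner or the new one,
never a half-written value. Whether the stored pointer is meaningful in another
process depends on the shared memory being mapped at the same address there,
which this sketch does not address.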

> 
> Konstantin
> 
> 
> 
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > > > > > ---
> > > > > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  12 +++-
> > > > > > > >  lib/librte_ether/rte_ethdev.c           | 121
> > > > > ++++++++++++++++++++++++++++++++
> > > > > > > >  lib/librte_ether/rte_ethdev.h           |  86
> > > +++++++++++++++++++++++
> > > > > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++++
> > > > > > > >  4 files changed, 230 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > index 6a0c9f9..af639a1 100644
> > > > > > > > --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > > > > > > > @@ -156,7 +156,7 @@ concurrently on the same tx queue without
> > > > > > > > SW lock. This PMD feature found in som
> > > > > > > >
> > > > > > > >  See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE``
> > > > > capability probing details.
> > > > > > > >
> > > > > > > > -Device Identification and Configuration
> > > > > > > > +Device Identification, Ownership  and Configuration
> > > > > > > >  ---------------------------------------
> > > > > > > >
> > > > > > > >  Device Identification
> > > > > > > > @@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports
> > > > > > > > are
> > > > > assigned two other identifiers:
> > > > > > > >  *   A port name used to designate the port in console messages, for
> > > > > administration or debugging purposes.
> > > > > > > >      For ease of use, the port name includes the port index.
> > > > > > > >
> > > > > > > > +Port Ownership
> > > > > > > > +~~~~~~~~~~~~~
> > > > > > > > +The Ethernet devices ports can be owned by a single DPDK
> > > > > > > > +entity
> > > > > (application, library, PMD, process, etc).
> > > > > > > > +The ownership mechanism is controlled by ethdev APIs and
> > > > > > > > +allows to
> > > > > set/remove/get a port owner by DPDK entities.
> > > > > > > > +Allowing this should prevent any multiple management of
> > > > > > > > +Ethernet
> > > > > port by different entities.
> > > > > > > > +
> > > > > > > > +.. note::
> > > > > > > > +
> > > > > > > > +    It is the DPDK entity responsibility either to check the
> > > > > > > > + port owner
> > > > > before using it or to set the port owner to prevent others from using it.
> > > > > > > > +
> > > > > > > >  Device Configuration
> > > > > > > >  ~~~~~~~~~~~~~~~~~~~~
> > > > > > > >
> > > > > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > > > > b/lib/librte_ether/rte_ethdev.c index 2d754d9..836991e 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > > > > @@ -71,6 +71,7 @@
> > > > > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > > > > +static uint16_t rte_eth_next_owner_id =
> > > RTE_ETH_DEV_NO_OWNER
> > > > > + 1;
> > > > > > > >  static uint8_t eth_dev_last_created_port;
> > > > > > > >
> > > > > > > >  /* spinlock for eth device callbacks */ @@ -278,6 +279,7 @@
> > > > > > > > struct rte_eth_dev *
> > > > > > > >  	if (eth_dev == NULL)
> > > > > > > >  		return -EINVAL;
> > > > > > > >
> > > > > > > > +	memset(&eth_dev->data->owner, 0, sizeof(struct
> > > > > > > > +rte_eth_dev_owner));
> > > > > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > > > >  	return 0;
> > > > > > > >  }
> > > > > > > > @@ -293,6 +295,125 @@ struct rte_eth_dev *
> > > > > > > >  		return 1;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static int
> > > > > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > > > +	    (rte_eth_next_owner_id != RTE_ETH_DEV_NO_OWNER
> > > &&
> > > > > > > > +	    rte_eth_next_owner_id <= owner_id)) {
> > > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> > > owner_id);
> > > > > > > > +		return 0;
> > > > > > > > +	}
> > > > > > > > +	return 1;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +uint16_t
> > > > > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > > > > > > > +owner_id) {
> > > > > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > > > > +	       (rte_eth_devices[port_id].state !=
> > > RTE_ETH_DEV_ATTACHED ||
> > > > > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > > > > +		port_id++;
> > > > > > > > +
> > > > > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > > > > +		return RTE_MAX_ETHPORTS;
> > > > > > > > +
> > > > > > > > +	return port_id;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > > cannot own
> > > > > ports.\n");
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	if (rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum
> > > number of
> > > > > Ethernet port owners.\n");
> > > > > > > > +		return -EUSERS;
> > > > > > > > +	}
> > > > > > > > +	*owner_id = rte_eth_next_owner_id++;
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > > +		      const struct rte_eth_dev_owner *owner) {
> > > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > > +	if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > > > > > > > +		RTE_PMD_DEBUG_TRACE("Not primary process
> > > cannot own
> > > > > ports.\n");
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner->id))
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > > > > +	    port_owner->id != owner->id) {
> > > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > > +			"Cannot set owner to port %d already owned
> > > by
> > > > > %s_%05d.\n",
> > > > > > > > +			port_id, port_owner->name, port_owner-
> > > >id);
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	ret = snprintf(port_owner->name,
> > > > > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > > > > +		       owner->name);
> > > > > > > > +	if (ret < 0 || ret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > > > > > > +		memset(port_owner->name, 0,
> > > > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	}
> > > > > > > > +	port_owner->id = owner->id;
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> > > port_id,
> > > > > > > > +			    owner->name, owner->id);
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +int
> > > > > > > > +rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > > +uint16_t
> > > > > > > > +owner_id) {
> > > > > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > > +		return -EINVAL;
> > > > > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > > > > +	if (port_owner->id != owner_id) {
> > > > > > > > +		RTE_LOG(ERR, EAL,
> > > > > > > > +			"Cannot remove port %d owner %s_%05d by
> > > > > different owner id %5d.\n",
> > > > > > > > +			port_id, port_owner->name, port_owner-
> > > >id,
> > > > > owner_id);
> > > > > > > > +		return -EPERM;
> > > > > > > > +	}
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> > > > > removed.\n", port_id,
> > > > > > > > +			port_owner->name, port_owner->id);
> > > > > > > > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > > > > > > > +	return 0;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +void
> > > > > > > > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > > > > > > > +	uint16_t p;
> > > > > > > > +
> > > > > > > > +	if (!rte_eth_is_valid_owner_id(owner_id))
> > > > > > > > +		return;
> > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, owner_id)
> > > > > > > > +		memset(&rte_eth_devices[p].data->owner, 0,
> > > > > > > > +		       sizeof(struct rte_eth_dev_owner));
> > > > > > > > +	RTE_PMD_DEBUG_TRACE("All port owners owned by "
> > > > > > > > +			    "%05d identifier has removed.\n",
> > > owner_id); }
> > > > > > > > +
> > > > > > > > +const struct rte_eth_dev_owner * rte_eth_dev_owner_get(const
> > > > > > > > +uint16_t port_id) {
> > > > > > > > +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> > > > > > > > +	if (rte_eth_devices[port_id].data->owner.id ==
> > > > > RTE_ETH_DEV_NO_OWNER)
> > > > > > > > +		return NULL;
> > > > > > > > +	return &rte_eth_devices[port_id].data->owner;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > >  int
> > > > > > > >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > > > > > > > a/lib/librte_ether/rte_ethdev.h
> > > > > > > > b/lib/librte_ether/rte_ethdev.h index 341c2d6..f54c26d 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > > > > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> > > > > > > >
> > > > > > > >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> > > > > > > >
> > > > > > > > +#define RTE_ETH_DEV_NO_OWNER 0
> > > > > > > > +
> > > > > > > > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > > > > > > > +
> > > > > > > > +struct rte_eth_dev_owner {
> > > > > > > > +	uint16_t id; /**< The owner unique identifier. */
> > > > > > > > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The
> > > owner
> > > > > name. */
> > > > > > > > +};
> > > > > > > > +
> > > > > > > >  /**
> > > > > > > >   * @internal
> > > > > > > >   * The data part, with no function pointers, associated with
> > > > > > > > each
> > > > > ethernet device.
> > > > > > > > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> > > > > > > >  	int numa_node;  /**< NUMA node connection */
> > > > > > > >  	struct rte_vlan_filter_conf vlan_filter_conf;
> > > > > > > >  	/**< VLAN filter configuration. */
> > > > > > > > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> > > > > > > >  };
> > > > > > > >
> > > > > > > >  /** Device supports link state interrupt */ @@ -1846,6
> > > > > > > > +1856,82 @@ struct rte_eth_dev_data {
> > > > > > > >
> > > > > > > >
> > > > > > > >  /**
> > > > > > > > + * Iterates over valid ethdev ports owned by a specific owner.
> > > > > > > > + *
> > > > > > > > + * @param port_id
> > > > > > > > + *   The id of the next possible valid owned port.
> > > > > > > > + * @param	owner_id
> > > > > > > > + *  The owner identifier.
> > > > > > > > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid
> > > > > > > > + ownerless
> > > > > ports.
> > > > > > > > + * @return
> > > > > > > > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if
> > > > > there is none.
> > > > > > > > + */
> > > > > > > > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const
> > > > > > > > +uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Macro to iterate over all enabled ethdev ports owned by a
> > > > > > > > +specific
> > > > > owner.
> > > > > > > > + */
> > > > > > > > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > > > > > > > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > > > > > > > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > > > > > > > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Get a new unique owner identifier.
> > > > > > > > + * An owner identifier is used to owns Ethernet devices by
> > > > > > > > +only one DPDK entity
> > > > > > > > + * to avoid multiple management of device by different entities.
> > > > > > > > + *
> > > > > > > > + * @param	owner_id
> > > > > > > > + *   Owner identifier pointer.
> > > > > > > > + * @return
> > > > > > > > + *   Negative errno value on error, 0 on success.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Set an Ethernet device owner.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The identifier of the port to own.
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner pointer.
> > > > > > > > + * @return
> > > > > > > > + *  Negative errno value on error, 0 on success.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > > > > +			  const struct rte_eth_dev_owner *owner);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Remove Ethernet device owner to make the device ownerless.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The identifier of port to make ownerless.
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner identifier.
> > > > > > > > + * @return
> > > > > > > > + *  0 on success, negative errno value on error.
> > > > > > > > + */
> > > > > > > > +int rte_eth_dev_owner_remove(const uint16_t port_id, const
> > > > > > > > +uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Remove owner from all Ethernet devices owned by a specific
> > > > > owner.
> > > > > > > > + *
> > > > > > > > + * @param	owner
> > > > > > > > + *  The owner identifier.
> > > > > > > > + */
> > > > > > > > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * Get the owner of an Ethernet device.
> > > > > > > > + *
> > > > > > > > + * @param	port_id
> > > > > > > > + *  The port identifier.
> > > > > > > > + * @return
> > > > > > > > + *  NULL when the device is ownerless, else the device owner
> > > pointer.
> > > > > > > > + */
> > > > > > > > +const struct rte_eth_dev_owner *rte_eth_dev_owner_get(const
> > > > > > > > +uint16_t port_id);
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > >   * Get the total number of Ethernet devices that have been
> > > > > successfully
> > > > > > > >   * initialized by the matching Ethernet driver during the PCI
> > > > > > > > probing
> > > > > phase
> > > > > > > >   * and that are available for applications to use. These
> > > > > > > > devices must be diff --git
> > > > > > > > a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > index e9681ac..7d07edb 100644
> > > > > > > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > > > > > > @@ -198,6 +198,18 @@ DPDK_17.11 {
> > > > > > > >
> > > > > > > >  } DPDK_17.08;
> > > > > > > >
> > > > > > > > +DPDK_18.02 {
> > > > > > > > +	global:
> > > > > > > > +
> > > > > > > > +	rte_eth_find_next_owned_by;
> > > > > > > > +	rte_eth_dev_owner_new;
> > > > > > > > +	rte_eth_dev_owner_set;
> > > > > > > > +	rte_eth_dev_owner_remove;
> > > > > > > > +	rte_eth_dev_owner_delete;
> > > > > > > > +	rte_eth_dev_owner_get;
> > > > > > > > +
> > > > > > > > +} DPDK_17.11;
> > > > > > > > +
> > > > > > > >  EXPERIMENTAL {
> > > > > > > >  	global:
> > > > > > > >
> > > > > > > > --
> > > > > > > > 1.8.3.1
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > > --
> > > > > > Gaëtan Rivet
> > > > > > 6WIND
> > > > > >


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:12               ` Ananyev, Konstantin
  2017-12-05 11:44                 ` Ananyev, Konstantin
@ 2017-12-05 11:47                 ` Thomas Monjalon
  2017-12-05 15:13                   ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-05 11:47 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Neil Horman, Gaëtan Rivet, Wu, Jingjing, dev

Hi,

I will give my view on locking and synchronization in a different email.
Let's discuss the API here.

05/12/2017 12:12, Ananyev, Konstantin:
> From: Matan Azrad [mailto:matan@mellanox.com]
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]

> > > If the goal is just to have an ability to recognize is that device is managed by
> > > another device (failsafe, bonding, etc.),  then I think all we need is a pointer
> > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > 
> > I think string is better than a pointer from the next reasons:
> > 1. It is more human friendly than pointers for debug and printing.
> 
> We can have a function that would take an owner pointer and produce nice
> pretty formatted text explanation: "owned by fail-safe device at port X" or so.

I don't think it is possible or convenient to have such a function.
Keep in mind that the owner can be an application thread.
If you prefer using a single function pointer (may help for
atomic implementation), we can allocate an owner structure containing
a name as a string to identify the owner in human-readable format.
Then we just have to store a pointer to this struct in rte_eth_dev_data.


> What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> Let say it would be used for failsafe/bonding (any other compound) device that needs
> to own/manage several low-level devices.
> So in normal situation user wouldn't need to use that API directly at all.

Again, the application may use this API to declare its ownership.
And anyway, it may be interesting from an application point of view
to be able to list every device and its internal owner.


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:44                 ` Ananyev, Konstantin
@ 2017-12-05 11:53                   ` Thomas Monjalon
  2017-12-05 14:56                     ` Bruce Richardson
  2017-12-05 14:57                     ` Ananyev, Konstantin
  0 siblings, 2 replies; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-05 11:53 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Neil Horman, Gaëtan Rivet, Wu, Jingjing, dev

05/12/2017 12:44, Ananyev, Konstantin:
> Just forgot to mention - I don't think it is a good idea to disallow a secondary process from setting the owner.

I think we all agree on that.
My initial suggestion was to use the ownership in secondary processes.
I think Matan forbade it as a first step because there is no
multi-process synchronization currently.

> Let say  in secondary process I have few tap/ring/pcap devices.
> Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
> That's why I think get/set owner better to be atomic.
> If the owner is just a pointer - in that case get operation will be atomic by nature,
> set could be implemented just by CAS.

That would be perfect.
Can we be sure that atomic operations will work correctly on shared memory?
On every architecture?


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:53                   ` Thomas Monjalon
@ 2017-12-05 14:56                     ` Bruce Richardson
  2017-12-05 14:57                     ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Bruce Richardson @ 2017-12-05 14:56 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Ananyev, Konstantin, Matan Azrad, Neil Horman, Gaëtan Rivet,
	Wu, Jingjing, dev

On Tue, Dec 05, 2017 at 12:53:36PM +0100, Thomas Monjalon wrote:
> 05/12/2017 12:44, Ananyev, Konstantin:
> > Just forgot to mention - I don't think it is a good idea to disallow a secondary process from setting the owner.
> 
> I think we all agree on that.
> My initial suggestion was to use the ownership in secondary processes.
> I think Matan forbid it as a first step because there is no
> multi-process synchronization currently.
> 
> > Let say  in secondary process I have few tap/ring/pcap devices.
> > Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
> > That's why I think get/set owner better to be atomic.
> > If the owner is just a pointer - in that case get operation will be atomic by nature,
> > set could be implemented just by CAS.
> 
> It would be perfect.
> Can we be sure that the atomic will work perfectly on shared memory?

The sharing of memory is an OS-level construct in managing page tables,
more than anything else. For atomic operations, a memory address is a
memory address, whether it is shared or private to a process.

> On every architectures?

All architectures should have an atomic compare-and-set equivalent
operation for its native pointer size. In the unlikely case we have to
support one that doesn't, we can special-case that in some other way.
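
To make the compare-and-set idea concrete, here is a minimal sketch using
C11 atomics. It is only an illustration, not the patch's API: the names
port_owner, port_slot, port_owner_claim, port_owner_release and
port_owner_get are all hypothetical, and the sketch assumes the owner is
stored as a single pointer with NULL meaning "no owner".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical owner descriptor; not the struct from the patch. */
struct port_owner {
	const char *name;
};

/* Hypothetical per-port slot; a NULL owner means "no owner". */
struct port_slot {
	_Atomic(struct port_owner *) owner;
};

/* Try to take ownership. Returns 0 on success, -1 if already owned. */
static int
port_owner_claim(struct port_slot *slot, struct port_owner *owner)
{
	struct port_owner *expected = NULL;

	assert(owner != NULL);
	/* The store happens only if the slot still holds NULL. */
	if (atomic_compare_exchange_strong(&slot->owner, &expected, owner))
		return 0;
	return -1;
}

/* Release ownership. Only the current owner can clear the slot. */
static int
port_owner_release(struct port_slot *slot, struct port_owner *owner)
{
	struct port_owner *expected = owner;

	if (atomic_compare_exchange_strong(&slot->owner, &expected, NULL))
		return 0;
	return -1;
}

/* Reading the owner is a single atomic load. */
static struct port_owner *
port_owner_get(struct port_slot *slot)
{
	return atomic_load(&slot->owner);
}
```

With this shape, two contexts racing on port_owner_claim cannot both get 0
back: the loser sees the slot already filled and fails. Note that for
multi-process use the pointer value is only meaningful if the owner struct
lives in memory mapped at the same address in every process.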


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:53                   ` Thomas Monjalon
  2017-12-05 14:56                     ` Bruce Richardson
@ 2017-12-05 14:57                     ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-05 14:57 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Neil Horman, Gaëtan Rivet, Wu, Jingjing, dev


>> Just forgot to mention - I don't think it is a good idea to disallow a secondary process from setting the owner.
 
>I think we all agree on that.
>My initial suggestion was to use the ownership in secondary processes.
>I think Matan forbid it as a first step because there is no
>multi-process synchronization currently.
 
>> Let say in secondary process I have few tap/ring/pcap devices.
>> Why it shouldn't be allowed to unite them under bonding device and make that device to own them?
>> That's why I think get/set owner better to be atomic.
>> If the owner is just a pointer - in that case get operation will be atomic by nature,
>> set could be implemented just by CAS.
 
>It would be perfect.
>Can we be sure that the atomic will work perfectly on shared memory?
>On every architectures?

I believe so, yes. How else would rte_ring and rte_mbuf work for MP? :)
Konstantin


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 11:47                 ` Thomas Monjalon
@ 2017-12-05 15:13                   ` Ananyev, Konstantin
  2017-12-05 15:49                     ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-05 15:13 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Neil Horman, Gaëtan Rivet, Wu, Jingjing, dev


Hi Thomas,

> Hi,
 
> I will give my view on locking and synchronization in a different email.
> Let's discuss about the API here.
 
> 05/12/2017 12:12, Ananyev, Konstantin:
> >> From: Matan Azrad [mailto:matan@mellanox.com]
>> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
 
> > > > If the goal is just to have an ability to recognize is that device is managed by
> > > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > > 
> > > I think string is better than a pointer from the next reasons:
> > > 1. It is more human friendly than pointers for debug and printing.
> > 
> > We can have a function that would take an owner pointer and produce nice
> > pretty formatted text explanation: "owned by fail-safe device at port X" or so.
 
> I don't think it is possible or convenient to have such function.

Why do you think it is not possible?

> Keep in mind that the owner can be an application thread.
> If you prefer using a single function pointer (may help for
> atomic implementation), we can allocate an owner structure containing
> a name as a string to identify the owner in human readable format.
> Then we just have to set the pointer of this struct to rte_eth_dev_data.

Basically you'd like to have the ability to set something other than a
pointer to rte_eth_dev_data as an owner, right?
I think this is possible too, I'm just not sure it will be useful.
 
> > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > to own/manage several low-level devices.
> > So in normal situation user wouldn't need to use that API directly at all.
 
> Again, the application may use this API to declare its ownership.

Could you explain that a bit: what would 'application declares ownership of a device' mean?
Does it mean that no other application will be allowed to do any control op on that device
until the application clears its ownership?
I.e. make sure that at any moment only one particular thread can modify the device configuration?
Or would it be purely informational, so that a second application is free to ignore it?

If it is the second one, I personally don't see much point in it.
If it is the first one, then the simplest and most straightforward way would be to
introduce a mutex (either per device or for the whole rte_eth_dev[]) and force
each control op to grab it on entry and release it on exit.

> And anyway, it may be interesting from an application point of view
> to be able to list every devices and their internal owners.

Yes, sure, the application is free to call 'get' to retrieve information etc.
What I am saying is that for normal operation the application doesn't have to call that API.
I.e. we don't need to change testpmd and other apps just because that API was introduced.

Konstantin


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 15:13                   ` Ananyev, Konstantin
@ 2017-12-05 15:49                     ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-05 15:49 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Neil Horman, Gaëtan Rivet, Wu, Jingjing, dev

05/12/2017 16:13, Ananyev, Konstantin:
> 
> Hi Thomas,
> 
> > Hi,
>  
> > I will give my view on locking and synchronization in a different email.
> > Let's discuss about the API here.
>  
> > 05/12/2017 12:12, Ananyev, Konstantin:
> > >> From: Matan Azrad [mailto:matan@mellanox.com]
> >> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
>  
> > > > > If the goal is just to have an ability to recognize is that device is managed by
> > > > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > > > 
> > > > I think string is better than a pointer from the next reasons:
> > > > 1. It is more human friendly than pointers for debug and printing.
> > > 
> > > We can have a function that would take an owner pointer and produce nice
> > > pretty formatted text explanation: "owned by fail-safe device at port X" or so.
>  
> > I don't think it is possible or convenient to have such function.
> 
> Why do you think it is not possible?

Because of applications being the owner (discussion below).

> > Keep in mind that the owner can be an application thread.
> > If you prefer using a single function pointer (may help for
> > atomic implementation), we can allocate an owner structure containing
> > a name as a string to identify the owner in human readable format.
> > Then we just have to set the pointer of this struct to rte_eth_dev_data.
> 
> Basically you'd like to have an ability to set something different then
> pointer to rte_eth_dev_data as an owner, right?

No, it can be a pointer, or an id, I don't care.

> I think this is possible too, just not sure it will useful.
>  
> > > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > > to own/manage several low-level devices.
> > > So in normal situation user wouldn't need to use that API directly at all.
>  
> > Again, the application may use this API to declare its ownership.
> 
> Could you explain that a bit: what would mean 'application declares an ownership on device'?
> Does it mean that no other application will be allowed to do any control op on that device
> till application will clear its ownership?
> I.E. make sure that at each moment only one particular thread can modify device configuration?
> Or would it be totally informal and second application will be free to ignore it?

It is informational.
It prevents a library from taking ownership of a port controlled by the app.

> If it will be the second one - I personally don't see much point in it.
> If it the first one - then simplest and most straightforward way would be -
> introduce a mutex (either per device or just per whole rte_eth_dev[]) and force
> each control op to grab it at entrance release at exit.

No, a mutex does not provide any information.

> > And anyway, it may be interesting from an application point of view
> > to be able to list every devices and their internal owners.
> 
> Yes sure application is free to call 'get' to retrieve information etc.
> What I am saying for normal operation - application don't have to call that API.
> I.E. - we don't need to change testpmd, etc. apps because that API was introduced.

Yes, the ports can stay without any owner if the application does not set one.


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05  6:08                     ` Matan Azrad
  2017-12-05 10:05                       ` Bruce Richardson
@ 2017-12-05 19:26                       ` Neil Horman
  2017-12-08 11:06                         ` Thomas Monjalon
  1 sibling, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-05 19:26 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Gaëtan Rivet, Thomas Monjalon, Wu,
	Jingjing, dev

On Tue, Dec 05, 2017 at 06:08:35AM +0000, Matan Azrad wrote:
><snip>
> Please look at the code again, secondary process cannot take ownership, I don't allow it!
> Actually, I think it must not be like that as no limitation for that in any other ethdev configurations.
> 
Sure you do.  Consider the following situation: two tasks, A and B, running
independently on separate CPUs:

TASK A                                 | TASK B
================================================================================
calls rte_eth_dev_owner_new (gets id 2)| calls rte_eth_dev_owner_new (gets id 1)
                                       |
calls rte_eth_dev_owner_set on port 1  | calls rte_eth_dev_owner_set (port 1)
                                       |
sets port_owner->id = 2                | gets removed from cpu via scheduler
                                       |
                                       | returns to continue running on cpu
                                       |
Gets interrupted immediately before    | completes rte_eth_dev_owner_set,
 return 0 statement                    |  setting port_owner->id = 1
                                       |
                                       | returns 0 from rte_eth_dev_owner_set
is scheduled back on the cpu           |
                                       |
returns 0 from rte_eth_dev_owner_set   |

In the above scenario, you've allowed two tasks to race through the ownership
set routine, and while your intended semantics indicate that task A should have
taken ownership of the port, task B actually did, and what's worse, both tasks
think they completed successfully.
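
One way to close that window, sketched here purely as an illustration, is
to make the "check current owner, then store new owner" sequence a single
critical section. The names below (struct port, port_owner_set_locked,
NO_OWNER) are hypothetical stand-ins, not the patch's code, and a plain
pthread mutex is assumed only to keep the sketch self-contained; for
primary/secondary processes the lock would have to live in shared memory
(for example a DPDK rte_spinlock_t placed next to the owner field).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define NO_OWNER 0 /* plays the role of RTE_ETH_DEV_NO_OWNER */

/* Hypothetical stand-in for the per-port owner state. */
struct port {
	pthread_mutex_t lock;
	uint16_t owner_id;
};

/*
 * Set the owner of a port. The ownership test and the store are done
 * under the same lock, so a second caller cannot slip in between them.
 * Returns 0 on success, -EPERM if another owner already holds the port.
 */
static int
port_owner_set_locked(struct port *p, uint16_t owner_id)
{
	int ret = 0;

	assert(owner_id != NO_OWNER);
	pthread_mutex_lock(&p->lock);
	if (p->owner_id != NO_OWNER && p->owner_id != owner_id)
		ret = -EPERM;
	else
		p->owner_id = owner_id;
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

In the A/B scenario above, whichever task enters the critical section
first wins, and the other gets -EPERM instead of a false success.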

I get that much of dpdk relies on the fact that the application either handles
all the locking, or architects itself so that a single thread of execution (or
at least only one thread at a time) is responsible for packet processing and
port configuration. If you are assuming the former, you've done a disservice to
the downstream consumer, because the locking is the intricate part of this
operation (i.e. you are requiring that the developer figure out what granularity
of locking is required such that you don't serialize too many operations that
don't need it, while maintaining enough serialization that you don't corrupt the
data that they wanted the api to manage in the first place).  If you assert that
the application should only be using a single task to do these operations to
begin with, then this API isn't overly useful, because there's only one thread
pushing data into the library and, by definition, it implicitly owns all the
ports, or at least knows which ports it shouldn't mess with (e.g. subordinate
ports in a failsafe device).

> > > Any port configuration (for example port creation and release) is not
> > internally synchronized between the primary to secondary processes so I
> > don't see any reason to synchronize port ownership.
> > Yes, and I've never agreed with that design point either, because the fact of
> > the matter is that one of two things must be true in relation to port
> > configuration:
> > 
> > 1) Either multiple processes will attempt to read/change port
> > configuration/ownership
> > 
> > or
> > 
> > 2) port ownership is implicitly given to a single task (be it a primary or
> > secondary task), and said ownership is therefore implicitly known by the
> > application
> > 
> > Either situation may be true depending on the details of the application being
> > built, but regardless, if (2) is true, then this api isn't really needed at all,
> > because the application implicitly has been designed to have a port be
> > owned by a given task. 
> 
> Here I think you miss something, the port ownership is not mainly for process DPDK entities,
> The entity can be also PMD, library, application in same process.
> For MP it is nice to have, the secondary just can see the primary owners and take decision accordingly (please read my answer to Konstatin above). 
> 
But the former is just a case of the latter (in fact worse).  If you expect
another execution context out of the control of the application to query this
API, then you are in an MP situation, and definitely need to provide mutually
exclusive access to your data.  If instead you expect all other execution
contexts to suspend (or otherwise refrain from accessing this API) while a
single task makes changes, then by definition you have already had to create
some synchronization between those contexts and are capable of informing them
of the new ownership scheme.

The bottom line is, either you expect multiple access, or you don't.  If you
expect multiple access, and you believe that said accesses are not completely
under the control of your application, you need to protect those accesses
against one another.  If you don't expect multiple access (or expect your
application to architect itself to enforce single access), then you've created
an environment in which the single accessor already has to know all the
information regarding port ownership that you store in the API.

>  If (1) is true, then all the locking required to access
> > the data of port ownership needs to be adhered to.
> > 
> 
> And all the port configurations!
> I think it is behind to this thread.
> 
> 
> > I understand that you are taking Thomas' words to mean that all paths are
> > lockless in the DPDK, and that is true as a statement of fact (in that the DPDK
> > doesn't synchronize access to internal data).  What it does do, is leave that
> > locking as an exercize for the downstream consumer of the library.  While I
> > don't agree with it, I can see that such an argument is valid for hot paths such
> > as transmission and reception where you may perhaps want to minimize
> > your locking by assigning a single task to do that work, but port configuration
> > and ownership isn't a hot path.  If you mean to allow potentially multiple
> > tasks to access configuration (including port ownership), be it frequently or
> > just occasionaly, why are you making a downstream developer recognize the
> > need for locking here outside the library?  It just seems like very bad general
> > practice to me.
> > 
> > > If the primary-secondary process want to manage(configure) same port in
> > same time, they must be synchronized by the applications, so this is the case
> > in port ownership too (actually I don't think this synchronization is realistic
> > because many configurations of the port are not in shared memory).
> > Yes, it is the case, my question is, why?  We're not talking about a time
> > sensitive execution path here.  And by avoiding it you're just making work
> > that has to be repeated by every downstream consumer.
> 
> I think you suggest to make all the ethdev configuration race safe, it is behind to this thread.
> Current ethdev implementation leave the race management to applications, so port ownership as any other port configurations should be designed in the same method.
> 
I would like to make ethdev configuration race safe, but that's a fight I've
lost.  Still, the fact that some parts of the dpdk leave race safety to the
user doesn't mean you have to.  It's just silly not to here.
We're not talking about a hot execution path here, we're talking about one time
/ rare configuration changes.  To insert a lock (or other lightweight atomic
operation) costs nothing in terms of execution time, and saves downstream
consumers significant time figuring out what the best mutual exclusion strategy
is.  Why wouldn't you do that?
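
To make the cost argument concrete, here is a minimal sketch of what an
internally locked ownership table could look like.  Everything in it is
hypothetical (the ex_ names, the table size, the use of 0 as "no owner"); it is
not the ethdev code or the patch under review.  It also only covers threads in
one process; sharing it between primary and secondary processes would need the
table and a process-shared lock placed in shared memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define EX_MAX_PORTS 32	/* hypothetical table size */
#define EX_NO_OWNER  0	/* hypothetical "unowned" marker */

/* Hypothetical ownership table, one owner id per port. */
static uint16_t ex_port_owner[EX_MAX_PORTS];
static pthread_mutex_t ex_owner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take ownership; returns 0 on success, -1 if another owner holds the port. */
int ex_port_owner_set(uint16_t port_id, uint16_t owner_id)
{
	int ret = -1;

	assert(port_id < EX_MAX_PORTS && owner_id != EX_NO_OWNER);
	pthread_mutex_lock(&ex_owner_lock);
	if (ex_port_owner[port_id] == EX_NO_OWNER ||
	    ex_port_owner[port_id] == owner_id) {
		ex_port_owner[port_id] = owner_id;
		ret = 0;
	}
	pthread_mutex_unlock(&ex_owner_lock);
	return ret;
}

/* Read the current owner under the same lock. */
uint16_t ex_port_owner_get(uint16_t port_id)
{
	uint16_t owner;

	assert(port_id < EX_MAX_PORTS);
	pthread_mutex_lock(&ex_owner_lock);
	owner = ex_port_owner[port_id];
	pthread_mutex_unlock(&ex_owner_lock);
	return owner;
}
```

The lock is only taken on ownership changes and queries, which are
configuration-time operations, so nothing is added to the rx/tx paths.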

Neil

> > 
> > Neil
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 19:26                       ` Neil Horman
@ 2017-12-08 11:06                         ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-08 11:06 UTC (permalink / raw)
  To: Neil Horman
  Cc: Matan Azrad, Ananyev, Konstantin, Gaëtan Rivet, Wu, Jingjing, dev

05/12/2017 20:26, Neil Horman:
> I get that much of dpdk relies on the fact that the application either handles
> all the locking, or architects itself so that a single thread of execution (or
> at least only one thread at a time), is responsible for packet processing and
> port configuration.

Yes, for now, configuration is synchronized at application level.
It is a constraint for applications.
It may be an issue for multi-process applications,
or for libraries aiming some device management.

The first obvious bug to fix is race in device allocation.
It will become more real with hotplug support.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-05 10:05                       ` Bruce Richardson
@ 2017-12-08 11:35                         ` Thomas Monjalon
  2017-12-08 12:31                           ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-08 11:35 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Matan Azrad, Neil Horman, Ananyev, Konstantin, Gaëtan Rivet,
	Wu, Jingjing, dev

05/12/2017 11:05, Bruce Richardson:
> > I think you suggest to make all the ethdev configuration race safe, it
> > is behind to this thread.  Current ethdev implementation leave the
> > race management to applications, so port ownership as any other port
> > configurations should be designed in the same method.
> 
> One key difference, though, being that port ownership itself could be
> used to manage the thread-safety of the ethdev configuration. It's also
> a little different from other APIs in that I find it hard to come up
> with a scenario where it would be very useful to an application without
> also having some form of locking present in it. For other config/control
> APIs we can have the control plane APIs work without locks e.g. by
> having a single designated thread/process manage all configuration
> updates. However, as Neil points out, in such a scenario, the ownership
> concept doesn't provide any additional benefit so can be skipped
> completely. I'd view it a bit like the reference counting of mbufs -
> we can provide a lockless/non-atomic version, but for just about every
> real use-case, you want the atomic version.

I think we need to clearly describe what the thread-safety policy is
in DPDK (especially in ethdev as a first example).
Let's start with obvious things:

	1/ A queue is not protected against races with multiple Rx or Tx
			- no planned change, for performance reasons
	2/ The list of devices is racy
			- to be fixed with atomics
	3/ The configuration of different devices is thread-safe
			- the configurations are different per-device
	4/ The configuration of a given device is racy
			- can be managed by the owner of the device
	5/ The device ownership is racy
			- to be fixed with atomics

What am I missing?

I am also wondering whether the device ownership can be a separate
library used in several device class interfaces?
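
As an illustration of what "to be fixed with atomics" could mean for item 5/,
here is a rough sketch using C11 atomics.  The cas_ names, the table size and
the convention that 0 means "no owner" are assumptions made for the example,
not an existing ethdev interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define CAS_MAX_PORTS 32	/* hypothetical table size */
#define CAS_NO_OWNER  0		/* hypothetical "unowned" marker */

/* Static storage is zero-initialized, so every port starts unowned. */
static _Atomic uint16_t cas_owner[CAS_MAX_PORTS];

/* Claim a port only if it is currently unowned; 0 on success, -1 otherwise. */
int cas_owner_claim(uint16_t port_id, uint16_t owner_id)
{
	uint16_t expected = CAS_NO_OWNER;

	assert(port_id < CAS_MAX_PORTS && owner_id != CAS_NO_OWNER);
	return atomic_compare_exchange_strong(&cas_owner[port_id],
					      &expected, owner_id) ? 0 : -1;
}

/* Release a port only if the caller is the current owner. */
int cas_owner_release(uint16_t port_id, uint16_t owner_id)
{
	uint16_t expected = owner_id;

	assert(port_id < CAS_MAX_PORTS && owner_id != CAS_NO_OWNER);
	return atomic_compare_exchange_strong(&cas_owner[port_id],
					      &expected, CAS_NO_OWNER) ? 0 : -1;
}
```

The compare-and-swap makes each claim or release a single indivisible step, so
two entities racing to claim the same port cannot both succeed.  It says
nothing about item 4/: serializing the configuration itself remains the
owner's job.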

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-08 11:35                         ` Thomas Monjalon
@ 2017-12-08 12:31                           ` Neil Horman
  2017-12-21 17:06                             ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-08 12:31 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Bruce Richardson, Matan Azrad, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing, dev

On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> 05/12/2017 11:05, Bruce Richardson:
> > > I think you suggest to make all the ethdev configuration race safe, it
> > > is behind to this thread.  Current ethdev implementation leave the
> > > race management to applications, so port ownership as any other port
> > > configurations should be designed in the same method.
> > 
> > One key difference, though, being that port ownership itself could be
> > used to manage the thread-safety of the ethdev configuration. It's also
> > a little different from other APIs in that I find it hard to come up
> > with a scenario where it would be very useful to an application without
> > also having some form of locking present in it. For other config/control
> > APIs we can have the control plane APIs work without locks e.g. by
> > having a single designated thread/process manage all configuration
> > updates. However, as Neil points out, in such a scenario, the ownership
> > concept doesn't provide any additional benefit so can be skipped
> > completely. I'd view it a bit like the reference counting of mbufs -
> > we can provide a lockless/non-atomic version, but for just about every
> > real use-case, you want the atomic version.
> 
> I think we need to clearly describe what is the tread-safety policy
> in DPDK (especially in ethdev as a first example).
> Let's start with obvious things:
> 
> 	1/ A queue is not protected for races with multiple Rx or Tx
> 			- no planned change because of performance purpose
> 	2/ The list of devices is racy
> 			- to be fixed with atomics
> 	3/ The configuration of different devices is thread-safe
> 			- the configurations are different per-device
> 	4/ The configuration of a given device is racy
> 			- can be managed by the owner of the device
> 	5/ The device ownership is racy
> 			- to be fixed with atomics
> 
> What am I missing?
> 
There is fan out to consider here:

1) Is device configuration racy with ownership?  That is to say, can I change
ownership of a device safely while another thread that currently owns it
modifies its configuration?

2) Is device configuration racy with device addition/removal?  That is to say,
can one thread remove a device while another configures it?

There are probably many subsystem interactions that need to be addressed here.

Neil

> I am also wondering whether the device ownership can be a separate
> library used in several device class interfaces?
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-08 12:31                           ` Neil Horman
@ 2017-12-21 17:06                             ` Thomas Monjalon
  2017-12-21 17:43                               ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-21 17:06 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, Bruce Richardson, Matan Azrad, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

08/12/2017 13:31, Neil Horman:
> On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > 05/12/2017 11:05, Bruce Richardson:
> > > > I think you suggest to make all the ethdev configuration race safe, it
> > > > is behind to this thread.  Current ethdev implementation leave the
> > > > race management to applications, so port ownership as any other port
> > > > configurations should be designed in the same method.
> > > 
> > > One key difference, though, being that port ownership itself could be
> > > used to manage the thread-safety of the ethdev configuration. It's also
> > > a little different from other APIs in that I find it hard to come up
> > > with a scenario where it would be very useful to an application without
> > > also having some form of locking present in it. For other config/control
> > > APIs we can have the control plane APIs work without locks e.g. by
> > > having a single designated thread/process manage all configuration
> > > updates. However, as Neil points out, in such a scenario, the ownership
> > > concept doesn't provide any additional benefit so can be skipped
> > > completely. I'd view it a bit like the reference counting of mbufs -
> > > we can provide a lockless/non-atomic version, but for just about every
> > > real use-case, you want the atomic version.
> > 
> > I think we need to clearly describe what is the tread-safety policy
> > in DPDK (especially in ethdev as a first example).
> > Let's start with obvious things:
> > 
> > 	1/ A queue is not protected for races with multiple Rx or Tx
> > 			- no planned change because of performance purpose
> > 	2/ The list of devices is racy
> > 			- to be fixed with atomics
> > 	3/ The configuration of different devices is thread-safe
> > 			- the configurations are different per-device
> > 	4/ The configuration of a given device is racy
> > 			- can be managed by the owner of the device
> > 	5/ The device ownership is racy
> > 			- to be fixed with atomics
> > 
> > What am I missing?
> > 
> There is fan out to consider here:
> 
> 1) Is device configuration racy with ownership?  That is to say, can I change
> ownership of a device safely while another thread that currently owns it
> modifies its configuration?

If an entity steals ownership from another one, either it is agreed earlier,
or it is done by a central authority.
When it is acked that ownership can be moved, there should not be any
configuration in progress.
So it is more a communication issue than a race.

> 2) Is device configuration racy with device addition/removal?  That is to say,
> can one thread remove a device while another configures it.

I think it is the same as two threads configuring the same device
(item 4/ above). It can be managed with port ownership.

> There are probably many subsystem interactions that need to be addressed here.
> 
> Neil
> 
> > I am also wondering whether the device ownership can be a separate
> > library used in several device class interfaces?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-21 17:06                             ` Thomas Monjalon
@ 2017-12-21 17:43                               ` Neil Horman
  2017-12-21 19:37                                 ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-21 17:43 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Bruce Richardson, Matan Azrad, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> 08/12/2017 13:31, Neil Horman:
> > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > 05/12/2017 11:05, Bruce Richardson:
> > > > > I think you suggest to make all the ethdev configuration race safe, it
> > > > > is behind to this thread.  Current ethdev implementation leave the
> > > > > race management to applications, so port ownership as any other port
> > > > > configurations should be designed in the same method.
> > > > 
> > > > One key difference, though, being that port ownership itself could be
> > > > used to manage the thread-safety of the ethdev configuration. It's also
> > > > a little different from other APIs in that I find it hard to come up
> > > > with a scenario where it would be very useful to an application without
> > > > also having some form of locking present in it. For other config/control
> > > > APIs we can have the control plane APIs work without locks e.g. by
> > > > having a single designated thread/process manage all configuration
> > > > updates. However, as Neil points out, in such a scenario, the ownership
> > > > concept doesn't provide any additional benefit so can be skipped
> > > > completely. I'd view it a bit like the reference counting of mbufs -
> > > > we can provide a lockless/non-atomic version, but for just about every
> > > > real use-case, you want the atomic version.
> > > 
> > > I think we need to clearly describe what is the tread-safety policy
> > > in DPDK (especially in ethdev as a first example).
> > > Let's start with obvious things:
> > > 
> > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > 			- no planned change because of performance purpose
> > > 	2/ The list of devices is racy
> > > 			- to be fixed with atomics
> > > 	3/ The configuration of different devices is thread-safe
> > > 			- the configurations are different per-device
> > > 	4/ The configuration of a given device is racy
> > > 			- can be managed by the owner of the device
> > > 	5/ The device ownership is racy
> > > 			- to be fixed with atomics
> > > 
> > > What am I missing?
> > > 
> > There is fan out to consider here:
> > 
> > 1) Is device configuration racy with ownership?  That is to say, can I change
> > ownership of a device safely while another thread that currently owns it
> > modifies its configuration?
> 
> If an entity steals ownership to another one, either it is agreed earlier,
> or it is done by a central authority.
> When it is acked that ownership can be moved, there should not be any
> configuration in progress.
> So it is more a communication issue than a race.
> 
But if that's the case (specifically that mutual exclusion between port ownership
and configuration is an exercise left to an application developer), then port
ownership itself is largely meaningless within the dpdk, because the notion of
who owns the port needs to be codified within the application anyway.


> > 2) Is device configuration racy with device addition/removal?  That is to say,
> > can one thread remove a device while another configures it.
> 
> I think it is the same as two threads configuring the same device
> (item 4/ above). It can be managed with port ownership.
> 
Only if you assert that the application is required to have the port owner be
responsible for the port's deletion, which we can say, but that leads to the
issue above again.


> > There are probably many subsystem interactions that need to be addressed here.
> > 
> > Neil
> > 
> > > I am also wondering whether the device ownership can be a separate
> > > library used in several device class interfaces?
> 
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-21 17:43                               ` Neil Horman
@ 2017-12-21 19:37                                 ` Matan Azrad
  2017-12-21 20:14                                   ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-12-21 19:37 UTC (permalink / raw)
  To: Neil Horman, Thomas Monjalon
  Cc: dev, Bruce Richardson, Ananyev, Konstantin, Gaëtan Rivet,
	Wu, Jingjing

Hi

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, December 21, 2017 7:43 PM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>; Matan
> Azrad <matan@mellanox.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> > 08/12/2017 13:31, Neil Horman:
> > > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > > 05/12/2017 11:05, Bruce Richardson:
> > > > > > I think you suggest to make all the ethdev configuration race
> > > > > > safe, it is behind to this thread.  Current ethdev
> > > > > > implementation leave the race management to applications, so
> > > > > > port ownership as any other port configurations should be designed
> in the same method.
> > > > >
> > > > > One key difference, though, being that port ownership itself
> > > > > could be used to manage the thread-safety of the ethdev
> > > > > configuration. It's also a little different from other APIs in
> > > > > that I find it hard to come up with a scenario where it would be
> > > > > very useful to an application without also having some form of
> > > > > locking present in it. For other config/control APIs we can have
> > > > > the control plane APIs work without locks e.g. by having a
> > > > > single designated thread/process manage all configuration
> > > > > updates. However, as Neil points out, in such a scenario, the
> > > > > ownership concept doesn't provide any additional benefit so can
> > > > > be skipped completely. I'd view it a bit like the reference
> > > > > counting of mbufs - we can provide a lockless/non-atomic version,
> but for just about every real use-case, you want the atomic version.
> > > >
> > > > I think we need to clearly describe what is the tread-safety
> > > > policy in DPDK (especially in ethdev as a first example).
> > > > Let's start with obvious things:
> > > >
> > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > 			- no planned change because of performance
> purpose
> > > > 	2/ The list of devices is racy
> > > > 			- to be fixed with atomics
> > > > 	3/ The configuration of different devices is thread-safe
> > > > 			- the configurations are different per-device
> > > > 	4/ The configuration of a given device is racy
> > > > 			- can be managed by the owner of the device
> > > > 	5/ The device ownership is racy
> > > > 			- to be fixed with atomics
> > > >
> > > > What am I missing?
> > > >

Thank you, Thomas, for laying this out in order.
Actually, port ownership is a good opportunity to redefine the synchronization rules in ethdev :)

> > > There is fan out to consider here:
> > >
> > > 1) Is device configuration racy with ownership?  That is to say, can
> > > I change ownership of a device safely while another thread that
> > > currently owns it modifies its configuration?
> >
> > If an entity steals ownership to another one, either it is agreed
> > earlier, or it is done by a central authority.
> > When it is acked that ownership can be moved, there should not be any
> > configuration in progress.
> > So it is more a communication issue than a race.
> >
> But if thats the case (specifically that mutual exclusion between port
> ownership and configuration is an exercize left to an application developer),
> then port ownership itself is largely meaningless within the dpdk, because
> the notion of who owns the port needs to be codified within the application
> anyway.
> 

Bruce, as I understand it, by default only the dpdk entity that successfully took ownership of a port can configure the device. If other dpdk entities want to configure it too, they must be synchronized with the port owner, although it is not recommended after the port ownership integration.

So, for example, if the dpdk entity is an application, the application should take ownership of the port and manage the synchronization of this port's configuration between the application threads and its EAL host thread callbacks. No other dpdk entity should configure the same port, because they should fail when they try to take ownership of the same port too.
Each dpdk entity which wants to take ownership must be able to synchronize the port configuration at its level.

> 
> > > 2) Is device configuration racy with device addition/removal?  That
> > > is to say, can one thread remove a device while another configures it.
> >
> > I think it is the same as two threads configuring the same device
> > (item 4/ above). It can be managed with port ownership.
> >
> Only if you assert that application is required to have the owning port be
> responsible for the ports deletion, which we can say, but that leads to the
> issue above again.
> 
> 
As Thomas said in item 2, the port creation must be synchronized by ethdev, and we need to add it there.
I think it is obvious that port removal must be done only by the port owner.


I think we need to add synchronization for port ownership management in V2 of this patch, and add port creation synchronization in ethdev in a separate patch, to fulfill the new rules Thomas suggested.

What do you think?

> > > There are probably many subsystem interactions that need to be
> addressed here.
> > >
> > > Neil
> > >
> > > > I am also wondering whether the device ownership can be a separate
> > > > library used in several device class interfaces?
> >
> >
> >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-21 19:37                                 ` Matan Azrad
@ 2017-12-21 20:14                                   ` Neil Horman
  2017-12-21 21:57                                     ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-21 20:14 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, dev, Bruce Richardson, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> Hi
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, December 21, 2017 7:43 PM
> > To: Thomas Monjalon <thomas@monjalon.net>
> > Cc: dev@dpdk.org; Bruce Richardson <bruce.richardson@intel.com>; Matan
> > Azrad <matan@mellanox.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 06:06:48PM +0100, Thomas Monjalon wrote:
> > > 08/12/2017 13:31, Neil Horman:
> > > > On Fri, Dec 08, 2017 at 12:35:18PM +0100, Thomas Monjalon wrote:
> > > > > 05/12/2017 11:05, Bruce Richardson:
> > > > > > > I think you suggest to make all the ethdev configuration race
> > > > > > > safe, it is behind to this thread.  Current ethdev
> > > > > > > implementation leave the race management to applications, so
> > > > > > > port ownership as any other port configurations should be designed
> > in the same method.
> > > > > >
> > > > > > One key difference, though, being that port ownership itself
> > > > > > could be used to manage the thread-safety of the ethdev
> > > > > > configuration. It's also a little different from other APIs in
> > > > > > that I find it hard to come up with a scenario where it would be
> > > > > > very useful to an application without also having some form of
> > > > > > locking present in it. For other config/control APIs we can have
> > > > > > the control plane APIs work without locks e.g. by having a
> > > > > > single designated thread/process manage all configuration
> > > > > > updates. However, as Neil points out, in such a scenario, the
> > > > > > ownership concept doesn't provide any additional benefit so can
> > > > > > be skipped completely. I'd view it a bit like the reference
> > > > > > counting of mbufs - we can provide a lockless/non-atomic version,
> > but for just about every real use-case, you want the atomic version.
> > > > >
> > > > > I think we need to clearly describe what is the tread-safety
> > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > Let's start with obvious things:
> > > > >
> > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > 			- no planned change because of performance
> > purpose
> > > > > 	2/ The list of devices is racy
> > > > > 			- to be fixed with atomics
> > > > > 	3/ The configuration of different devices is thread-safe
> > > > > 			- the configurations are different per-device
> > > > > 	4/ The configuration of a given device is racy
> > > > > 			- can be managed by the owner of the device
> > > > > 	5/ The device ownership is racy
> > > > > 			- to be fixed with atomics
> > > > >
> > > > > What am I missing?
> > > > >
> 
> Thank you Thomas for this order.
> Actually the port ownership is a good opportunity to redefine the synchronization rules in ethdev :)
> 
> > > > There is fan out to consider here:
> > > >
> > > > 1) Is device configuration racy with ownership?  That is to say, can
> > > > I change ownership of a device safely while another thread that
> > > > currently owns it modifies its configuration?
> > >
> > > If an entity steals ownership to another one, either it is agreed
> > > earlier, or it is done by a central authority.
> > > When it is acked that ownership can be moved, there should not be any
> > > configuration in progress.
> > > So it is more a communication issue than a race.
> > >
> > But if thats the case (specifically that mutual exclusion between port
> > ownership and configuration is an exercize left to an application developer),
> > then port ownership itself is largely meaningless within the dpdk, because
> > the notion of who owns the port needs to be codified within the application
> > anyway.
> > 
> 
> Bruce, As I understand it, only the dpdk entity who took ownership of a port successfully can configure the device by default, if other dpdk entities want to configure it too they must to be synchronized with the port owner while it is not recommended after the port ownership integration.
> 
Can you clarify what you mean by "it is not recommended after the port ownership
integration"?  I think there is consensus that the port owner must be the only
entity to operate on a port (be that configuration/frame rx/tx, or some other
operation).  Multithreaded operation on a port always means some level of
synchronization between application threads and the dpdk library, but I'm not
sure why that would be different if we introduced a more concrete notion of port
ownership via a new library.

> So, for example,  if the dpdk entity is an application, the application should take ownership of the port and manage the synchronization of this port configuration between the application threads and its EAL host thread callbacks, no other dpdk entity should configure the same port because they should fail when they try to take ownership of the same port too.
Well, failing is one good approach, yes; blocking on port relinquishment could
be another.  I'd recommend an API with the following interface:

rte_port_ownership_claim(int port_id) - blocks execution of the calling thread
until the previous owner releases ownership, then claims it and returns

rte_port_ownership_release(int port_id) - releases ownership of port, or returns
error if the port was not owned by this execution context

rte_port_ownership_try_claim(int port_id) - same as rte_port_ownership_claim,
but fails immediately instead of blocking if the port is already owned.

That would give the option for both semantics.
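
A rough sketch of how those three calls could behave, using a mutex and a
condition variable.  The blk_ names are placeholders rather than a real rte_
API, and the explicit owner argument is an assumption of this example: the
interface above takes only a port id and leaves open how the calling
execution context is identified.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define BLK_MAX_PORTS 32	/* hypothetical table size */
#define BLK_NO_OWNER  0		/* hypothetical "unowned" marker */

static int blk_owner[BLK_MAX_PORTS];
static pthread_mutex_t blk_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t blk_released = PTHREAD_COND_INITIALIZER;

/* Block until the port is unowned, then claim it. */
int blk_port_ownership_claim(int port_id, int owner)
{
	assert(port_id >= 0 && port_id < BLK_MAX_PORTS && owner != BLK_NO_OWNER);
	pthread_mutex_lock(&blk_lock);
	while (blk_owner[port_id] != BLK_NO_OWNER)
		pthread_cond_wait(&blk_released, &blk_lock);
	blk_owner[port_id] = owner;
	pthread_mutex_unlock(&blk_lock);
	return 0;
}

/* Claim the port if it is unowned, otherwise fail at once with -EBUSY. */
int blk_port_ownership_try_claim(int port_id, int owner)
{
	int ret = -EBUSY;

	assert(port_id >= 0 && port_id < BLK_MAX_PORTS && owner != BLK_NO_OWNER);
	pthread_mutex_lock(&blk_lock);
	if (blk_owner[port_id] == BLK_NO_OWNER) {
		blk_owner[port_id] = owner;
		ret = 0;
	}
	pthread_mutex_unlock(&blk_lock);
	return ret;
}

/* Release the port; -EPERM if the caller is not the current owner. */
int blk_port_ownership_release(int port_id, int owner)
{
	int ret = -EPERM;

	assert(port_id >= 0 && port_id < BLK_MAX_PORTS && owner != BLK_NO_OWNER);
	pthread_mutex_lock(&blk_lock);
	if (blk_owner[port_id] == owner) {
		blk_owner[port_id] = BLK_NO_OWNER;
		ret = 0;
		/* Wake every waiter; each rechecks its own port. */
		pthread_cond_broadcast(&blk_released);
	}
	pthread_mutex_unlock(&blk_lock);
	return ret;
}
```

A single condition variable shared by all ports keeps the sketch short; waiters
on other ports simply wake, recheck and go back to sleep.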

> Each dpdk entity which wants to take ownership must to be able to synchronize the port configuration in its level. 
Can you elaborate on what you mean by level here?  Are you envisioning a scheme
in which multiple execution contexts might own a port for various
non-conflicting purposes?


> 
> > 
> > > > 2) Is device configuration racy with device addition/removal?  That
> > > > is to say, can one thread remove a device while another configures it.
> > >
> > > I think it is the same as two threads configuring the same device
> > > (item 4/ above). It can be managed with port ownership.
> > >
> > Only if you assert that application is required to have the owning port be
> > responsible for the ports deletion, which we can say, but that leads to the
> > issue above again.
> > 
> > 
> As Thomas said in item 2 the port creation must be synchronized by ethdev and we need to add it there. 
> I think it is obvious that port removal must to be done only by the port owner.  
> 
You say that, but it's obvious to you as a developer who has looked extensively
at the code.  It may well be less so to a consumer who is not an active member
of the community, for instance one who obtains the dpdk via pre-built package.

> 
> I think we need to add synchronization for port ownership management in this patch V2 and add port creation synchronization in ethdev in separate patch to fill the new rules Thomas suggested.
I think that makes sense, yes. 

Neil

> 
> What do you think?
> 
> > > > There are probably many subsystem interactions that need to be
> > addressed here.
> > > >
> > > > Neil
> > > >
> > > > > I am also wondering whether the device ownership can be a separate
> > > > > library used in several device class interfaces?
> > >
> > >
> > >
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-21 20:14                                   ` Neil Horman
@ 2017-12-21 21:57                                     ` Matan Azrad
  2017-12-22 14:26                                       ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-12-21 21:57 UTC (permalink / raw)
  To: Neil Horman
  Cc: Thomas Monjalon, dev, Bruce Richardson, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Thursday, December 21, 2017 10:14 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > Hi
> >
<snip>
> > > > > > I think we need to clearly describe what is the tread-safety
> > > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > > Let's start with obvious things:
> > > > > >
> > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > 			- no planned change because of performance
> > > purpose
> > > > > > 	2/ The list of devices is racy
> > > > > > 			- to be fixed with atomics
> > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > 			- the configurations are different per-device
> > > > > > 	4/ The configuration of a given device is racy
> > > > > > 			- can be managed by the owner of the device
> > > > > > 	5/ The device ownership is racy
> > > > > > 			- to be fixed with atomics
> > > > > >
> > > > > > What am I missing?
> > > > > >
> >
> > Thank you Thomas for this order.
> > Actually the port ownership is a good opportunity to redefine the
> > synchronization rules in ethdev :)
> >
> > > > > There is fan out to consider here:
> > > > >
> > > > > 1) Is device configuration racy with ownership?  That is to say,
> > > > > can I change ownership of a device safely while another thread
> > > > > that currently owns it modifies its configuration?
> > > >
> > > > If an entity steals ownership to another one, either it is agreed
> > > > earlier, or it is done by a central authority.
> > > > When it is acked that ownership can be moved, there should not be
> > > > any configuration in progress.
> > > > So it is more a communication issue than a race.
> > > >
> > > But if thats the case (specifically that mutual exclusion between
> > > port ownership and configuration is an exercize left to an
> > > application developer), then port ownership itself is largely
> > > meaningless within the dpdk, because the notion of who owns the port
> > > needs to be codified within the application anyway.
> > >
> >
> > Bruce, As I understand it, only the dpdk entity who took ownership of a
> port successfully can configure the device by default, if other dpdk entities
> want to configure it too they must to be synchronized with the port owner
> while it is not recommended after the port ownership integration.
> >
> Can you clarify what you mean by "it is not recommended after the port
> ownership integration"?

Sure,
The new definition of ethdev synchronization does not recommend managing a port from 2 different dpdk entities; it can be done, but it is not recommended.
  
>  I think there is consensus that the port owner must
> be the only entitiy to operate on a port (be that configuration/frame rx/tx, or
> some other operation).

Your question above made me think that you had not understood it. How can someone who is not the port owner change the port owner?
Changing the port owner, like port configuration and port release, must be done by the owner itself, except when the port has no owner.
See the rte_eth_dev_owner_remove API.

> Multithreaded operation on a port always means
> some level of synchronization between application threads and the dpdk
> library,
Yes.
> but I'm not sure why that would be different if we introduced a more
> concrete notion of port ownership via a new library.
>

What do you mean by "new library"? A port is an ethdev instance and should be managed by ethdev.

> > So, for example, if the dpdk entity is an application, the application should
> > take ownership of the port and manage the synchronization of this port
> > configuration between the application threads and its EAL host thread
> > callbacks, no other dpdk entity should configure the same port because they
> > should fail when they try to take ownership of the same port too.

> Well, failing is one good approach, yes, blocking on port relenquishment
> could be another.  I'd recommend an API with the following interface:
> 
> rte_port_ownership_claim(int port_id) - blocks execution of the calling
> thread until the previous owner releases ownership, then claims it and
> returns
> 
> rte_port_ownership_release(int port_id) - releases ownership of port, or
> returns error if the port was not owned by this execution context
>
> rte_port_owernship_try_claim(int port_id) - same as
> rte_port_ownership_claim, but fails if the port is already owned.
> 
> That would give the option for both semantics.

I think the current APIs are better for the following reasons:
- They define clearly who the owner is.
- The owner structure includes a string, which allows better debugging and human-readable printing.
Did you read it?
As you suggested, I can add an API in V2 that waits until the port ownership is released.
 
> > Each dpdk entity which wants to take ownership must to be able to
> > synchronize the port configuration in its level.

> Can you elaborate on what you mean by level here?  Are you envisioning a
> scheme in which multiple execution contexts might own a port for various
> non-conflicting purposes?
 
Sure,
1) Application with 2 threads that both want to configure the same port:
	Level = application code.

	a. The main thread should create an owner identifier (rte_eth_dev_owner_new).
	b. The main thread should take the port ownership (rte_eth_dev_owner_set).
	c. The application should synchronize the two threads wherever their configurations conflict.
	d. When the application finishes using the port, it should release the ownership (rte_eth_dev_owner_remove).

2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses an alarm for hotplug detection, which can configure the 2 ports (from the host thread):
	Level = fail-safe code.
	a. The application starts the EAL and the fail-safe driver probing function is called.
	b. Fail-safe probes the 2 sub-devices (2 ports are created) and takes ownership of them.
	c. Fail-safe creates its own port and leaves it ownerless.
	d. Fail-safe starts the hotplug alarm mechanism.
	e. The application tries to take ownership of all ports and succeeds only for the fail-safe port.
	f. The application starts to configure the fail-safe port asynchronously to the fail-safe hotplug alarm.
	g. Fail-safe must synchronize its alarm callback code with the fail-safe configuration APIs called by the application, because both try to configure the same sub-device ports.
	h. When fail-safe is finished with the two sub-devices, it should release the ownership of their ports.
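
The take/fail/release flow shared by both scenarios can be sketched as a small self-contained model. This is only an illustration under stated assumptions: the toy_* names, the fixed-size table and the "0 means ownerless" convention are hypothetical stand-ins, not the signatures of rte_eth_dev_owner_new/set/remove from the patch, and the model is deliberately not thread-safe.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_PORTS 4
#define TOY_NO_OWNER  0 /* hypothetical convention: id 0 means "ownerless" */

/* Hypothetical owner record: numeric id plus a name for debug output. */
struct toy_owner {
	uint64_t id;
	char name[32];
};

/* Static storage is zero-initialized, so every port starts ownerless. */
static struct toy_owner toy_port_owner[TOY_MAX_PORTS];
static uint64_t toy_next_owner_id = 1;

/* Step a: hand out a fresh owner identifier. */
static uint64_t toy_owner_new(void)
{
	return toy_next_owner_id++;
}

/* Step b: take ownership; fails if the port already has an owner. */
static int toy_owner_set(unsigned int port, uint64_t id, const char *name)
{
	if (port >= TOY_MAX_PORTS || id == TOY_NO_OWNER)
		return -1;
	if (toy_port_owner[port].id != TOY_NO_OWNER)
		return -1;
	toy_port_owner[port].id = id;
	snprintf(toy_port_owner[port].name,
		 sizeof(toy_port_owner[port].name), "%s", name);
	return 0;
}

/* Step d: release ownership; only the current owner may do so. */
static int toy_owner_remove(unsigned int port, uint64_t id)
{
	if (port >= TOY_MAX_PORTS || id == TOY_NO_OWNER)
		return -1;
	if (toy_port_owner[port].id != id)
		return -1;
	memset(&toy_port_owner[port], 0, sizeof(toy_port_owner[port]));
	return 0;
}
```

In scenario 2, fail-safe would call toy_owner_set() on its sub-device ports first, so the application's later attempt on those ports returns -1 while its attempt on the ownerless fail-safe port succeeds.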

> >
> > >
> > > > > 2) Is device configuration racy with device addition/removal?
> > > > > That is to say, can one thread remove a device while another
> configures it.
> > > >
> > > > I think it is the same as two threads configuring the same device
> > > > (item 4/ above). It can be managed with port ownership.
> > > >
> > > Only if you assert that application is required to have the owning
> > > port be responsible for the ports deletion, which we can say, but
> > > that leads to the issue above again.
> > >
> > >
> > As Thomas said in item 2 the port creation must be synchronized by ethdev
> and we need to add it there.
> > I think it is obvious that port removal must to be done only by the port
> owner.
> >
> You say that, but its obvious to you as a developer who has looked
> extensively at the code.  It may well be less so to a consumer who is not an
> active member of the community, for instance one who obtains the dpdk via
> pre-built package.
>

Yes, I can understand, but the new rules should be documented, and customers should be able to adopt them easily, no?
The old way of synchronizing configuration still exists.
 
> >
> > I think we need to add synchronization for port ownership management in
> this patch V2 and add port creation synchronization in ethdev in separate
> patch to fill the new rules Thomas suggested.
> I think that makes sense, yes.
> 
> Neil
> 
> >
> > What do you think?
> >
> > > > > There are probably many subsystem interactions that need to be
> > > addressed here.
> > > > >
> > > > > Neil
> > > > >
> > > > > > I am also wondering whether the device ownership can be a
> > > > > > separate library used in several device class interfaces?
> > > >
> > > >
> > > >
> >


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-21 21:57                                     ` Matan Azrad
@ 2017-12-22 14:26                                       ` Neil Horman
  2017-12-23 22:36                                         ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2017-12-22 14:26 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, dev, Bruce Richardson, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Thursday, December 21, 2017 10:14 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > Hi
> > >
> <snip>
> > > > > > > I think we need to clearly describe what is the tread-safety
> > > > > > > policy in DPDK (especially in ethdev as a first example).
> > > > > > > Let's start with obvious things:
> > > > > > >
> > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > 			- no planned change because of performance
> > > > purpose
> > > > > > > 	2/ The list of devices is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > 			- the configurations are different per-device
> > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > 			- can be managed by the owner of the device
> > > > > > > 	5/ The device ownership is racy
> > > > > > > 			- to be fixed with atomics
> > > > > > >
> > > > > > > What am I missing?
> > > > > > >
> > >
> > > Thank you Thomas for this order.
> > > Actually the port ownership is a good opportunity to redefine the
> > > synchronization rules in ethdev :)
> > >
> > > > > > There is fan out to consider here:
> > > > > >
> > > > > > 1) Is device configuration racy with ownership?  That is to say,
> > > > > > can I change ownership of a device safely while another thread
> > > > > > that currently owns it modifies its configuration?
> > > > >
> > > > > If an entity steals ownership to another one, either it is agreed
> > > > > earlier, or it is done by a central authority.
> > > > > When it is acked that ownership can be moved, there should not be
> > > > > any configuration in progress.
> > > > > So it is more a communication issue than a race.
> > > > >
> > > > But if thats the case (specifically that mutual exclusion between
> > > > port ownership and configuration is an exercize left to an
> > > > application developer), then port ownership itself is largely
> > > > meaningless within the dpdk, because the notion of who owns the port
> > > > needs to be codified within the application anyway.
> > > >
> > >
> > > Bruce, As I understand it, only the dpdk entity who took ownership of a
> > port successfully can configure the device by default, if other dpdk entities
> > want to configure it too they must to be synchronized with the port owner
> > while it is not recommended after the port ownership integration.
> > >
> > Can you clarify what you mean by "it is not recommended after the port
> > ownership integration"?
> 
> Sure,
> The new defining of ethdev synchronization doesn't recommend to manage a port by 2 different dpdk entities, it can be done but not recommended.
>   
Ok, that's just not what you said above.  Your suggestion made it sound like you
thought that, after the integration of a port ownership model, multiple
dpdk entities should not synchronize with one another, which made no sense.

> >  I think there is consensus that the port owner must
> > be the only entitiy to operate on a port (be that configuration/frame rx/tx, or
> > some other operation).
> 
> Your question above caused me to think that you don't understand it, How can someone who is not the port owner to change the port owner?
> Changing the port owner, like port configuration and port release must be done by the owner itself except the case that there is no owner to the port.
> See the API rte_eth_dev_owner_remove.
> 
See above; I don't think your phrasing accurately reflected what you meant to
convey. Or at least that's not how I read it.

> > Multithreaded operation on a port always means
> > some level of synchronization between application threads and the dpdk
> > library,
> Yes.
>  >but I'm not sure why that would be different if we introduced a more
> > concrete notion of port ownership via a new library.
> >
> 
> What do you mean by "new library"?, port is an ethdev instance and should be managed by ethdev.
> 
I'm referring to the port ownership api that you proposed.  Apologies, I should
not have used the term "new library", but rather "new api".

>  > > So, for example,  if the dpdk entity is an application, the application should
> >> take ownership of the port and manage the synchronization of this port
> >> configuration between the application threads and its EAL host thread
> >> callbacks, no other dpdk entity should configure the same port because they
> >> should fail when they try to take ownership of the same port too.
> 
> > Well, failing is one good approach, yes, blocking on port relenquishment
> > could be another.  I'd recommend an API with the following interface:
> > 
> > rte_port_ownership_claim(int port_id) - blocks execution of the calling
> > thread until the previous owner releases ownership, then claims it and
> > returns
> > 
> > rte_port_ownership_release(int port_id) - releases ownership of port, or
> > returns error if the port was not owned by this execution context
> >
> > rte_port_owernship_try_claim(int port_id) - same as
> > rte_port_ownership_claim, but fails if the port is already owned.
> > 
> > That would give the option for both semantics.
> 
> I think the current APIs are better because of the next reasons:
> - It defines well who is the owner.
There's no reason you can't integrate some ownership nonce into the above API as
well; that's easy to add.  The relevant part is the ability to exclude those who
are not owners (that is to say, block their progress until ownership is released
by a preceding entity).
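
A minimal sketch of the claim / try_claim / release semantics with an ownership nonce, using a POSIX mutex and condition variable, could look like the following. All sk_* names, the struct layout and the "nonce 0 means unowned" convention are assumptions made for illustration; none of this is existing dpdk code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SK_NO_OWNER 0 /* hypothetical convention: nonce 0 means "unowned" */

/* Hypothetical per-port ownership state. */
struct sk_port {
	pthread_mutex_t lock;
	pthread_cond_t released; /* signalled whenever the port becomes unowned */
	uint64_t owner;          /* nonce of the current owner, or SK_NO_OWNER */
};

#define SK_PORT_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, SK_NO_OWNER }

/* Non-blocking variant: fails immediately if the port is already owned. */
static int sk_try_claim(struct sk_port *p, uint64_t nonce)
{
	int ret = -1;

	if (nonce == SK_NO_OWNER)
		return -1;
	pthread_mutex_lock(&p->lock);
	if (p->owner == SK_NO_OWNER) {
		p->owner = nonce;
		ret = 0;
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}

/* Blocking variant: waits until the previous owner releases the port. */
static int sk_claim(struct sk_port *p, uint64_t nonce)
{
	if (nonce == SK_NO_OWNER)
		return -1;
	pthread_mutex_lock(&p->lock);
	while (p->owner != SK_NO_OWNER)
		pthread_cond_wait(&p->released, &p->lock);
	p->owner = nonce;
	pthread_mutex_unlock(&p->lock);
	return 0;
}

/* Release: only the entity holding the matching nonce may release. */
static int sk_release(struct sk_port *p, uint64_t nonce)
{
	int ret = -1;

	pthread_mutex_lock(&p->lock);
	if (nonce != SK_NO_OWNER && p->owner == nonce) {
		p->owner = SK_NO_OWNER;
		pthread_cond_broadcast(&p->released);
		ret = 0;
	}
	pthread_mutex_unlock(&p->lock);
	return ret;
}
```

Note that in this sketch sk_claim() called by the entity that already owns the port would wait forever; a real interface would have to decide whether that case is an error or a no-op.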

> - The owner structure includes string to allow better debug and printing for humans. 
I've got no problem with any such internals; it's really the synchronization that I'm after.

> Did you read it?
Yes, I don't see why you would think I hadn't; I think I've been very clear in
my understanding of your initial patch.  Have you taken the time to understand my
concerns?

> I can add there an API that wait until the port ownership is released as you suggested in V2.
> 
I think that would be good.

> > > Each dpdk entity which wants to take ownership must to be able to
> > >synchronize the port configuration in its level.
> 
> > Can you elaborate on what you mean by level here?  Are you envisioning a
> > scheme in which multiple execution contexts might own a port for various
> > non-conflicting purposes?
>  
> Sure,
> 1) Application with 2 threads wanting to configure the same port:
> 	level = application code.
> 	
> 	a. The main thread should create owner identifier(rte_eth_dev_owner_new).
> 	b. The main thread should take the port ownership(rte_eth_dev_owner_set).
> 	c. Synchronization between the two threads should be done for the conflicted 		configurations by the application.
> 	d. when the application finishes the port usage it should release the owner(rte_eth_dev_owner_remove).
> 
> 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for hotplug detections which can configure the 2 ports(by the host thread).
> 	Level = fail-safe code.
> 	a. Application starts the eal and the fail-safe driver probing function is called.
> 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes ownership for them.
> 	c. Failsafe creates itself port and leaves it ownerless. 
> 	d. Failsafe starts the hotplug alarm mechanism.
> 	e. Application tries to take ownership for all ports and success only for failsafe port.
> 	f. Application start to configure the failsafe port asynchronously to failsafe hotplug alarm.
> 	g. Failsafe must use synchronization between failsafe alarm callback code and failsafe configuration APIs called by the application because they both try to configure the same sub-devices ports.
> 	h. When fail-safe finishes with the two sub devices it should release the ports owner.
> 
Ok, this I would describe as different use cases rather than parallel ownership,
in that in both cases there is still a single execution context which is
responsible for all aspects of a given port (which is fine with me, I'm just
trying to be clear in our description of the model).



> > >
> > > >
> > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > That is to say, can one thread remove a device while another
> > configures it.
> > > > >
> > > > > I think it is the same as two threads configuring the same device
> > > > > (item 4/ above). It can be managed with port ownership.
> > > > >
> > > > Only if you assert that application is required to have the owning
> > > > port be responsible for the ports deletion, which we can say, but
> > > > that leads to the issue above again.
> > > >
> > > >
> > > As Thomas said in item 2 the port creation must be synchronized by ethdev
> > and we need to add it there.
> > > I think it is obvious that port removal must to be done only by the port
> > owner.
> > >
> > You say that, but its obvious to you as a developer who has looked
> > extensively at the code.  It may well be less so to a consumer who is not an
> > active member of the community, for instance one who obtains the dpdk via
> > pre-built package.
> >
> 
> Yes I can understand, but new rules should be documented and be adjusted easy easy by the customers, no?
Ostensibly, it should be easy, yes, but in practice it's a bit murkier.  For
instance, what if an application wants to enable packet capture on an interface
via rte_pdump_enable?  Does performing that action require that the execution
context which calls that function own the port before doing so?  Digging through
the code suggests to me that it (surprisingly) does not, because all that
function does is set a socket to record packets to, but I would have
intuitively thought that enabling packet capture would require turning off the
mac filter table in the hardware, and so would have required ownership.
Conversely, using the same example, calling rte_pdump_init, using the model from
your last patch, would require that the calling execution context ensure that,
at the time the cli application issues the monitor request, the port
is unowned, because the pdump main thread needs to set rx_tx callbacks on the
requested port, which I believe constitutes a configuration change needing port
ownership.

My point being, I think saying that ownership is easy and obvious isn't
accurate.  If we are to leave proper synchronization of access to devices up to
the application, we either need to:

1) Assume downstream users are intimately familiar with the code
or
2) Exhaustively document the conditions under which ownership needs to be held

(1) is a non-starter, and (2) I think is a fairly large undertaking, but unless we
are willing to codify synchronization in the code explicitly (via locking), (2)
is what we have to do.

Neil


* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-22 14:26                                       ` Neil Horman
@ 2017-12-23 22:36                                         ` Matan Azrad
  2017-12-29 16:56                                           ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2017-12-23 22:36 UTC (permalink / raw)
  To: Neil Horman
  Cc: Thomas Monjalon, dev, Bruce Richardson, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

Hi 
> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, December 22, 2017 4:27 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Thursday, December 21, 2017 10:14 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > >
> > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > Hi
> > > >
> > <snip>
> > > > > > > > I think we need to clearly describe what is the
> > > > > > > > tread-safety policy in DPDK (especially in ethdev as a first
> example).
> > > > > > > > Let's start with obvious things:
> > > > > > > >
> > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > 			- no planned change because of performance
> > > > > purpose
> > > > > > > > 	2/ The list of devices is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > 			- the configurations are different per-device
> > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > 	5/ The device ownership is racy
> > > > > > > > 			- to be fixed with atomics
> > > > > > > >
> > > > > > > > What am I missing?
> > > > > > > >
> > > >
> > > > Thank you Thomas for this order.
> > > > Actually the port ownership is a good opportunity to redefine the
> > > > synchronization rules in ethdev :)
> > > >
> > > > > > > There is fan out to consider here:
> > > > > > >
> > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > say, can I change ownership of a device safely while another
> > > > > > > thread that currently owns it modifies its configuration?
> > > > > >
> > > > > > If an entity steals ownership to another one, either it is
> > > > > > agreed earlier, or it is done by a central authority.
> > > > > > When it is acked that ownership can be moved, there should not
> > > > > > be any configuration in progress.
> > > > > > So it is more a communication issue than a race.
> > > > > >
> > > > > But if thats the case (specifically that mutual exclusion
> > > > > between port ownership and configuration is an exercize left to
> > > > > an application developer), then port ownership itself is largely
> > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > port needs to be codified within the application anyway.
> > > > >
> > > >
> > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > of a
> > > port successfully can configure the device by default, if other dpdk
> > > entities want to configure it too they must to be synchronized with
> > > the port owner while it is not recommended after the port ownership
> integration.
> > > >
> > > Can you clarify what you mean by "it is not recommended after the
> > > port ownership integration"?
> >
> > Sure,
> > The new defining of ethdev synchronization doesn't recommend to
> manage a port by 2 different dpdk entities, it can be done but not
> recommended.
> >
> Ok, thats just not what you said above.  Your suggestion made it sound like
> you thought that  after the integration of a port ownership model, that
> multiple dpdk entries should not synchronize with one another, which made
> no sense.
> 
Ok, I can see a dual meaning in my sentence, sorry for that, I think we agree here.

> > >  I think there is consensus that the port owner must be the only
> > > entitiy to operate on a port (be that configuration/frame rx/tx, or
> > > some other operation).
> >
> > Your question above caused me to think that you don't understand it, How
> can someone who is not the port owner to change the port owner?
> > Changing the port owner, like port configuration and port release must be
> done by the owner itself except the case that there is no owner to the port.
> > See the API rte_eth_dev_owner_remove.
> >
> See above, your phrasing I don't think accurately reflected what you meant
> to convey. Or at least thats not how I read it
> 
> > > Multithreaded operation on a port always means some level of
> > > synchronization between application threads and the dpdk library,
> > Yes.
> >  >but I'm not sure why that would be different if we introduced a more
> > > concrete notion of port ownership via a new library.
> > >
> >
> > What do you mean by "new library"?, port is an ethdev instance and should
> be managed by ethdev.
> >
> I'm referring to the port ownership api that you proposed.  Apologies, I
> should not have used the term "new library", but rather "new api".
> 
> >  > > So, for example,  if the dpdk entity is an application, the
> > application should
> > >> take ownership of the port and manage the synchronization of this
> > >> port configuration between the application threads and its EAL host
> > >> thread callbacks, no other dpdk entity should configure the same
> > >> port because they should fail when they try to take ownership of the
> same port too.
> >
> > > Well, failing is one good approach, yes, blocking on port
> > > relenquishment could be another.  I'd recommend an API with the
> following interface:
> > >
> > > rte_port_ownership_claim(int port_id) - blocks execution of the
> > > calling thread until the previous owner releases ownership, then
> > > claims it and returns
> > >
> > > rte_port_ownership_release(int port_id) - releases ownership of
> > > port, or returns error if the port was not owned by this execution
> > > context
> > >
> > > rte_port_owernship_try_claim(int port_id) - same as
> > > rte_port_ownership_claim, but fails if the port is already owned.
> > >
> > > That would give the option for both semantics.
> >
> > I think the current APIs are better because of the next reasons:
> > - It defines well who is the owner.
> Theres no reason you can't integrate some ownership nonce to the above
> API as well, thats easy to add.  The relevant part is the ability to exclude
> those who are not owners (that is to say, block their progress until ownership
> is released by a preceding entity).
> 
> > - The owner structure includes string to allow better debug and printing for
> humans.
> I've got no problem with any such internals, its really the synchronization that
> I'm after.
> 
> > Did you read it?
> Yes, I don't see why you would think I hadn't, I think I've been very clear in
> my understanding of you initial patch.  Have you taken the time to
> understand my concerns?
>
OK, it just looked as if you were suggesting new APIs to replace the V1 APIs.

Your concerns are about the races in port ownership management.
I agree with them, but only following Thomas's redefinition of the port synchronization rules.
That is, if port creation becomes race-safe and the new rules are documented, port ownership should be race-safe too.
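
One way race-safe ownership taking could be done with atomics, as item 5/ of Thomas's list suggests, is a compare-and-swap on the owner field. The sketch below uses C11 <stdatomic.h>; the at_* names, the table size and the "0 means ownerless" convention are hypothetical, and this is not necessarily what V2 will implement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define AT_MAX_PORTS 4
#define AT_NO_OWNER  0 /* hypothetical convention: id 0 means "ownerless" */

/* Static storage is zero-initialized, so every port starts ownerless. */
static _Atomic uint64_t at_port_owner[AT_MAX_PORTS];

/*
 * Take ownership only if the port is currently ownerless.
 * The compare-and-swap makes "check then set" a single atomic step,
 * so two racing callers cannot both succeed.
 */
static int at_owner_take(unsigned int port, uint64_t id)
{
	uint64_t expected = AT_NO_OWNER;

	if (port >= AT_MAX_PORTS || id == AT_NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&at_port_owner[port],
					      &expected, id) ? 0 : -1;
}

/* Drop ownership only if the caller is the current owner. */
static int at_owner_drop(unsigned int port, uint64_t id)
{
	uint64_t expected = id;

	if (port >= AT_MAX_PORTS || id == AT_NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&at_port_owner[port],
					      &expected, AT_NO_OWNER) ? 0 : -1;
}
```

This only makes the ownership field itself race-safe; it does not serialize configuration calls made by the owner's own threads, which stays the owner's responsibility.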
 
> > I can add there an API that wait until the port ownership is released as you
> suggested in V2.
> >
> I think that would be good.
> 
> > > > Each dpdk entity which wants to take ownership must to be able to
> > > >synchronize the port configuration in its level.
> >
> > > Can you elaborate on what you mean by level here?  Are you
> > > envisioning a scheme in which multiple execution contexts might own
> > > a port for various non-conflicting purposes?
> >
> > Sure,
> > 1) Application with 2 threads wanting to configure the same port:
> > 	level = application code.
> >
> > 	a. The main thread should create owner
> identifier(rte_eth_dev_owner_new).
> > 	b. The main thread should take the port
> ownership(rte_eth_dev_owner_set).
> > 	c. Synchronization between the two threads should be done for the
> conflicted 		configurations by the application.
> > 	d. when the application finishes the port usage it should release the
> owner(rte_eth_dev_owner_remove).
> >
> > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> hotplug detections which can configure the 2 ports(by the host thread).
> > 	Level = fail-safe code.
> > 	a. Application starts the eal and the fail-safe driver probing function is
> called.
> > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> ownership for them.
> > 	c. Failsafe creates itself port and leaves it ownerless.
> > 	d. Failsafe starts the hotplug alarm mechanism.
> > 	e. Application tries to take ownership for all ports and success only
> for failsafe port.
> > 	f. Application start to configure the failsafe port asynchronously to
> failsafe hotplug alarm.
> > 	g. Failsafe must use synchronization between failsafe alarm callback
> code and failsafe configuration APIs called by the application because they
> both try to configure the same sub-devices ports.
> > 	h. When fail-safe finishes with the two sub devices it should release
> the ports owner.
> >
> Ok, this I would describe as different use cases rather than parallel
> ownership, in that in both cases there is still a single execution context which
> is responsible for all aspects of a given port (which is fine with me, I'm just
> trying to be clear in our description of the model).
> 
Agree.
Can you think of a realistic scenario in which more than one execution entity must manage a port and would have trouble synchronizing against races on that port?
 
> 
> > > >
> > > > >
> > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > That is to say, can one thread remove a device while another
> > > configures it.
> > > > > >
> > > > > > I think it is the same as two threads configuring the same
> > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > >
> > > > > Only if you assert that application is required to have the
> > > > > owning port be responsible for the ports deletion, which we can
> > > > > say, but that leads to the issue above again.
> > > > >
> > > > >
> > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > ethdev
> > > and we need to add it there.
> > > > I think it is obvious that port removal must to be done only by
> > > > the port
> > > owner.
> > > >
> > > You say that, but its obvious to you as a developer who has looked
> > > extensively at the code.  It may well be less so to a consumer who
> > > is not an active member of the community, for instance one who
> > > obtains the dpdk via pre-built package.
> > >
> >
> > Yes I can understand, but new rules should be documented and be
> adjusted easy easy by the customers, no?
> Ostensibly, it should be easy, yes, but in practice its a bit murkier.  For
> instance, What if an application wants to enable packet capture on an
> interface via rte_pdump_enable?  Does preforming that action require that
> the execution context which calls that function own the port before doing
> so?  Digging through the code suggests to me that it (suprisingly) does not,
> because all that function does is set a socket to record packets too, but I
> would have intuitively thought that enabling packet capture would require
> turning off the mac filter table in the hardware, and so would have required
> ownership
> 
> Conversely, using the same example, calling rte_pdump_init, using the
> model from your last patch, would require that the calling execution context
> ensured that , at the time the cli application issued the monnitor request,
> that the port be unowned, because the pdump main thread needs to set
> rx_tx callbacks on the requested port, which I belive constitutes a
> configuration change needing port ownership.
> 
> My point being, I think saying that ownership is easy and obvious isn't
> accurate.

Agreed. As a rule of thumb, all port-related APIs should require taking ownership, but it will take time to learn which ones do not need it.

>  If we are to leave proper synchrnization of access to devices up to
> the application, we either need to:
> 
> 1) Assume downstream users are intimately familiar with the code or
> 2) Exhaustively document the conditions under which ownership needs to be
> held
> 
> (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> are willing to codify synchronization in the code explicitly (via locking), (2) is
> what we have to do.
> 
Agree.
Maybe it would be good to document, in the .h files, whether each relevant API requires taking ownership. What do you think?

> Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-23 22:36                                         ` Matan Azrad
@ 2017-12-29 16:56                                           ` Neil Horman
  0 siblings, 0 replies; 214+ messages in thread
From: Neil Horman @ 2017-12-29 16:56 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, dev, Bruce Richardson, Ananyev, Konstantin,
	Gaëtan Rivet, Wu, Jingjing

On Sat, Dec 23, 2017 at 10:36:34PM +0000, Matan Azrad wrote:
> Hi 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, December 22, 2017 4:27 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > 
> > On Thu, Dec 21, 2017 at 09:57:43PM +0000, Matan Azrad wrote:
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Thursday, December 21, 2017 10:14 PM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; dev@dpdk.org; Bruce
> > > > Richardson <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> > > >
> > > > On Thu, Dec 21, 2017 at 07:37:06PM +0000, Matan Azrad wrote:
> > > > > Hi
> > > > >
> > > <snip>
> > > > > > > > > I think we need to clearly describe what is the
> > > > > > > > > tread-safety policy in DPDK (especially in ethdev as a first
> > example).
> > > > > > > > > Let's start with obvious things:
> > > > > > > > >
> > > > > > > > > 	1/ A queue is not protected for races with multiple Rx or Tx
> > > > > > > > > 			- no planned change because of performance
> > > > > > purpose
> > > > > > > > > 	2/ The list of devices is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > > 	3/ The configuration of different devices is thread-safe
> > > > > > > > > 			- the configurations are different per-device
> > > > > > > > > 	4/ The configuration of a given device is racy
> > > > > > > > > 			- can be managed by the owner of the device
> > > > > > > > > 	5/ The device ownership is racy
> > > > > > > > > 			- to be fixed with atomics
> > > > > > > > >
> > > > > > > > > What am I missing?
> > > > > > > > >
> > > > >
> > > > > Thank you Thomas for this order.
> > > > > Actually the port ownership is a good opportunity to redefine the
> > > > > synchronization rules in ethdev :)
> > > > >
> > > > > > > > There is fan out to consider here:
> > > > > > > >
> > > > > > > > 1) Is device configuration racy with ownership?  That is to
> > > > > > > > say, can I change ownership of a device safely while another
> > > > > > > > thread that currently owns it modifies its configuration?
> > > > > > >
> > > > > > > If an entity steals ownership to another one, either it is
> > > > > > > agreed earlier, or it is done by a central authority.
> > > > > > > When it is acked that ownership can be moved, there should not
> > > > > > > be any configuration in progress.
> > > > > > > So it is more a communication issue than a race.
> > > > > > >
> > > > > > But if thats the case (specifically that mutual exclusion
> > > > > > between port ownership and configuration is an exercize left to
> > > > > > an application developer), then port ownership itself is largely
> > > > > > meaningless within the dpdk, because the notion of who owns the
> > > > > > port needs to be codified within the application anyway.
> > > > > >
> > > > >
> > > > > Bruce, As I understand it, only the dpdk entity who took ownership
> > > > > of a
> > > > port successfully can configure the device by default, if other dpdk
> > > > entities want to configure it too they must to be synchronized with
> > > > the port owner while it is not recommended after the port ownership
> > integration.
> > > > >
> > > > Can you clarify what you mean by "it is not recommended after the
> > > > port ownership integration"?
> > >
> > > Sure,
> > > The new defining of ethdev synchronization doesn't recommend to
> > manage a port by 2 different dpdk entities, it can be done but not
> > recommended.
> > >
> > Ok, thats just not what you said above.  Your suggestion made it sound like
> > you thought that  after the integration of a port ownership model, that
> > multiple dpdk entries should not synchronize with one another, which made
> > no sense.
> > 
> Ok, I can see a dual meaning in my sentence, sorry for that, I think we agree here.
> 
No need to apologize, just trying to make sure we're on the same page, and yes,
I agree that we are in consensus here.

> > >
> > > > Can you elaborate on what you mean by level here?  Are you
> > > > envisioning a scheme in which multiple execution contexts might own
> > > > a port for various non-conflicting purposes?
> > >
> > > Sure,
> > > 1) Application with 2 threads wanting to configure the same port:
> > > 	level = application code.
> > >
> > > 	a. The main thread should create owner
> > identifier(rte_eth_dev_owner_new).
> > > 	b. The main thread should take the port
> > ownership(rte_eth_dev_owner_set).
> > > 	c. Synchronization between the two threads should be done for the
> > conflicted 		configurations by the application.
> > > 	d. when the application finishes the port usage it should release the
> > owner(rte_eth_dev_owner_remove).
> > >
> > > 2) Fail-safe PMD manages 2 sub-devices (2 ports) and uses alarm for
> > hotplug detections which can configure the 2 ports(by the host thread).
> > > 	Level = fail-safe code.
> > > 	a. Application starts the eal and the fail-safe driver probing function is
> > called.
> > > 	b. Fail-safe probes the 2 sub-devices(2 ports are created) and takes
> > ownership for them.
> > > 	c. Failsafe creates itself port and leaves it ownerless.
> > > 	d. Failsafe starts the hotplug alarm mechanism.
> > > 	e. Application tries to take ownership for all ports and success only
> > for failsafe port.
> > > 	f. Application start to configure the failsafe port asynchronously to
> > failsafe hotplug alarm.
> > > 	g. Failsafe must use synchronization between failsafe alarm callback
> > code and failsafe configuration APIs called by the application because they
> > both try to configure the same sub-devices ports.
> > > 	h. When fail-safe finishes with the two sub devices it should release
> > the ports owner.
> > >
> > Ok, this I would describe as different use cases rather than parallel
> > ownership, in that in both cases there is still a single execution context which
> > is responsible for all aspects of a given port (which is fine with me, I'm just
> > trying to be clear in our description of the model).
> > 
> Agree.
> Can you find a realistic scenario that a non-single execution entity must to manage a port and have problems with the port races synchronization management? 
No, nor do I think one exists, just trying to make sure that you weren't trying
to allow for that, as it would be very difficult to maintain, I think.

>  
> > 
> > > > >
> > > > > >
> > > > > > > > 2) Is device configuration racy with device addition/removal?
> > > > > > > > That is to say, can one thread remove a device while another
> > > > configures it.
> > > > > > >
> > > > > > > I think it is the same as two threads configuring the same
> > > > > > > device (item 4/ above). It can be managed with port ownership.
> > > > > > >
> > > > > > Only if you assert that application is required to have the
> > > > > > owning port be responsible for the ports deletion, which we can
> > > > > > say, but that leads to the issue above again.
> > > > > >
> > > > > >
> > > > > As Thomas said in item 2 the port creation must be synchronized by
> > > > > ethdev
> > > > and we need to add it there.
> > > > > I think it is obvious that port removal must to be done only by
> > > > > the port
> > > > owner.
> > > > >
> > > > You say that, but its obvious to you as a developer who has looked
> > > > extensively at the code.  It may well be less so to a consumer who
> > > > is not an active member of the community, for instance one who
> > > > obtains the dpdk via pre-built package.
> > > >
> > >
> > > Yes I can understand, but new rules should be documented and be
> > adjusted easy easy by the customers, no?
> > Ostensibly, it should be easy, yes, but in practice its a bit murkier.  For
> > instance, What if an application wants to enable packet capture on an
> > interface via rte_pdump_enable?  Does preforming that action require that
> > the execution context which calls that function own the port before doing
> > so?  Digging through the code suggests to me that it (suprisingly) does not,
> > because all that function does is set a socket to record packets too, but I
> > would have intuitively thought that enabling packet capture would require
> > turning off the mac filter table in the hardware, and so would have required
> > ownership
> > 
> > Conversely, using the same example, calling rte_pdump_init, using the
> > model from your last patch, would require that the calling execution context
> > ensured that , at the time the cli application issued the monnitor request,
> > that the port be unowned, because the pdump main thread needs to set
> > rx_tx callbacks on the requested port, which I belive constitutes a
> > configuration change needing port ownership.
> > 
> > My point being, I think saying that ownership is easy and obvious isn't
> > accurate.
> 
> Agree, as a finger rule all the port relation APIs should require ownership taking, but it will take time to learn when we don't need to take ownership.
It likely will, I agree. This is the difficulty in maintaining mutual exclusion
outside of the library: it's (potentially) an ever-changing model, for which
documentation needs to keep up, and I'm not sure if we will ever get there
(hence my constant bemoaning of the desire to codify mutual exclusion within
the library). I wonder if it would be worthwhile to explore a lock registration
mechanism in which sole access is implemented outside the library, but
communicated within the library to avoid this (i.e. a model in which the
application manipulates a mutual exclusion condition, but that can be checked by
the library to ensure proper usage).
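
To make the lock registration idea concrete, here is a rough stand-alone
sketch. It is not DPDK code: every name in it (port_guard_register,
port_set_mtu, app_flag_is_set) is hypothetical, and a plain flag stands in for
a real mutex. The application registers a predicate, and the library-side
configuration call refuses to run when the predicate reports that the lock is
not held.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PORTS 4 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Application-supplied predicate: "is the port's lock currently held?" */
typedef bool (*port_lock_held_fn)(void *ctx);

struct port_guard {
	port_lock_held_fn held; /* NULL means no guard registered */
	void *ctx;
};

static struct port_guard guards[MAX_PORTS];
static int port_mtu[MAX_PORTS];

/* Library side: remember how to check the application's lock. */
static int port_guard_register(unsigned port, port_lock_held_fn held, void *ctx)
{
	if (port >= MAX_PORTS || held == NULL)
		return -EINVAL;
	guards[port].held = held;
	guards[port].ctx = ctx;
	return 0;
}

/* Library side: a configuration call that rejects unguarded access. */
static int port_set_mtu(unsigned port, int mtu)
{
	if (port >= MAX_PORTS)
		return -ENODEV;
	if (guards[port].held != NULL && !guards[port].held(guards[port].ctx))
		return -EPERM;
	port_mtu[port] = mtu;
	return 0;
}

/* Application side: a trivial flag stands in for a real mutex. */
static bool app_flag_is_set(void *ctx)
{
	return *(bool *)ctx;
}
```

Note that a predicate like this only tells the library that the lock is held
by someone, not that the caller is the holder; a real design would probably
need owner tracking on top of it.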

> 
> >  If we are to leave proper synchrnization of access to devices up to
> > the application, we either need to:
> > 
> > 1) Assume downstream users are intimately familiar with the code or
> > 2) Exhaustively document the conditions under which ownership needs to be
> > held
> > 
> > (1) is a non starter, and 2 I think is a fairly large undertaking, but unless we
> > are willing to codify synchronization in the code explicitly (via locking), (2) is
> > what we have to do.
> > 
> Agree.
> Maybe it will be good to document each relevant API if it requires ownership taking or not in .h files, what do you think?  
> 
If you think that's a surmountable task, yes.  I think it's necessary if we are
to expect applications to do any sort of real locking (beyond just guaranteeing
one task is executing in the dpdk at any one time).

Best
Neil

> > Neil
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v2 0/6] ethdev: port ownership
  2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
                   ` (4 preceding siblings ...)
  2017-11-28 11:58 ` [dpdk-dev] [PATCH 5/5] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-07  9:45 ` Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 1/6] ethdev: fix port data reset timing Matan Azrad
                     ` (6 more replies)
  5 siblings, 7 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Add ownership mechanism to DPDK Ethernet devices to avoid multiple management of a device by different DPDK entities.

V2:  
Synchronize ethdev port creation.
Synchronize port ownership mechanism.
Rename owner remove API to rte_eth_dev_owner_unset.
Remove "ethdev: free a port by a dedicated API" patch - passed to another series.
Add "ethdev: fix port data reset timing" patch.
Change owner get API to return an int value and to pass a copy of the owner structure.
Adjust testpmd to the improved owner get API.
Adjust documentation.


Points:
1. rte_eth_dev_owner_claim API was suggested as a blocking API to set an owner; I didn't find a use case for it.
2. Adding comments to all APIs that require ownership to be set before they are called has not been done yet; other suggestions?


Matan Azrad (6):
  ethdev: fix port data reset timing
  ethdev: add port ownership
  ethdev: synchronize port allocation
  net/failsafe: free an eth port by a dedicated API
  net/failsafe: use ownership mechanism to own ports
  app/testpmd: adjust ethdev port ownership

 app/test-pmd/cmdline.c                  |  88 +++++-------
 app/test-pmd/cmdline_flow.c             |   2 +-
 app/test-pmd/config.c                   |  37 ++---
 app/test-pmd/parameters.c               |   4 +-
 app/test-pmd/testpmd.c                  |  63 ++++++---
 app/test-pmd/testpmd.h                  |   3 +
 config/common_base                      |   4 +-
 doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
 drivers/net/failsafe/failsafe.c         |   7 +
 drivers/net/failsafe/failsafe_eal.c     |  10 ++
 drivers/net/failsafe/failsafe_ether.c   |   2 +-
 drivers/net/failsafe/failsafe_private.h |   2 +
 lib/librte_ether/rte_ethdev.c           | 243 +++++++++++++++++++++++++++++---
 lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  12 ++
 15 files changed, 462 insertions(+), 118 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v2 1/6] ethdev: fix port data reset timing
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership Matan Azrad
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

The rte_eth_dev_data structure is allocated per ethdev port and is
used internally to access the port's data.

rte_eth_dev_attach_secondary tries to find the port identifier by
comparing the rte_eth_dev_data name field, and may get the identifier of
an invalid port if that port was released by the primary process,
because the port release API doesn't reset the port data.

It is therefore better to reset the port data at release time instead of
at allocation time.

Move the port data reset to the port release API.

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index d1385df..684e3e8 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -233,7 +233,6 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
 	eth_dev = eth_dev_get(port_id);
 	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
 	eth_dev->data->port_id = port_id;
@@ -279,6 +278,7 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 	return 0;
 }
-- 
1.8.3.1
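
The stale-name problem this patch addresses can be illustrated with a small
stand-alone model. This is only a sketch under simplifying assumptions, not
the ethdev code: shared_data stands in for the rte_eth_dev_data array in
shared memory, and all function names are hypothetical. With the old release
behavior, a by-name lookup still finds the released slot; with the reset moved
to release time, it does not.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4 /* stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN  32

/* Simplified stand-in for struct rte_eth_dev_data (shared memory). */
struct port_data {
	char name[NAME_LEN];
};

static struct port_data shared_data[MAX_PORTS];

/* Primary process: take a slot and record the port name in it. */
static void port_allocate(int port_id, const char *name)
{
	snprintf(shared_data[port_id].name, NAME_LEN, "%s", name);
}

/* Old behavior: release leaves the stale name in the slot. */
static void port_release_old(int port_id)
{
	(void)port_id; /* data was only cleared on the next allocation */
}

/* Patched behavior: release clears the slot immediately. */
static void port_release_new(int port_id)
{
	memset(&shared_data[port_id], 0, sizeof(shared_data[port_id]));
}

/* Secondary process: look a port up by name; -1 if not found. */
static int port_attach_secondary(const char *name)
{
	for (int i = 0; i < MAX_PORTS; i++)
		if (strcmp(shared_data[i].name, name) == 0)
			return i;
	return -1;
}
```

Allocating "net_a" on port 1 and releasing it with port_release_old still
lets port_attach_secondary("net_a") return 1, while releasing it with
port_release_new makes the same lookup return -1.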

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 1/6] ethdev: fix port data reset timing Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-10 13:36     ` Ananyev, Konstantin
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation Matan Azrad
                     ` (4 subsequent siblings)
  6 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

The ownership of a port is implicit in DPDK.
Making it explicit is better for the following reasons:
1. It will define well who is in charge of the port usage synchronization.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also in the fail-safe case, an issue has been met in testpmd.
We need to check that the application is not trying to use a port which
is already managed by fail-safe.

A port owner is built from an owner id (number) and an owner name (string).
The owner id must be unique to distinguish between two identical entity
instances, while the owner name can be any name.
The name helps different DPDK entities to recognize the owner logically
and eases debugging.
Each DPDK entity can allocate a unique owner identifier and can use it
with its preferred name to own valid ethdev ports.
Each DPDK entity can query any port's owner status to decide whether it can
manage the port or not.

The mechanism is synchronized for both primary process threads and
secondary process threads, to allow a secondary process entity to be
a port owner.

Add a synchronized ownership mechanism to DPDK Ethernet devices to
avoid multiple management of a device by different DPDK entities.

The current ethdev internal port management is not affected by this
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
 lib/librte_ether/rte_ethdev.c           | 206 ++++++++++++++++++++++++++++++--
 lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
 lib/librte_ether/rte_ethdev_version.map |  12 ++
 4 files changed, 311 insertions(+), 10 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index 6a0c9f9..046cde7 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,8 +156,8 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
----------------------------------------
+Device Identification, Ownership and Configuration
+--------------------------------------------------
 
 Device Identification
 ~~~~~~~~~~~~~~~~~~~~~
@@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
+The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
+Allowing this should prevent any multiple management of Ethernet port by different entities.
+
+.. note::
+
+    It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 684e3e8..0e12452 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -70,7 +70,10 @@
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
+/* ports data array stored in shared memory */
 static struct rte_eth_dev_data *rte_eth_dev_data;
+/* next owner identifier stored in shared memory */
+static uint16_t *rte_eth_next_owner_id;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -82,6 +85,9 @@
 /* spinlock for add/remove tx callbacks */
 static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for eth device ownership management stored in shared memory */
+static rte_spinlock_t *rte_eth_dev_ownership_lock;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -153,14 +159,18 @@ enum {
 }
 
 static void
-rte_eth_dev_data_alloc(void)
+rte_eth_dev_share_data_alloc(void)
 {
 	const unsigned flags = 0;
 	const struct rte_memzone *mz;
+	const unsigned int data_size = RTE_MAX_ETHPORTS *
+						sizeof(*rte_eth_dev_data);
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* Allocate shared memory for port data and ownership */
 		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
+				data_size + sizeof(*rte_eth_next_owner_id) +
+				sizeof(*rte_eth_dev_ownership_lock),
 				rte_socket_id(), flags);
 	} else
 		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
@@ -168,9 +178,17 @@ enum {
 		rte_panic("Cannot allocate memzone for ethernet port data\n");
 
 	rte_eth_dev_data = mz->addr;
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		memset(rte_eth_dev_data, 0,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
+	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
+					     data_size);
+	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
+		((uintptr_t)rte_eth_next_owner_id +
+		 sizeof(*rte_eth_next_owner_id));
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		memset(rte_eth_dev_data, 0, data_size);
+		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
+		rte_spinlock_init(rte_eth_dev_ownership_lock);
+	}
 }
 
 struct rte_eth_dev *
@@ -225,7 +243,7 @@ struct rte_eth_dev *
 	}
 
 	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+		rte_eth_dev_share_data_alloc();
 
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
@@ -253,7 +271,7 @@ struct rte_eth_dev *
 	struct rte_eth_dev *eth_dev;
 
 	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+		rte_eth_dev_share_data_alloc();
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
@@ -278,8 +296,12 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
-	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
 	return 0;
 }
 
@@ -294,6 +316,174 @@ struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint16_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
+	     *rte_eth_next_owner_id <= owner_id)) {
+		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint16_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint16_t *owner_id)
+{
+	int ret = 0;
+
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
+	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
+		/* Counter wrap around. */
+		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
+		ret = -EUSERS;
+	} else {
+		*owner_id = (*rte_eth_next_owner_id)++;
+	}
+
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
+	return ret;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int ret = 0;
+	int sret;
+
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+		goto unlock;
+	}
+
+	if (!rte_eth_is_valid_owner_id(owner->id)) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
+	    port_owner->id != owner->id) {
+		RTE_LOG(ERR, EAL,
+			"Cannot set owner to port %d already owned by %s_%05d.\n",
+			port_id, port_owner->name, port_owner->id);
+		ret = -EPERM;
+		goto unlock;
+	}
+
+	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+			owner->name);
+	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
+		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
+		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	port_owner->id = owner->id;
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
+			    owner->name, owner->id);
+
+unlock:
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
+	return ret;
+}
+
+int
+rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t owner_id)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int ret = 0;
+
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+		goto unlock;
+	}
+
+	if (!rte_eth_is_valid_owner_id(owner_id)) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != owner_id) {
+		RTE_LOG(ERR, EAL, "Cannot unset port %d owner (%s_%05d) by"
+			" a different owner with id %5d.\n", port_id,
+			port_owner->name, port_owner->id, owner_id);
+		ret = -EPERM;
+		goto unlock;
+	}
+	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
+			    port_owner->name, port_owner->id);
+
+	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
+
+unlock:
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
+	return ret;
+}
+
+void
+rte_eth_dev_owner_delete(const uint16_t owner_id)
+{
+	uint16_t port_id;
+
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
+	if (rte_eth_is_valid_owner_id(owner_id)) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
+			memset(&rte_eth_devices[port_id].data->owner, 0,
+			       sizeof(struct rte_eth_dev_owner));
+		RTE_PMD_DEBUG_TRACE("All port owners owned by %05d identifier"
+				    " have removed.\n", owner_id);
+	}
+
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
+}
+
+int
+rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
+{
+	int ret = 0;
+
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+	} else {
+		rte_memcpy(owner, &rte_eth_devices[port_id].data->owner,
+			   sizeof(*owner));
+	}
+
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
+	return ret;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 57b61ed..88ad765 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint16_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /**
  * @internal
  * The data part, with no function pointers, associated with each ethernet device.
@@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /** Device supports link state interrupt */
@@ -1846,6 +1856,85 @@ struct rte_eth_dev_data {
 
 
 /**
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
+ * Get a new unique owner identifier.
+ * An owner identifier is used to owns Ethernet devices by only one DPDK entity
+ * to avoid multiple management of device by different entities.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_new(uint16_t *owner_id);
+
+/**
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * Unset Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t owner_id);
+
+/**
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint16_t owner_id);
+
+/**
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @param	owner
+ *  The owner structure pointer to fill.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_get(const uint16_t port_id,
+			  struct rte_eth_dev_owner *owner);
+
+/**
  * Get the total number of Ethernet devices that have been successfully
  * initialized by the matching Ethernet driver during the PCI probing phase
  * and that are available for applications to use. These devices must be
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..5d20b5f 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -198,6 +198,18 @@ DPDK_17.11 {
 
 } DPDK_17.08;
 
+DPDK_18.02 {
+	global:
+
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_unset;
+	rte_eth_find_next_owned_by;
+
+} DPDK_17.11;
+
 EXPERIMENTAL {
 	global:
 
-- 
1.8.3.1


* [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 1/6] ethdev: fix port data reset timing Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-07  9:58     ` Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 4/6] net/failsafe: free an eth port by a dedicated API Matan Azrad
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Ethernet port allocation was not thread safe: two threads trying to
allocate a new port at the same time might get an identical port
identifier and cause a memory overwrite.
In fact, none of the port configuration was thread safe from the ethdev
point of view.

The port ownership mechanism added to ethdev is a good opportunity to
redefine the synchronization rules in ethdev:

1. The port allocation and port release synchronization will be
   managed by ethdev.
2. The port usage synchronization will be managed by the port owner.
3. The port ownership synchronization will be managed by ethdev.

Add port allocation synchronization to complete the new rules.
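
The allocation rule above can be sketched outside of DPDK as follows.
This is only an illustration under stated assumptions: every sketch_*
name is hypothetical, a C11 atomic_flag stands in for rte_spinlock_t,
and the port table is a plain static array rather than shared memory.
It shows the free-slot search, the duplicate-name check and the update
all running under one lock, with a single unlock path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define SKETCH_NAME_LEN  32  /* hypothetical name buffer size */

struct sketch_port {
	int used;
	uint16_t port_id;
	char name[SKETCH_NAME_LEN];
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];
/* Stand-in for the spinlock that guards port allocation. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

static void sketch_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void sketch_lock_release(void)
{
	atomic_flag_clear_explicit(&sketch_lock, memory_order_release);
}

/* Return the new port, or NULL if the table is full or the name is taken. */
struct sketch_port *sketch_port_allocate(const char *name)
{
	struct sketch_port *port = NULL;
	uint16_t i, free_id = SKETCH_MAX_PORTS;

	sketch_lock_acquire();

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (!sketch_ports[i].used) {
			if (free_id == SKETCH_MAX_PORTS)
				free_id = i; /* remember the first free slot */
		} else if (strcmp(sketch_ports[i].name, name) == 0) {
			goto unlock; /* name already allocated */
		}
	}
	if (free_id == SKETCH_MAX_PORTS)
		goto unlock; /* no free slot left */

	port = &sketch_ports[free_id];
	port->used = 1;
	port->port_id = free_id;
	snprintf(port->name, sizeof(port->name), "%s", name);
	assert(port->port_id < SKETCH_MAX_PORTS);

unlock:
	sketch_lock_release();
	return port;
}
```

Because the search and the update happen inside the same critical
section, two concurrent callers cannot be handed the same slot, which
is the overwrite scenario described above.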

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 config/common_base            |  4 ++--
 lib/librte_ether/rte_ethdev.c | 39 ++++++++++++++++++++++++++++-----------
 2 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/config/common_base b/config/common_base
index b8ee8f9..980ae3b 100644
--- a/config/common_base
+++ b/config/common_base
@@ -94,7 +94,7 @@ CONFIG_RTE_MAX_MEMSEG=256
 CONFIG_RTE_MAX_MEMZONE=2560
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_ENABLE_ASSERT=n
-CONFIG_RTE_LOG_LEVEL=RTE_LOG_INFO
+CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG
 CONFIG_RTE_LOG_DP_LEVEL=RTE_LOG_INFO
 CONFIG_RTE_LOG_HISTORY=256
 CONFIG_RTE_BACKTRACE=y
@@ -136,7 +136,7 @@ CONFIG_RTE_LIBRTE_KVARGS=y
 # Compile generic ethernet library
 #
 CONFIG_RTE_LIBRTE_ETHER=y
-CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
+CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=y
 CONFIG_RTE_MAX_ETHPORTS=32
 CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
 CONFIG_RTE_LIBRTE_IEEE1588=n
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0e12452..d30d61f 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -85,6 +85,9 @@
 /* spinlock for add/remove tx callbacks */
 static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for shared data allocation */
+static rte_spinlock_t rte_eth_share_data_alloc = RTE_SPINLOCK_INITIALIZER;
+
 /* spinlock for eth device ownership management stored in shared memory */
 static rte_spinlock_t *rte_eth_dev_ownership_lock;
 
@@ -234,21 +237,27 @@ struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
 	uint16_t port_id;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	/* Synchronize share data one time allocation between local threads. */
+	rte_spinlock_lock(&rte_eth_share_data_alloc);
+	if (rte_eth_dev_data == NULL)
+		rte_eth_dev_share_data_alloc();
+	rte_spinlock_unlock(&rte_eth_share_data_alloc);
+
+	/* Synchronize port creation between primary and secondary threads. */
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
 
 	port_id = rte_eth_dev_find_free_port();
 	if (port_id == RTE_MAX_ETHPORTS) {
 		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
-		return NULL;
+		goto unlock;
 	}
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_share_data_alloc();
-
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
 				name);
-		return NULL;
+		goto unlock;
 	}
 
 	eth_dev = eth_dev_get(port_id);
@@ -256,6 +265,8 @@ struct rte_eth_dev *
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = ETHER_MTU;
 
+unlock:
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
 	return eth_dev;
 }
 
@@ -268,10 +279,16 @@ struct rte_eth_dev *
 rte_eth_dev_attach_secondary(const char *name)
 {
 	uint16_t i;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
 
+	/* Synchronize share data one time attachment between local threads. */
+	rte_spinlock_lock(&rte_eth_share_data_alloc);
 	if (rte_eth_dev_data == NULL)
 		rte_eth_dev_share_data_alloc();
+	rte_spinlock_unlock(&rte_eth_share_data_alloc);
+
+	/* Synchronize port attachment to primary port creation and release. */
+	rte_spinlock_lock(rte_eth_dev_ownership_lock);
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
@@ -281,12 +298,12 @@ struct rte_eth_dev *
 		RTE_PMD_DEBUG_TRACE(
 			"device %s is not driven by the primary process\n",
 			name);
-		return NULL;
+	} else {
+		eth_dev = eth_dev_get(i);
+		RTE_ASSERT(eth_dev->data->port_id == i);
 	}
 
-	eth_dev = eth_dev_get(i);
-	RTE_ASSERT(eth_dev->data->port_id == i);
-
+	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
 	return eth_dev;
 }
 
-- 
1.8.3.1


* [dpdk-dev] [PATCH v2 4/6] net/failsafe: free an eth port by a dedicated API
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
                     ` (2 preceding siblings ...)
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports Matan Azrad
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Call the dedicated ethdev API to free a port at remove time, as is
already done elsewhere in the fail-safe PMD.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 21392e5..f72f44f 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -297,7 +297,7 @@
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {
-			ETH(sdev)->state = RTE_ETH_DEV_UNUSED;
+			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
1.8.3.1


* [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
                     ` (3 preceding siblings ...)
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 4/6] net/failsafe: free an eth port by a dedicated API Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-08 10:32     ` Gaëtan Rivet
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership Matan Azrad
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
  6 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Fail-safe PMD sub-device management is based on the ethdev port mechanism.
As a result, the sub-device management structures are exposed to other
DPDK entities, which may use them in parallel with the fail-safe PMD.

Use the new port ownership mechanism to prevent fail-safe PMD
sub-devices from being managed by more than one entity.
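
As a rough illustration of the intended usage, the sketch below models
the owner table in plain C. It is not the ethdev implementation: all
own_* names are hypothetical, the model is single-threaded (the real
ownership data is protected by a lock), and returning -EPERM when a
port already belongs to someone else is an assumption made for this
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define OWN_MAX_PORTS 8  /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define OWN_NO_OWNER  0  /* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */
#define OWN_NAME_LEN  64

struct own_owner {
	uint16_t id;
	char name[OWN_NAME_LEN];
};

/* Zero-initialized, so every port starts ownerless. */
static struct own_owner own_port_owner[OWN_MAX_PORTS];
static uint16_t own_next_id = OWN_NO_OWNER + 1;

/* Hand out a fresh, never reused owner identifier. */
int own_owner_new(uint16_t *owner_id)
{
	if (owner_id == NULL)
		return -EINVAL;
	if (own_next_id == UINT16_MAX)
		return -ENOSPC;
	*owner_id = own_next_id++;
	return 0;
}

/* Own a port; fail if another owner already holds it. */
int own_owner_set(uint16_t port_id, const struct own_owner *owner)
{
	if (port_id >= OWN_MAX_PORTS || owner == NULL ||
	    owner->id == OWN_NO_OWNER)
		return -EINVAL;
	if (own_port_owner[port_id].id != OWN_NO_OWNER &&
	    own_port_owner[port_id].id != owner->id)
		return -EPERM; /* assumption: error code chosen for the sketch */
	own_port_owner[port_id] = *owner;
	return 0;
}

/* Claim a list of sub-device ports, stopping at the first failure. */
int own_claim_ports(const uint16_t *ports, size_t n,
		    const struct own_owner *owner)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = own_owner_set(ports[i], owner);

		if (ret != 0) {
			/* ret is a negative errno, so negate it for strerror */
			fprintf(stderr, "port %u owner set failed (%s)\n",
				(unsigned int)ports[i], strerror(-ret));
			return ret;
		}
	}
	return 0;
}
```

A driver would request one identifier at probe time and then claim
each sub-device port with it; any later attempt by another entity to
own the same port is rejected.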

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 drivers/net/failsafe/failsafe.c         |  7 +++++++
 drivers/net/failsafe/failsafe_eal.c     | 10 ++++++++++
 drivers/net/failsafe/failsafe_private.h |  2 ++
 3 files changed, 19 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 6bc5aba..d413c20 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -191,6 +191,13 @@
 	ret = failsafe_args_parse(dev, params);
 	if (ret)
 		goto free_subs;
+	ret = rte_eth_dev_owner_new(&priv->my_owner.id);
+	if (ret) {
+		ERROR("Failed to get unique owner identifier");
+		goto free_args;
+	}
+	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
+		 FAILSAFE_OWNER_NAME);
 	ret = failsafe_eal_init(dev);
 	if (ret)
 		goto free_args;
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 19d26f5..b4628fb 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -69,6 +69,16 @@
 			ERROR("sub_device %d init went wrong", i);
 			return -ENODEV;
 		}
+		ret = rte_eth_dev_owner_set(j, &PRIV(dev)->my_owner);
+		if (ret) {
+			/*
+			 * It is unexpected for a fail-safe sub-device
+			 * to be owned by another DPDK entity.
+			 */
+			ERROR("sub_device %d owner set failed (%s)", i,
+			      strerror(-ret));
+			return ret;
+		}
 		SUB_ID(sdev) = i;
 		sdev->fs_dev = dev;
 		sdev->dev = ETH(sdev)->device;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index d81cc3c..9936875 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -42,6 +42,7 @@
 #include <rte_devargs.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
+#define FAILSAFE_OWNER_NAME "Fail-safe"
 
 #define PMD_FAILSAFE_MAC_KVARG "mac"
 #define PMD_FAILSAFE_HOTPLUG_POLL_KVARG "hotplug_poll"
@@ -139,6 +140,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_eth_dev_owner my_owner; /* Unique owner. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
-- 
1.8.3.1


* [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
                     ` (4 preceding siblings ...)
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2018-01-07  9:45   ` Matan Azrad
  2018-01-08 11:39     ` Gaëtan Rivet
  2018-01-16  5:53     ` Lu, Wenzhuo
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
  6 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:45 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Testpmd should not use ethdev ports which are managed by other DPDK
entities.

Make Testpmd the owner of each port which is not used by another entity,
and prevent any use of ethdev ports which are not owned by Testpmd.
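
The two steps above (claim every ownerless port at startup, then only
walk ports owned by the application) can be sketched as below. This is
a toy model, not the testpmd code: the demo_* names and the two static
arrays are hypothetical, and only the shape of the iterator and of the
foreach macro follows the ethdev patch in this series.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 8  /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define DEMO_NO_OWNER  0  /* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */

static int demo_valid[DEMO_MAX_PORTS];      /* 1 if the port was probed */
static uint16_t demo_owner[DEMO_MAX_PORTS]; /* owner id of each port */

/* Next valid port >= port_id owned by owner_id, else DEMO_MAX_PORTS. */
uint16_t demo_find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < DEMO_MAX_PORTS &&
	       (!demo_valid[port_id] || demo_owner[port_id] != owner_id))
		port_id++;
	return port_id < DEMO_MAX_PORTS ? port_id : DEMO_MAX_PORTS;
}

#define DEMO_FOREACH_OWNED_BY(p, o) \
	for (p = demo_find_next_owned_by(0, o); \
	     (unsigned int)p < (unsigned int)DEMO_MAX_PORTS; \
	     p = demo_find_next_owned_by(p + 1, o))

/* Mark a port as probed, with the given initial owner. */
void demo_probe(uint16_t port_id, uint16_t owner_id)
{
	assert(port_id < DEMO_MAX_PORTS);
	demo_valid[port_id] = 1;
	demo_owner[port_id] = owner_id;
}

/* Take every ownerless port; return how many ports were taken. */
uint16_t demo_claim_ownerless(uint16_t my_id)
{
	uint16_t p, n = 0;

	DEMO_FOREACH_OWNED_BY(p, DEMO_NO_OWNER) {
		demo_owner[p] = my_id;
		n++;
	}
	return n;
}

/* Ownership-aware replacement for a plain "is this port valid" check. */
int demo_port_is_mine(uint16_t port_id, uint16_t my_id)
{
	return port_id < DEMO_MAX_PORTS && demo_valid[port_id] &&
	       demo_owner[port_id] == my_id;
}
```

Counting the claimed ports, instead of asking how many devices exist,
keeps the application's port count limited to ports it actually owns.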

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 app/test-pmd/cmdline.c      | 88 +++++++++++++++++++--------------------------
 app/test-pmd/cmdline_flow.c |  2 +-
 app/test-pmd/config.c       | 37 +++++++++----------
 app/test-pmd/parameters.c   |  4 +--
 app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
 app/test-pmd/testpmd.h      |  3 ++
 6 files changed, 102 insertions(+), 95 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f71d963..0731982 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1357,7 +1357,7 @@ struct cmd_config_speed_all {
 			&link_speed) < 0)
 		return;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		ports[pid].dev_conf.link_speeds = link_speed;
 	}
 
@@ -1851,7 +1851,7 @@ struct cmd_config_rss {
 	struct cmd_config_rss *res = parsed_result;
 	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
 	int diag;
-	uint8_t i;
+	uint16_t pid;
 
 	if (!strcmp(res->value, "all"))
 		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
@@ -1885,12 +1885,12 @@ struct cmd_config_rss {
 		return;
 	}
 	rss_conf.rss_key = NULL;
-	for (i = 0; i < rte_eth_dev_count(); i++) {
-		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
+		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
 		if (diag < 0)
 			printf("Configuration of RSS hash at ethernet port %d "
 				"failed with error (%d): %s.\n",
-				i, -diag, strerror(-diag));
+				pid, -diag, strerror(-diag));
 	}
 }
 
@@ -3681,10 +3681,8 @@ struct cmd_csum_result {
 	int hw = 0;
 	uint16_t mask = 0;
 
-	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
-		printf("invalid port %d\n", res->port_id);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!strcmp(res->mode, "set")) {
 
@@ -4282,8 +4280,8 @@ struct cmd_gso_show_result {
 {
 	struct cmd_gso_show_result *res = parsed_result;
 
-	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
-		printf("invalid port id %u\n", res->cmd_pid);
+	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
+		printf("invalid/not owned port id %u\n", res->cmd_pid);
 		return;
 	}
 	if (!strcmp(res->cmd_keyword, "gso")) {
@@ -5293,7 +5291,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 				port_id);
 
 		/* Update number of ports */
-		nb_ports = rte_eth_dev_count();
+		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
+			printf("Error: cannot own new attached port %d\n",
+			       port_id);
+			return;
+		}
+		nb_ports++;
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 	}
@@ -5402,10 +5405,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
 	struct cmd_set_bond_mon_period_result *res = parsed_result;
 	int ret;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
 
@@ -5463,11 +5464,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
 	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
 	uint8_t policy = AGG_BANDWIDTH;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n",
-				res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	if (!strcmp(res->policy, "bandwidth"))
 		policy = AGG_BANDWIDTH;
@@ -5726,7 +5724,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_promiscuous_enable(i);
 			else
@@ -5806,7 +5804,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_allmulticast_enable(i);
 			else
@@ -6540,31 +6538,31 @@ static void cmd_showportall_parsed(void *parsed_result,
 	struct cmd_showportall_result *res = parsed_result;
 	if (!strcmp(res->show, "clear")) {
 		if (!strcmp(res->what, "stats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_stats_clear(i);
 		else if (!strcmp(res->what, "xstats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_xstats_clear(i);
 	} else if (!strcmp(res->what, "info"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_infos_display(i);
 	else if (!strcmp(res->what, "stats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_display(i);
 	else if (!strcmp(res->what, "xstats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_xstats_display(i);
 	else if (!strcmp(res->what, "fdir"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			fdir_get_infos(i);
 	else if (!strcmp(res->what, "stat_qmap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_mapping_display(i);
 	else if (!strcmp(res->what, "dcb_tc"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_dcb_info_display(i);
 	else if (!strcmp(res->what, "cap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_offload_cap_display(i);
 }
 
@@ -10484,10 +10482,8 @@ struct cmd_flow_director_mask_result {
 	struct rte_eth_fdir_masks *mask;
 	struct rte_port *port;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -10685,10 +10681,8 @@ struct cmd_flow_director_flex_mask_result {
 	uint16_t i;
 	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -10839,12 +10833,10 @@ struct cmd_flow_director_flexpayload_result {
 	struct cmd_flow_director_flexpayload_result *res = parsed_result;
 	struct rte_eth_flex_payload_cfg flex_cfg;
 	struct rte_port *port;
-	int ret = 0;
+	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11560,7 +11552,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
 	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
 	entry.ether_type = res->eth_type_val;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
 	}
 }
@@ -11676,7 +11668,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
 	else
 		en = 0;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_offload_set(pid,
 						  &entry,
 						  ETH_L2_TUNNEL_ENABLE_MASK,
@@ -14203,10 +14195,8 @@ struct cmd_ddp_add_result {
 	int file_num;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14285,10 +14275,8 @@ struct cmd_ddp_del_result {
 	uint32_t size;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14600,10 +14588,8 @@ struct cmd_ddp_get_list_result {
 #endif
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 #ifdef RTE_LIBRTE_I40E_PMD
 	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
@@ -15821,7 +15807,7 @@ struct cmd_cmdfile_result {
 	if (id == (portid_t)RTE_PORT_ALL) {
 		portid_t pid;
 
-		RTE_ETH_FOREACH_DEV(pid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 			/* check if need_reconfig has been set to 1 */
 			if (ports[pid].need_reconfig == 0)
 				ports[pid].need_reconfig = dev;
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 561e057..e55490f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 
 	(void)ctx;
 	(void)token;
-	RTE_ETH_FOREACH_DEV(p) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
 		if (buf && i == ent)
 			return snprintf(buf, size, "%u", p);
 		++i;
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 86ca3aa..053af0e 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -155,7 +155,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -235,7 +235,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -252,10 +252,9 @@ struct rss_type_info {
 	struct rte_eth_xstat_name *xstats_names;
 
 	printf("###### NIC extended statistics for port %-2d\n", port_id);
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Error: Invalid port number %i\n", port_id);
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
 
 	/* Get count */
 	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
@@ -320,7 +319,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -439,7 +438,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -724,10 +723,15 @@ struct rss_type_info {
 int
 port_id_is_invalid(portid_t port_id, enum print_warning warning)
 {
+	struct rte_eth_dev_owner owner;
+	int ret;
+
 	if (port_id == (portid_t)RTE_PORT_ALL)
 		return 0;
 
-	if (rte_eth_dev_is_valid_port(port_id))
+	ret = rte_eth_dev_owner_get(port_id, &owner);
+
+	if (ret == 0 && owner.id == my_owner.id)
 		return 0;
 
 	if (warning == ENABLED_WARN)
@@ -2310,7 +2314,7 @@ struct igb_ring_desc_16_bytes {
 		return;
 	}
 	nb_pt = 0;
-	RTE_ETH_FOREACH_DEV(i) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 		if (! ((uint64_t)(1ULL << i) & portmask))
 			continue;
 		portlist[nb_pt++] = i;
@@ -2449,10 +2453,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gro(const char *onoff, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (test_done == 0) {
 		printf("Before enable/disable GRO,"
 				" please stop forwarding first\n");
@@ -2511,10 +2514,9 @@ struct igb_ring_desc_16_bytes {
 
 	param = &gro_ports[port_id].param;
 
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Invalid port id %u.\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (gro_ports[port_id].enable) {
 		printf("GRO type: TCP/IPv4\n");
 		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
@@ -2532,10 +2534,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gso(const char *mode, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (strcmp(mode, "on") == 0) {
 		if (test_done == 0) {
 			printf("before enabling GSO,"
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 84e7a63..63c533c 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -428,7 +428,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
@@ -489,7 +489,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index c3ab448..5f187b4 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -136,6 +136,11 @@
 lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
 
 /*
+ * My port owner structure used to own Ethernet ports.
+ */
+struct rte_eth_dev_owner my_owner; /**< Unique owner. */
+
+/*
  * Test Forwarding Configuration.
  *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
  *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
@@ -483,7 +488,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pt_id;
 	int i = 0;
 
-	RTE_ETH_FOREACH_DEV(pt_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
 		fwd_ports_ids[i++] = pt_id;
 
 	nb_cfg_ports = nb_ports;
@@ -607,7 +612,7 @@ static int eth_event_callback(portid_t port_id,
 		fwd_lcores[lc_id]->cpuid_idx = lc_id;
 	}
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		rte_eth_dev_info_get(pid, &port->dev_info);
 
@@ -733,7 +738,7 @@ static int eth_event_callback(portid_t port_id,
 	queueid_t q;
 
 	/* set socket id according to numa or not */
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		if (nb_rxq > port->dev_info.max_rx_queues) {
 			printf("Fail: nb_rxq(%d) is greater than "
@@ -1027,9 +1032,8 @@ static int eth_event_callback(portid_t port_id,
 	uint64_t tics_per_1sec;
 	uint64_t tics_datum;
 	uint64_t tics_current;
-	uint8_t idx_port, cnt_ports;
+	uint16_t idx_port;
 
-	cnt_ports = rte_eth_dev_count();
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
@@ -1044,11 +1048,10 @@ static int eth_event_callback(portid_t port_id,
 			tics_current = rte_rdtsc();
 			if (tics_current - tics_datum >= tics_per_1sec) {
 				/* Periodic bitrate calculation */
-				for (idx_port = 0;
-						idx_port < cnt_ports;
-						idx_port++)
+				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
+							     my_owner.id)
 					rte_stats_bitrate_calc(bitrate_data,
-						idx_port);
+							       idx_port);
 				tics_datum = tics_current;
 			}
 		}
@@ -1386,7 +1389,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		/* Check if there is a port which is not started */
 		if ((port->port_status != RTE_PORT_STARTED) &&
@@ -1404,7 +1407,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		if ((port->port_status != RTE_PORT_STOPPED) &&
 			(port->slave_flag == 0))
@@ -1453,7 +1456,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if(dcb_config)
 		dcb_test = 1;
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1634,7 +1637,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Stopping ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1677,7 +1680,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Closing ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1728,7 +1731,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Resetting ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1773,6 +1776,12 @@ static int eth_event_callback(portid_t port_id,
 	if (rte_eth_dev_attach(identifier, &pi))
 		return;
 
+	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
+		printf("Error: cannot own new attached port %d\n", pi);
+		return;
+	}
+	nb_ports++;
+
 	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
 	/* if socket_id is invalid, set to 0 */
 	if (check_socket_id(socket_id) < 0)
@@ -1780,8 +1789,6 @@ static int eth_event_callback(portid_t port_id,
 	reconfig(pi, socket_id);
 	rte_eth_promiscuous_enable(pi);
 
-	nb_ports = rte_eth_dev_count();
-
 	ports[pi].port_status = RTE_PORT_STOPPED;
 
 	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
@@ -1795,6 +1802,9 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Detaching a port...\n");
 
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
 	if (!port_is_closed(port_id)) {
 		printf("Please close port first\n");
 		return;
@@ -1808,7 +1818,7 @@ static int eth_event_callback(portid_t port_id,
 		return;
 	}
 
-	nb_ports = rte_eth_dev_count();
+	nb_ports--;
 
 	printf("Port '%s' is detached. Now total ports is %d\n",
 			name, nb_ports);
@@ -1826,7 +1836,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if (ports != NULL) {
 		no_link_check = 1;
-		RTE_ETH_FOREACH_DEV(pt_id) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
@@ -1858,7 +1868,7 @@ struct pmd_test_command {
 	fflush(stdout);
 	for (count = 0; count <= MAX_CHECK_TIME; count++) {
 		all_ports_up = 1;
-		RTE_ETH_FOREACH_DEV(portid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
 			if ((port_mask & (1 << portid)) == 0)
 				continue;
 			memset(&link, 0, sizeof(link));
@@ -1948,6 +1958,8 @@ struct pmd_test_command {
 
 	switch (type) {
 	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, ENABLED_WARN))
+			break;
 		if (rte_eal_alarm_set(100000,
 				rmv_event_callback, (void *)(intptr_t)port_id))
 			fprintf(stderr, "Could not set up deferred device removal\n");
@@ -2083,7 +2095,7 @@ struct pmd_test_command {
 	portid_t pid;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		port->dev_conf.rxmode = rx_mode;
 		port->dev_conf.fdir_conf = fdir_conf;
@@ -2394,7 +2406,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, RTE_ETH_DEV_NO_OWNER)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
 
@@ -2442,7 +2459,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
 
 	/* set all ports to promiscuous mode by default */
-	RTE_ETH_FOREACH_DEV(port_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
 		rte_eth_promiscuous_enable(port_id);
 
 	/* Init metrics library */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 1639d27..7db7d72 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -79,6 +79,8 @@
 #define NUMA_NO_CONFIG 0xFF
 #define UMA_NO_CONFIG  0xFF
 
+#define TESTPMD_OWNER_NAME "TestPMD"
+
 typedef uint8_t  lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
@@ -409,6 +411,7 @@ struct queue_stats_mappings {
  * nb_fwd_ports <= nb_cfg_ports <= nb_ports
  */
 extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
+extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
 extern portid_t nb_cfg_ports; /**< Number of configured ports. */
 extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
 extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
-- 
1.8.3.1

* Re: [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-07  9:58     ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-07  9:58 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Self-review.

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Sunday, January 7, 2018 11:46 AM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Jingjing Wu <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Bruce
> Richardson <bruce.richardson@intel.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>
> Subject: [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation
> 
> Ethernet port allocation was not thread safe, means 2 threads which tried to
> allocate a new port at the same time might get an identical port identifier and
> caused to memory overwrite.
> Actually, all the port configurations were not thread safe from ethdev point
> of view.
> 
> The port ownership mechanism added to the ethdev is a good point to
> redefine the synchronization rules in ethdev:
> 
> 1. The port allocation and port release synchronization will be
>    managed by ethdev.
> 2. The port usage synchronization will be managed by the port owner.
> 3. The port ownership synchronization will be managed by ethdev.
> 
> Add port allocation synchronization to complete the new rules.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  config/common_base            |  4 ++--
>  lib/librte_ether/rte_ethdev.c | 39 ++++++++++++++++++++++++++++------
> -----
>  2 files changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/config/common_base b/config/common_base index
> b8ee8f9..980ae3b 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -94,7 +94,7 @@ CONFIG_RTE_MAX_MEMSEG=256
>  CONFIG_RTE_MAX_MEMZONE=2560
>  CONFIG_RTE_MAX_TAILQ=32
>  CONFIG_RTE_ENABLE_ASSERT=n
> -CONFIG_RTE_LOG_LEVEL=RTE_LOG_INFO
> +CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG
Wrong change. Will fix in the next version.
>  CONFIG_RTE_LOG_DP_LEVEL=RTE_LOG_INFO
>  CONFIG_RTE_LOG_HISTORY=256
>  CONFIG_RTE_BACKTRACE=y
> @@ -136,7 +136,7 @@ CONFIG_RTE_LIBRTE_KVARGS=y  # Compile generic
> ethernet library  #  CONFIG_RTE_LIBRTE_ETHER=y -
> CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
> +CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=y
Wrong change. Will fix in the next version.
>  CONFIG_RTE_MAX_ETHPORTS=32
>  CONFIG_RTE_MAX_QUEUES_PER_PORT=1024
>  CONFIG_RTE_LIBRTE_IEEE1588=n
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 0e12452..d30d61f 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -85,6 +85,9 @@
>  /* spinlock for add/remove tx callbacks */  static rte_spinlock_t
> rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> 
> +/* spinlock for shared data allocation */ static rte_spinlock_t
> +rte_eth_share_data_alloc = RTE_SPINLOCK_INITIALIZER;
> +
>  /* spinlock for eth device ownership management stored in shared memory
> */  static rte_spinlock_t *rte_eth_dev_ownership_lock;
> 
> @@ -234,21 +237,27 @@ struct rte_eth_dev *  rte_eth_dev_allocate(const
> char *name)  {
>  	uint16_t port_id;
> -	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev *eth_dev = NULL;
> +
> +	/* Synchronize share data one time allocation between local threads.
> */
> +	rte_spinlock_lock(&rte_eth_share_data_alloc);
> +	if (rte_eth_dev_data == NULL)
> +		rte_eth_dev_share_data_alloc();
> +	rte_spinlock_unlock(&rte_eth_share_data_alloc);
> +
> +	/* Synchronize port creation between primary and secondary
> threads. */
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> 
>  	port_id = rte_eth_dev_find_free_port();
>  	if (port_id == RTE_MAX_ETHPORTS) {
>  		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> Ethernet ports\n");
> -		return NULL;
> +		goto unlock;
>  	}
> 
> -	if (rte_eth_dev_data == NULL)
> -		rte_eth_dev_share_data_alloc();
> -
>  	if (rte_eth_dev_allocated(name) != NULL) {
>  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s
> already allocated!\n",
>  				name);
> -		return NULL;
> +		goto unlock;
>  	}
> 
>  	eth_dev = eth_dev_get(port_id);
> @@ -256,6 +265,8 @@ struct rte_eth_dev *
>  	eth_dev->data->port_id = port_id;
>  	eth_dev->data->mtu = ETHER_MTU;
> 
> +unlock:
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
>  	return eth_dev;
>  }
> 
> @@ -268,10 +279,16 @@ struct rte_eth_dev *
> rte_eth_dev_attach_secondary(const char *name)  {
>  	uint16_t i;
> -	struct rte_eth_dev *eth_dev;
> +	struct rte_eth_dev *eth_dev = NULL;
> 
> +	/* Synchronize share data one time attachment between local
> threads. */
> +	rte_spinlock_lock(&rte_eth_share_data_alloc);
>  	if (rte_eth_dev_data == NULL)
>  		rte_eth_dev_share_data_alloc();
> +	rte_spinlock_unlock(&rte_eth_share_data_alloc);
> +
> +	/* Synchronize port attachment to primary port creation and release.
> */
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> 
>  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
>  		if (strcmp(rte_eth_dev_data[i].name, name) == 0) @@ -
> 281,12 +298,12 @@ struct rte_eth_dev *
>  		RTE_PMD_DEBUG_TRACE(
>  			"device %s is not driven by the primary process\n",
>  			name);
> -		return NULL;
> +	} else {
> +		eth_dev = eth_dev_get(i);
> +		RTE_ASSERT(eth_dev->data->port_id == i);
>  	}
> 
> -	eth_dev = eth_dev_get(i);
> -	RTE_ASSERT(eth_dev->data->port_id == i);
> -
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
>  	return eth_dev;
>  }
> 
> --
> 1.8.3.1
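
The locking shape of the reworked rte_eth_dev_allocate() (take the lock,
search, leave through a single unlock label) can be sketched on its own.
This is only an illustration under stated assumptions: the port table,
`port_allocate()`, and the C11 `atomic_flag` spinlock are hypothetical
stand-ins, not the ethdev code or `rte_spinlock_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN  32  /* hypothetical name buffer size */

struct port {
	char name[NAME_LEN];
	int used;
};

struct port ports[MAX_PORTS];
atomic_flag alloc_lock = ATOMIC_FLAG_INIT; /* stand-in for a spinlock */

static void lock(atomic_flag *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;
}

static void unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns the new port, or NULL if the table is full or the name is taken. */
struct port *port_allocate(const char *name)
{
	struct port *p = NULL;
	int i, free_id = MAX_PORTS;

	assert(name != NULL);
	lock(&alloc_lock);

	for (i = 0; i < MAX_PORTS; i++) {
		if (!ports[i].used) {
			if (free_id == MAX_PORTS)
				free_id = i; /* remember the first free slot */
		} else if (strcmp(ports[i].name, name) == 0) {
			goto out; /* name already allocated */
		}
	}
	if (free_id == MAX_PORTS)
		goto out; /* no free slot */

	p = &ports[free_id];
	snprintf(p->name, sizeof(p->name), "%s", name);
	p->used = 1;
out:
	/* Every path, success or failure, releases the lock here. */
	unlock(&alloc_lock);
	return p;
}
```

Because the search and the claim happen under one lock, two callers
cannot pick the same free slot, and no error path can return with the
lock still held.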

* Re: [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2018-01-08 10:32     ` Gaëtan Rivet
  2018-01-08 11:16       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 10:32 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

Hi Matan,

Thanks for the patches. A remark however:

On Sun, Jan 07, 2018 at 09:45:50AM +0000, Matan Azrad wrote:
> Fail-safe PMD sub devices management is based on ethdev port mechanism.
> So, the sub-devices management structures are exposed to other DPDK
> entities which may use them in parallel to fail-safe PMD.
> 
> Use the new port ownership mechanism to avoid multiple managments of
> fail-safe PMD sub-devices.
> 

I think your implementation does not work with several fail-safe
instances. Have you tested this configuration?

It should be possible for a user to create any number of fail-safe
instances. The minimum would be to allow multiple fail-safe instances
side by side, but ideally it should also support a recursive
configuration:

                      +-----------+
                      |fail-safe  |
                      |           |
                      |           |
                    +-+           +--+
                    | |           |  |
                    | +-----------+  |
                    |                |
            +-------v----+     +-----v-----+
            |fail-safe   |     |           |
            |            |     |           |
            |            |     |           |
            |            |     |           |
          +-+            +-+   |           |
          | +------------+ |   +-----------+
          |                |
    +-----v-----+    +-----v-----+
    |           |    |           |
    |           |    |           |
    |           |    |           |
    |           |    |           |
    |           |    |           |
    +-----------+    +-----------+

If I am not mistaken on this, then you need to generate a different
owner-id for each fail-safe instance.

I suggest using the full fail-safe instance name, as the EAL already
guarantees that these names differ from each other. You then won't need
to generate IDs on the fly or declare a global owner-id prefix.

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports
  2018-01-08 10:32     ` Gaëtan Rivet
@ 2018-01-08 11:16       ` Matan Azrad
  2018-01-08 11:35         ` Gaëtan Rivet
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-08 11:16 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

Hi Gaetan

From: Gaëtan Rivet, Sent: Monday, January 8, 2018 12:33 PM
> Hi Matan,
> 
> Thanks for the patches. A remark however:
> 
> On Sun, Jan 07, 2018 at 09:45:50AM +0000, Matan Azrad wrote:
> > Fail-safe PMD sub devices management is based on ethdev port
> mechanism.
> > So, the sub-devices management structures are exposed to other DPDK
> > entities which may use them in parallel to fail-safe PMD.
> >
> > Use the new port ownership mechanism to avoid multiple managments of
> > fail-safe PMD sub-devices.
> >
> 
> I think your implementation does not work with several fail-safe instances,
> have you tested this configuration?
> 

Why not? Each instance calls fs_eth_dev_create(), which allocates a unique owner ID.
So every instance should get its own unique owner ID.
 
> It should be possible for a user to create any number of fail-safe instances.
> The minimum would be to allow for multiple fail-safe side-by-side, but ideally
> it should also support a recursive
> configuration:
> 
>                       +-----------+
>                       |fail-safe  |
>                       |           |
>                       |           |
>                     +-+           +--+
>                     | |           |  |
>                     | +-----------+  |
>                     |                |
>             +-------v----+     +-----v-----+
>             |fail-safe   |     |           |
>             |            |     |           |
>             |            |     |           |
>             |            |     |           |
>           +-+            +-+   |           |
>           | +------------+ |   +-----------+
>           |                |
>     +-----v-----+    +-----v-----+
>     |           |    |           |
>     |           |    |           |
>     |           |    |           |
>     |           |    |           |
>     |           |    |           |
>     +-----------+    +-----------+
> 
> If I am not mistaken on this, then you need to generate different owner-ids
> for each fail-safe instances.
> 
It is already done, as I wrote above.

> I suggest using the full fail-safe instance name, as they are already assured to
> be different from each other by the EAL, and you thus won't need to
> generate IDs on the fly, as well as declare a global owner-id prefix.
> 

The ID generation should be initiated by the DPDK entity itself (the fail-safe instance in this case).
The prefix can be changed to the full EAL fail-safe instance name, but this is not required, because the owner IDs are already different.

> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports
  2018-01-08 11:16       ` Matan Azrad
@ 2018-01-08 11:35         ` Gaëtan Rivet
  0 siblings, 0 replies; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 11:35 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

On Mon, Jan 08, 2018 at 11:16:37AM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> From: Gaëtan Rivet, Sent: Monday, January 8, 2018 12:33 PM
> > Hi Matan,
> > 
> > Thanks for the patches. A remark however:
> > 
> > On Sun, Jan 07, 2018 at 09:45:50AM +0000, Matan Azrad wrote:
> > > Fail-safe PMD sub devices management is based on ethdev port
> > mechanism.
> > > So, the sub-devices management structures are exposed to other DPDK
> > > entities which may use them in parallel to fail-safe PMD.
> > >
> > > Use the new port ownership mechanism to avoid multiple managments of
> > > fail-safe PMD sub-devices.
> > >
> > 
> > I think your implementation does not work with several fail-safe instances,
> > have you tested this configuration?
> > 
> 
> Why not? Each instance calls to fs_eth_dev_create and there the unique owner id allocation is called.
> So, Any instance should get a unique owner id.
>  

Ah, sure, I forgot about the actual owner-id.

No problem then, as you said it should work.

Regarding the fail-safe part:

Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-08 11:39     ` Gaëtan Rivet
  2018-01-08 12:30       ` Matan Azrad
  2018-01-16  5:53     ` Lu, Wenzhuo
  1 sibling, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 11:39 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

Hi Matan,

On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> Testpmd should not use ethdev ports which are managed by other DPDK
> entities.
> 
> Set Testpmd ownership to each port which is not used by other entity and
> prevent any usage of ethdev ports which are not owned by Testpmd.
> 

This patch should not be necessary.

Ideally, your API evolution should not impact the default case. As such,
the default iterator RTE_ETH_FOREACH_DEV should still be used in
testpmd.

RTE_ETH_FOREACH_DEV should call RTE_ETH_FOREACH_DEV_OWNED_BY with the
default owner (meaning that it would iterate over the
application-owned set of devices).

This new API should avoid breaking the current code as much as possible.

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-08 11:39     ` Gaëtan Rivet
@ 2018-01-08 12:30       ` Matan Azrad
  2018-01-08 13:30         ` Gaëtan Rivet
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-08 12:30 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev



From: Gaëtan Rivet, Monday, January 8, 2018 1:40 PM
> Hi Matan,
> 
> On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> > Testpmd should not use ethdev ports which are managed by other DPDK
> > entities.
> >
> > Set Testpmd ownership to each port which is not used by other entity
> > and prevent any usage of ethdev ports which are not owned by Testpmd.
> >
> 
> This patch should not be necessary.
> 
> Ideally, your API evolution should not impact the default case. As such, the
> default iterator RTE_ETH_FOREACH_DEV should still be used in testpmd.
> 
Why? We want to adjust testpmd to the port ownership.

> RTE_ETH_FOREACH_DEV should call RTE_ETH_FOREACH_DEV_OWNED_BY,
> with the default owner (meaning that it would thus iterate on the
> application-owned set of device).
> 

It will break the API (we already talked about it).
There is no default owner:
Any DPDK entity, including an application, must allocate an owner ID and use it to own the ports it wants to use.
An application can have more than one owner, depending on the user's needs.
Any DPDK entity that can synchronize all of its port usage is a valid entity for the ownership mechanism.
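
To make the rule concrete, here is a minimal sketch of an entity that
allocates its own owner ID and then claims only the ports nobody else
owns, in the spirit of the testpmd init loop in this patch. The table,
`owner_new()`, `owner_set()` and `claim_unowned_ports()` are
hypothetical stand-ins written for illustration, not the ethdev
functions or their signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4  /* hypothetical table size */
#define NO_OWNER  0  /* hypothetical "unowned" marker */

/* Owner ID of each port; every port starts unowned. */
uint16_t port_owner[MAX_PORTS];
static uint16_t next_owner_id = NO_OWNER + 1;

/* Hand out a fresh owner ID; returns 0 on success, -1 when exhausted. */
int owner_new(uint16_t *id)
{
	assert(id != NULL);
	if (next_owner_id == NO_OWNER) /* counter wrapped around */
		return -1;
	*id = next_owner_id++;
	return 0;
}

/* Take a port only if nobody owns it; returns 0 on success, -1 otherwise. */
int owner_set(uint16_t port, uint16_t owner)
{
	if (port >= MAX_PORTS || owner == NO_OWNER ||
	    port_owner[port] != NO_OWNER)
		return -1;
	port_owner[port] = owner;
	return 0;
}

/* Claim every unowned port for 'owner' and return how many were taken. */
unsigned int claim_unowned_ports(uint16_t owner)
{
	unsigned int n = 0;
	uint16_t p;

	for (p = 0; p < MAX_PORTS; p++)
		if (owner_set(p, owner) == 0)
			n++;
	return n;
}
```

In this sketch a PMD that took port 1 first keeps it, and an
application that later calls claim_unowned_ports() gets the remaining
three ports only.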

> This new API should avoid breaking the current code as much as possible.
> 
Yes, but testpmd has a real problem regarding ownership, so it must be changed.
In many places, the previous testpmd code assumed that every port belonged to it.

Please see the extensive discussion about port ownership in the previous version.



> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-08 12:30       ` Matan Azrad
@ 2018-01-08 13:30         ` Gaëtan Rivet
  2018-01-08 13:55           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 13:30 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

On Mon, Jan 08, 2018 at 12:30:19PM +0000, Matan Azrad wrote:
> 
> 
> From: Gaëtan Rivet, Monday, January 8, 2018 1:40 PM
> > Hi Matan,
> > 
> > On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> > > Testpmd should not use ethdev ports which are managed by other DPDK
> > > entities.
> > >
> > > Set Testpmd ownership to each port which is not used by other entity
> > > and prevent any usage of ethdev ports which are not owned by Testpmd.
> > >
> > 
> > This patch should not be necessary.
> > 
> > Ideally, your API evolution should not impact the default case. As such, the
> > default iterator RTE_ETH_FOREACH_DEV should still be used in testpmd.
> > 
> Why? We want to adjust testpmd to the port ownership.
> 

This adjustment should be seamless for existing applications.

> > RTE_ETH_FOREACH_DEV should call RTE_ETH_FOREACH_DEV_OWNED_BY,
> > with the default owner (meaning that it would thus iterate on the
> > application-owned set of device).
> > 
> 
> It will break the API (we already talked about it).
> There is not any default owner:
> Any DPDK entity includes applications must to allocate an owner ID and use it to own the ports they wants to use.
> The application can include more than 1 owner depends on the user needs.
> Each DPDK entity which can synchronize all its port usage can be a valid DPDK entity for the ownership mechanism.
> 

That's the point of my remark: you did not include a default owner.
I think there should be one, and that all ports should belong to this
default owner when created.

This would not prevent a user or application from adding new owners
specific to their use and specialize ports if need be.

However, for other applications that do not care for this
specialization, they should run with the current API and avoid the ports
that are configured by other third parties.

I'm thinking about applications already written that would be used with
fail-safe ports: they would use RTE_ETH_FOREACH_DEV, and would thus
iterate over every port, including those owned by the fail-safe, unless
they start following the new API.

This is unnecessary: adding a default owner for all created ports and
redefining RTE_ETH_FOREACH_DEV as follows

#define RTE_ETH_FOREACH_DEV(i) \
        RTE_ETH_FOREACH_DEV_OWNED_BY(i, RTE_ETH_DEFAULT_OWNER)

is simple enough and will simplify the work of DPDK users. Moreover, it
would make fail-safe compatible with all applications using
RTE_ETH_FOREACH_DEV without additional evolution. It would actually
make any code using your API supported by those same applications, which
I think would help its adoption.
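
As a rough, self-contained illustration of that redefinition, the
sketch below builds a generic iterator on an owner-filtered one. All
names (`find_next_owned_by()`, `FOREACH_DEV`, `DEFAULT_OWNER`, the
port table) are hypothetical stand-ins; RTE_ETH_DEFAULT_OWNER does not
exist in the patchset under review.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS     8  /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define DEFAULT_OWNER 0  /* hypothetical stand-in for a default owner */

static uint8_t  port_used[MAX_PORTS];
static uint16_t port_owner[MAX_PORTS];

/* Register a port and record who owns it. */
void port_add(uint16_t port, uint16_t owner)
{
	assert(port < MAX_PORTS);
	port_used[port] = 1;
	port_owner[port] = owner;
}

/* First port at or after 'port' owned by 'owner', else MAX_PORTS. */
uint16_t find_next_owned_by(uint16_t port, uint16_t owner)
{
	while (port < MAX_PORTS &&
	       (!port_used[port] || port_owner[port] != owner))
		port++;
	return port;
}

#define FOREACH_DEV_OWNED_BY(p, o) \
	for (p = find_next_owned_by(0, o); \
	     p < MAX_PORTS; \
	     p = find_next_owned_by((uint16_t)(p + 1), o))

/* The generic iterator only sees ports left with the default owner. */
#define FOREACH_DEV(p) \
	FOREACH_DEV_OWNED_BY(p, DEFAULT_OWNER)

unsigned int count_default_ports(void)
{
	uint16_t p;
	unsigned int n = 0;

	FOREACH_DEV(p)
		n++;
	return n;
}
```

With four ports of which two were taken by another owner, an unchanged
application looping with FOREACH_DEV() would see only the two remaining
ports.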

> > This new API should avoid breaking the current code as much as possible.
> > 
> Yes, but there is a real big problem in testpmd regarding ownership issue - it must be changed.
> The previous testpmd thought any port is for it in many places in the code.
> 

Sure, then update the code with RTE_ETH_FOREACH_DEV().

> Please see a lot of discussions about port ownership in the previous version.

You did not address this remark in the previous thread. I'm thus
reiterating it with the new version of your patchset.

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-08 13:30         ` Gaëtan Rivet
@ 2018-01-08 13:55           ` Matan Azrad
  2018-01-08 14:21             ` Gaëtan Rivet
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-08 13:55 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

Hi Gaetan

From: Gaëtan Rivet, Monday, January 8, 2018 3:30 PM
> On Mon, Jan 08, 2018 at 12:30:19PM +0000, Matan Azrad wrote:
> >
> >
> > From: Gaëtan Rivet, Monday, January 8, 2018 1:40 PM
> > > Hi Matan,
> > >
> > > On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> > > > Testpmd should not use ethdev ports which are managed by other
> > > > DPDK entities.
> > > >
> > > > Set Testpmd ownership to each port which is not used by other
> > > > entity and prevent any usage of ethdev ports which are not owned by
> Testpmd.
> > > >
> > >
> > > This patch should not be necessary.
> > >
> > > Ideally, your API evolution should not impact the default case. As
> > > such, the default iterator RTE_ETH_FOREACH_DEV should still be used in
> testpmd.
> > >
> > Why? We want to adjust testpmd to the port ownership.
> >
> 
> This adjustment should be seamless for existing application.
> 
> > > RTE_ETH_FOREACH_DEV should call
> RTE_ETH_FOREACH_DEV_OWNED_BY, with
> > > the default owner (meaning that it would thus iterate on the
> > > application-owned set of device).
> > >
> >
> > It will break the API (we already talked about it).
> > There is not any default owner:
> > Any DPDK entity includes applications must to allocate an owner ID and use
> it to own the ports they wants to use.
> > The application can include more than 1 owner depends on the user needs.
> > Each DPDK entity which can synchronize all its port usage can be a valid
> DPDK entity for the ownership mechanism.
> >
> 
> That's the point of my remark: you did not include a default owner.
> I think there should be one, and that all ports should pertain to this default
> owner by default when created.
> 
> This would not prevent a user or application from adding new owners specific
> to their use and specialize ports if need be.
> 
> However, for other applications that do not care for this specialization, they
> should run with the current API and avoid the ports that are configured by
> other third parties.
> 

RTE_ETH_FOREACH_DEV means "iterate over all devices" and, in my opinion, should stay as is.
I understand your concern about changes to current applications,
but your "default" suggestion would force "non-default" applications to reset all the default owners, which would complicate them and hurt the semantics.

> I'm thinking about applications already written that would be used with fail-
> safe ports: they would use RTE_ETH_FOREACH_DEV, and would thus iterate
> over every ports, including those owned by the fail-safe, unless they start
> following the new API.
> 

They should start; it is really not complicated.
What about applications that use count = rte_eth_dev_count() and iterate over all ports from 0 to count-1?
We cannot rescue every incorrect application pattern.
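
A small sketch of why the count-based loop is fragile once port IDs
stop being contiguous, for example after a detach. The validity table
and helper names are hypothetical and only model the idea; they are not
ethdev code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8  /* hypothetical table size */

static uint8_t port_valid[MAX_PORTS];

void port_attach(uint16_t p) { assert(p < MAX_PORTS); port_valid[p] = 1; }
void port_detach(uint16_t p) { assert(p < MAX_PORTS); port_valid[p] = 0; }

/* Number of currently valid ports. */
unsigned int port_count(void)
{
	unsigned int p, n = 0;

	for (p = 0; p < MAX_PORTS; p++)
		n += port_valid[p];
	return n;
}

/* Bitmask of the IDs touched by the "0 .. count-1" pattern. */
uint32_t visit_by_count(void)
{
	uint32_t mask = 0;
	unsigned int p, count = port_count();

	for (p = 0; p < count; p++)
		mask |= UINT32_C(1) << p; /* never checks validity */
	return mask;
}

/* Bitmask of the IDs touched when invalid entries are skipped. */
uint32_t visit_valid(void)
{
	uint32_t mask = 0;
	unsigned int p;

	for (p = 0; p < MAX_PORTS; p++)
		if (port_valid[p])
			mask |= UINT32_C(1) << p;
	return mask;
}
```

After attaching ports 0 to 3 and detaching port 1, the count-based loop
touches IDs 0, 1 and 2: it uses the detached port 1 and never reaches
port 3, while the validity-aware loop touches 0, 2 and 3.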

> This is unnecessary: adding a default owner for all created ports and
> redefining RTE_ETH_FOREACH_DEV as follow
> 
> #define RTE_ETH_FOREACH_DEV(i)
>         RTE_ETH_FOREACH_DEV_OWNED_BY(i, RTE_ETH_DEFAULT_OWNER)
> 
> Is simple enough and will simplify the work of DPDK users. Moreover, it
> would make fail-safe compatible with all applications using
> RTE_ETH_FOREACH_DEV without additional evolution. It would actually make
> any code using your API supported by those same applications, which I think
> would help its adoption.
> 

It will break the API, hurt the semantics of FOREACH, and complicate ownership-aware applications, as I wrote above.

> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-08 13:55           ` Matan Azrad
@ 2018-01-08 14:21             ` Gaëtan Rivet
  2018-01-08 14:42               ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-08 14:21 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

On Mon, Jan 08, 2018 at 01:55:52PM +0000, Matan Azrad wrote:
> Hi Gaetan
> 
> From: Gaëtan Rivet, Monday, January 8, 2018 3:30 PM
> > On Mon, Jan 08, 2018 at 12:30:19PM +0000, Matan Azrad wrote:
> > >
> > >
> > > From: Gaëtan Rivet, Monday, January 8, 2018 1:40 PM
> > > > Hi Matan,
> > > >
> > > > On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > DPDK entities.
> > > > >
> > > > > Set Testpmd ownership to each port which is not used by other
> > > > > entity and prevent any usage of ethdev ports which are not owned by
> > Testpmd.
> > > > >
> > > >
> > > > This patch should not be necessary.
> > > >
> > > > Ideally, your API evolution should not impact the default case. As
> > > > such, the default iterator RTE_ETH_FOREACH_DEV should still be used in
> > testpmd.
> > > >
> > > Why? We want to adjust testpmd to the port ownership.
> > >
> > 
> > This adjustment should be seamless for existing application.
> > 
> > > > RTE_ETH_FOREACH_DEV should call
> > RTE_ETH_FOREACH_DEV_OWNED_BY, with
> > > > the default owner (meaning that it would thus iterate on the
> > > > application-owned set of device).
> > > >
> > >
> > > It will break the API (we already talked about it).
> > > There is not any default owner:
> > > Any DPDK entity includes applications must to allocate an owner ID and use
> > it to own the ports they wants to use.
> > > The application can include more than 1 owner depends on the user needs.
> > > Each DPDK entity which can synchronize all its port usage can be a valid
> > DPDK entity for the ownership mechanism.
> > >
> > 
> > That's the point of my remark: you did not include a default owner.
> > I think there should be one, and that all ports should pertain to this default
> > owner by default when created.
> > 
> > This would not prevent a user or application from adding new owners specific
> > to their use and specialize ports if need be.
> > 
> > However, for other applications that do not care for this specialization, they
> > should run with the current API and avoid the ports that are configured by
> > other third parties.
> > 
> 
> RTE_ETH_FOREACH_DEV means iterate over all devices and should stay as is in my opinion.
> I understand your concern about changes in current application,
> But your "default" suggestion will cause to "non-default" applications to reset all the default owners and will complicate them and hurt semantics.

Why should an application be able to iterate over all ports? Leave this
capability to the EAL (or ethdev layer) alone, while other components should
be restricted to their specific set.

And if a need for this general iterator appears, solutions could be
found very easily.

RTE_ETH_FOREACH_DEV currently does not iterate over deferred ports, it
iterates over the base set of ports available. Changing this behavior is
not necessary, you could introduce your API while keeping it.

> 
> > I'm thinking about applications already written that would be used with fail-
> > safe ports: they would use RTE_ETH_FOREACH_DEV, and would thus iterate
> > over every ports, including those owned by the fail-safe, unless they start
> > following the new API.
> > 
> 
> They should start, it is really not complicated.

The point is not whether developers downstream would be able to grasp
such complexity, but whether a project like DPDK should foster an
unstable environment for its currently still limited ecosystem.

> What's about application which use count=rte_eth_dev_count and iterate over all ports from 0 to count-1?
> We cannot save all the wrong application options.
> 
> > This is unnecessary: adding a default owner for all created ports and
> > redefining RTE_ETH_FOREACH_DEV as follow
> > 
> > #define RTE_ETH_FOREACH_DEV(i)
> >         RTE_ETH_FOREACH_DEV_OWNED_BY(i, RTE_ETH_DEFAULT_OWNER)
> > 
> > Is simple enough and will simplify the work of DPDK users. Moreover, it
> > would make fail-safe compatible with all applications using
> > RTE_ETH_FOREACH_DEV without additional evolution. It would actually make
> > any code using your API supported by those same applications, which I think
> > would help its adoption.
> > 
> 
> Will break API, will hurt semantic of FOREACH , and will complicate ownership care applications as I wrote above.

Well, breaking an API is best before such API is integrated anyway.

I disagree regarding the added complexity for applications that would
use ownership. With your proposal, most applications will only add a
single user and register all their ports with this user, then be forced
to iterate upon their registered user.

You can save all of them the hassle of adding this code by taking care
of the most common case, avoiding redundant code downstream and simplifying
possible future updates to this default case.

So if anything, this would greatly simplify ownership for the vast
majority of applications.

-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-08 14:21             ` Gaëtan Rivet
@ 2018-01-08 14:42               ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-08 14:42 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Thomas Monjalon, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev



From: Gaëtan Rivet, Monday, January 8, 2018 4:22 PM
> On Mon, Jan 08, 2018 at 01:55:52PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > From: Gaëtan Rivet, Monday, January 8, 2018 3:30 PM
> > > On Mon, Jan 08, 2018 at 12:30:19PM +0000, Matan Azrad wrote:
> > > >
> > > >
> > > > From: Gaëtan Rivet, Monday, January 8, 2018 1:40 PM
> > > > > Hi Matan,
> > > > >
> > > > > On Sun, Jan 07, 2018 at 09:45:51AM +0000, Matan Azrad wrote:
> > > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > > DPDK entities.
> > > > > >
> > > > > > Set Testpmd ownership to each port which is not used by other
> > > > > > entity and prevent any usage of ethdev ports which are not
> > > > > > owned by
> > > Testpmd.
> > > > > >
> > > > >
> > > > > This patch should not be necessary.
> > > > >
> > > > > Ideally, your API evolution should not impact the default case.
> > > > > As such, the default iterator RTE_ETH_FOREACH_DEV should still
> > > > > be used in testpmd.
> > > > >
> > > > Why? We want to adjust testpmd to the port ownership.
> > > >
> > >
> > > This adjustment should be seamless for existing application.
> > >
> > > > > RTE_ETH_FOREACH_DEV should call RTE_ETH_FOREACH_DEV_OWNED_BY, with
> > > > > the default owner (meaning that it would thus iterate on the
> > > > > application-owned set of device).
> > > > >
> > > >
> > > > It will break the API (we already talked about it).
> > > > There is not any default owner:
> > > > Any DPDK entity includes applications must to allocate an owner ID
> > > > and use it to own the ports they wants to use.
> > > > The application can include more than 1 owner depends on the user
> > > > needs.
> > > > Each DPDK entity which can synchronize all its port usage can be a
> > > > valid DPDK entity for the ownership mechanism.
> > > >
> > >
> > > That's the point of my remark: you did not include a default owner.
> > > I think there should be one, and that all ports should pertain to
> > > this default owner by default when created.
> > >
> > > This would not prevent a user or application from adding new owners
> > > specific to their use and specialize ports if need be.
> > >
> > > However, for other applications that do not care for this
> > > specialization, they should run with the current API and avoid the
> > > ports that are configured by other third parties.
> > >
> >
> > RTE_ETH_FOREACH_DEV means iterate over all devices and should stay as
> > is in my opinion.
> > I understand your concern about changes in current application, But
> > your "default" suggestion will cause to "non-default" applications to reset
> > all the default owners and will complicate them and hurt semantics.
> 
> Why should an application be able to iterate over all ports? Leave this
> capability to the EAL (or ethdev layer) alone, while other components should
> be restricted to their specific set.
> 

Yes, you are right.

> And if a need for this general iterator appears, solutions could be found very
> easily.
> 
> RTE_ETH_FOREACH_DEV currently does not iterate over deferred ports, it
> iterates over the base set of ports available. Changing this behavior is not
> necessary, you could introduce your API while keeping it.
> 
Right.

> >
> > > I'm thinking about applications already written that would be used
> > > with fail- safe ports: they would use RTE_ETH_FOREACH_DEV, and would
> > > thus iterate over every ports, including those owned by the
> > > fail-safe, unless they start following the new API.
> > >
> >
> > They should start, it is really not complicated.
> 
> The point is not whether developpers downstream would be able to grasp
> such complexity, but whether a project like DPDK should foster an unstable
> environment for its currently still limited ecosystem.
> 
> > What's about application which use count=rte_eth_dev_count and iterate
> > over all ports from 0 to count-1?
> > We cannot save all the wrong application options.
> >
> > > This is unnecessary: adding a default owner for all created ports
> > > and redefining RTE_ETH_FOREACH_DEV as follow
> > >
> > > #define RTE_ETH_FOREACH_DEV(i)
> > >         RTE_ETH_FOREACH_DEV_OWNED_BY(i, RTE_ETH_DEFAULT_OWNER)
> > >
> > > Is simple enough and will simplify the work of DPDK users. Moreover,
> > > it would make fail-safe compatible with all applications using
> > > RTE_ETH_FOREACH_DEV without additional evolution. It would actually
> > > make any code using your API supported by those same applications,
> > > which I think would help its adoption.
> > >
> >
> > Will break API, will hurt semantic of FOREACH , and will complicate
> > ownership care applications as I wrote above.
> 
> Well, breaking an API is best before such API is integrated anyway.
> 
> I disagree regarding the added complexity for applications that would use
> ownership. With your proposal, most applications will only add a single user
> and register all their ports with this user, then be forced to iterate upon their
> registered user.
> 
> You can save all of them the hassle of adding this code, by taking care of the
> most common case, avoiding redundant code downstream and simplifying
> possible future update to this default case.
> 
> So if anything, this would greatly simplify ownership for the vast majority of
> applications.
>

OK, got you.
I will just document the API with the new semantics and will use NO_OWNER for the old API.
But actually I think testpmd should still use the ownership mechanism, to serve as a good example of it.

Thanks!
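
A rough sketch of the semantics agreed on above, assuming the old iterator
is simply redefined on top of the owner-aware one with NO_OWNER. The mock
table and the names FOREACH_DEV, FOREACH_DEV_OWNED_BY and
count_legacy_visible() are hypothetical stand-ins, not the ethdev
definitions:

```c
#include <stdint.h>

#define MAX_PORTS 8  /* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER  0  /* stand-in for RTE_ETH_DEV_NO_OWNER */

struct mock_port { int attached; uint16_t owner_id; };

/* Next attached port at or after 'port' owned by 'owner'. */
static uint16_t
find_next_owned_by(const struct mock_port *tbl, uint16_t port, uint16_t owner)
{
	while (port < MAX_PORTS &&
	       (!tbl[port].attached || tbl[port].owner_id != owner))
		port++;
	return port;
}

#define FOREACH_DEV_OWNED_BY(tbl, p, o) \
	for (p = find_next_owned_by(tbl, 0, o); p < MAX_PORTS; \
	     p = find_next_owned_by(tbl, (uint16_t)(p + 1), o))

/* Old iterator, new semantics: only ports nobody has claimed. */
#define FOREACH_DEV(tbl, p) FOREACH_DEV_OWNED_BY(tbl, p, NO_OWNER)

/* Count the ports an unmodified application would iterate over. */
static int count_legacy_visible(const struct mock_port *tbl)
{
	uint16_t p;
	int n = 0;

	FOREACH_DEV(tbl, p)
		n++;
	return n;
}
```

With this definition a port claimed by fail-safe drops out of the legacy
loop without any change in the application.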
 
> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership Matan Azrad
@ 2018-01-10 13:36     ` Ananyev, Konstantin
  2018-01-10 16:58       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-10 13:36 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

A few comments from me below.
BTW, do you plan to add a mandatory ownership check to the control path functions
that change port configuration?
Konstantin

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Sunday, January 7, 2018 9:46 AM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v2 2/6] ethdev: add port ownership
> 
> The ownership of a port is implicit in DPDK.
> Making it explicit is better from the next reasons:
> 1. It will define well who is in charge of the port usage synchronization.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the application is not trying to use a port which
> is already managed by fail-safe.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to owns valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The mechanism is synchronized for both the primary process threads and
> the secondary processes threads to allow secondary process entity to be
> a port owner.
> 
> Add a synchronized ownership mechanism to DPDK Ethernet devices to
> avoid multiple management of a device by different DPDK entities.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
>  lib/librte_ether/rte_ethdev.c           | 206 ++++++++++++++++++++++++++++++--
>  lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
>  lib/librte_ether/rte_ethdev_version.map |  12 ++
>  4 files changed, 311 insertions(+), 10 deletions(-)


> 
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 684e3e8..0e12452 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -70,7 +70,10 @@
> 
>  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
>  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> +/* ports data array stored in shared memory */
>  static struct rte_eth_dev_data *rte_eth_dev_data;
> +/* next owner identifier stored in shared memory */
> +static uint16_t *rte_eth_next_owner_id;
>  static uint8_t eth_dev_last_created_port;
> 
>  /* spinlock for eth device callbacks */
> @@ -82,6 +85,9 @@
>  /* spinlock for add/remove tx callbacks */
>  static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> 
> +/* spinlock for eth device ownership management stored in shared memory */
> +static rte_spinlock_t *rte_eth_dev_ownership_lock;
> +
>  /* store statistics names and its offset in stats structure  */
>  struct rte_eth_xstats_name_off {
>  	char name[RTE_ETH_XSTATS_NAME_SIZE];
> @@ -153,14 +159,18 @@ enum {
>  }
> 
>  static void
> -rte_eth_dev_data_alloc(void)
> +rte_eth_dev_share_data_alloc(void)
>  {
>  	const unsigned flags = 0;
>  	const struct rte_memzone *mz;
> +	const unsigned int data_size = RTE_MAX_ETHPORTS *
> +						sizeof(*rte_eth_dev_data);
> 
>  	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		/* Allocate shared memory for port data and ownership */
>  		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> +				data_size + sizeof(*rte_eth_next_owner_id) +
> +				sizeof(*rte_eth_dev_ownership_lock),
>  				rte_socket_id(), flags);
>  	} else
>  		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> @@ -168,9 +178,17 @@ enum {
>  		rte_panic("Cannot allocate memzone for ethernet port data\n");
> 
>  	rte_eth_dev_data = mz->addr;
> -	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> -		memset(rte_eth_dev_data, 0,
> -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
> +	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
> +					     data_size);
> +	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
> +		((uintptr_t)rte_eth_next_owner_id +
> +		 sizeof(*rte_eth_next_owner_id));


I think that might leave rte_eth_dev_ownership_lock not 4B aligned...
Why not put all the data that you are trying to allocate as one chunk into the same struct:
static struct {
        uint16_t next_owner_id;
        /* spinlock for eth device ownership management stored in shared memory */
        rte_spinlock_t dev_ownership_lock;
        struct rte_eth_dev_data *data;
} rte_eth_dev_data;
and allocate/use it everywhere?
That would simplify allocation/management stuff. 
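
To illustrate the alignment concern, a small standalone sketch. It assumes
that rte_spinlock_t wraps a single int (mock_spinlock_t below is a stand-in,
not the DPDK type) and compares the hand-computed offset used in the patch
with the offset the compiler picks when the members sit in one struct:

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_spinlock_t, assumed to wrap one int. */
typedef struct { volatile int locked; } mock_spinlock_t;

/* Offset of the lock when it is placed by hand right after a uint16_t
 * that itself follows 'data_size' bytes of port data, as in the patch. */
static size_t manual_lock_offset(size_t data_size)
{
	return data_size + sizeof(uint16_t);
}

/* Same members grouped in a struct: the compiler inserts padding
 * after next_owner_id so that the lock is naturally aligned. */
struct shared_hdr {
	uint16_t next_owner_id;
	mock_spinlock_t ownership_lock;
};

/* True if 'offset' satisfies the lock's alignment requirement. */
static int lock_is_aligned(size_t offset)
{
	return offset % alignof(mock_spinlock_t) == 0;
}
```

For example, with a port data area of 4096 bytes the manual offset is 4098,
which is not a multiple of 4, whereas
offsetof(struct shared_hdr, ownership_lock) always is aligned.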

It is good to see that scanning/updating rte_eth_dev_data[] is now lock protected,
but it might not be the best idea to protect both data[] and next_owner_id with the same lock.
In fact, for next_owner_id, you don't need a lock - just rte_atomic_t should be enough.
Another alternative would be to use 2 locks - one for next_owner_id, a second for the actual data[]
protection.
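
A sketch of the lock-free variant, written with C11 <stdatomic.h> rather
than the DPDK rte_atomic API so that it stands alone. owner_id_alloc() is a
hypothetical name; the compare-and-swap loop is there to keep the counter
from handing out ids again after it wraps back to NO_OWNER:

```c
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0 /* stand-in for RTE_ETH_DEV_NO_OWNER */

static _Atomic uint16_t next_owner_id = NO_OWNER + 1;

/* Lock-free allocation: returns 0 and stores a fresh id, or -1 once
 * the counter has wrapped back to NO_OWNER (id space exhausted). */
static int owner_id_alloc(uint16_t *id)
{
	uint16_t cur = atomic_load(&next_owner_id);

	do {
		if (cur == NO_OWNER)
			return -1;
		/* On CAS failure 'cur' is reloaded and the check repeats. */
	} while (!atomic_compare_exchange_weak(&next_owner_id, &cur,
					       (uint16_t)(cur + 1)));
	*id = cur;
	return 0;
}
```

Readers that only validate an id against the counter would need a single
atomic load; whether that is simpler than sharing the ownership lock is a
judgment call.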

Another thing - you'll probably need to grab/release the lock inside rte_eth_dev_allocated() too.
It is a public function used by drivers, so it needs to be protected too.

> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		memset(rte_eth_dev_data, 0, data_size);
> +		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> +		rte_spinlock_init(rte_eth_dev_ownership_lock);
> +	}
>  }
> 
>  struct rte_eth_dev *
> @@ -225,7 +243,7 @@ struct rte_eth_dev *
>  	}
> 
>  	if (rte_eth_dev_data == NULL)
> -		rte_eth_dev_data_alloc();
> +		rte_eth_dev_share_data_alloc();
> 
>  	if (rte_eth_dev_allocated(name) != NULL) {
>  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
> @@ -253,7 +271,7 @@ struct rte_eth_dev *
>  	struct rte_eth_dev *eth_dev;
> 
>  	if (rte_eth_dev_data == NULL)
> -		rte_eth_dev_data_alloc();
> +		rte_eth_dev_share_data_alloc();
> 
>  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
>  		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> @@ -278,8 +296,12 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
> 
> -	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
>  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> +
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
>  	return 0;
>  }
> 
> @@ -294,6 +316,174 @@ struct rte_eth_dev *
>  		return 1;
>  }
> 
> +static int
> +rte_eth_is_valid_owner_id(uint16_t owner_id)
> +{
> +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> +	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
> +	     *rte_eth_next_owner_id <= owner_id)) {
> +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +uint16_t
> +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> +{
> +	while (port_id < RTE_MAX_ETHPORTS &&
> +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> +		port_id++;
> +
> +	if (port_id >= RTE_MAX_ETHPORTS)
> +		return RTE_MAX_ETHPORTS;
> +
> +	return port_id;
> +}
> +
> +int
> +rte_eth_dev_owner_new(uint16_t *owner_id)
> +{
> +	int ret = 0;
> +
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
> +	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> +		/* Counter wrap around. */
> +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> +		ret = -EUSERS;
> +	} else {
> +		*owner_id = (*rte_eth_next_owner_id)++;
> +	}
> +
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> +	return ret;
> +}
> +
> +int
> +rte_eth_dev_owner_set(const uint16_t port_id,
> +		      const struct rte_eth_dev_owner *owner)

As a nit - if you had rte_eth_dev_owner_set(port_id, old_owner, new_owner)
- that might be more natural for the user, and would greatly simplify the unset() part:
just set(port_id, cur_owner, zero_owner);
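
A minimal sketch of that compare-and-set shape, with hypothetical names
(owner_swap, owner_claim, owner_release) and a mock owner table; the lock
that would surround owner_swap() is omitted:

```c
#include <stdint.h>

#define MAX_PORTS 4
#define NO_OWNER  0   /* stand-in for RTE_ETH_DEV_NO_OWNER */
#define NAME_LEN  64  /* stand-in for RTE_ETH_MAX_OWNER_NAME_LEN */

struct mock_owner { uint16_t id; char name[NAME_LEN]; };
static struct mock_owner port_owner[MAX_PORTS];

/* Replace the owner of 'port' only if it currently equals 'old_id'.
 * Returns 0 on success, -1 for a bad port, -2 on owner mismatch. */
static int owner_swap(uint16_t port, uint16_t old_id,
		      const struct mock_owner *new_owner)
{
	if (port >= MAX_PORTS)
		return -1;
	if (port_owner[port].id != old_id)
		return -2;
	port_owner[port] = *new_owner;
	return 0;
}

static const struct mock_owner zero_owner; /* id == NO_OWNER, empty name */

/* Claiming: the expected old owner is "nobody". */
static int owner_claim(uint16_t port, const struct mock_owner *me)
{
	return owner_swap(port, NO_OWNER, me);
}

/* Releasing: the expected old owner is the caller itself. */
static int owner_release(uint16_t port, const struct mock_owner *me)
{
	return owner_swap(port, me->id, &zero_owner);
}
```

In both cases the caller already knows the expected old owner. Note that,
unlike the set() in the patch, this sketch rejects a repeated claim by the
same owner.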

> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	int ret = 0;
> +	int sret;
> +
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		ret = -ENODEV;
> +		goto unlock;
> +	}
> +
> +	if (!rte_eth_is_valid_owner_id(owner->id)) {
> +		ret = -EINVAL;
> +		goto unlock;
> +	}
> +
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> +	    port_owner->id != owner->id) {
> +		RTE_LOG(ERR, EAL,
> +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> +			port_id, port_owner->name, port_owner->id);
> +		ret = -EPERM;
> +		goto unlock;
> +	}
> +
> +	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> +			owner->name);
> +	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {

Personally, I don't see any reason to fail if the description was truncated...
Another alternative - just use rte_malloc() here to allocate a buffer big enough to hold the description.
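
For reference, a standalone sketch of how the snprintf() return value
distinguishes the cases, so truncation can be reported without being
treated as an error. copy_owner_name() and the small NAME_LEN are
hypothetical, chosen only to keep the example short:

```c
#include <stdio.h>

#define NAME_LEN 8 /* small stand-in for RTE_ETH_MAX_OWNER_NAME_LEN */

/* Copy 'src' into 'dst' (NAME_LEN bytes, always NUL-terminated).
 * Returns 1 if the name was truncated, 0 if it fit, -1 on an
 * encoding error reported by snprintf(). */
static int copy_owner_name(char dst[NAME_LEN], const char *src)
{
	int n = snprintf(dst, NAME_LEN, "%s", src);

	if (n < 0)
		return -1;
	return n >= NAME_LEN;
}
```

A caller that tolerates truncation would treat a return value of 1 as
success, possibly with a warning in the log.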

> +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> +		ret = -EINVAL;
> +		goto unlock;
> +	}
> +
> +	port_owner->id = owner->id;
> +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> +			    owner->name, owner->id);
> +

As another nit - you can avoid all these gotos by restructuring code a bit:

rte_eth_dev_owner_set(const uint16_t port_id, const struct rte_eth_dev_owner *owner)
{
    rte_spinlock_lock(...);
    ret = _eth_dev_owner_set_unlocked(port_id, owner);
    rte_spinlock_unlock(...);
    return ret;
}


> +unlock:
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> +	return ret;
> +}
> +
> +int
> +rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t owner_id)
> +{
> +	struct rte_eth_dev_owner *port_owner;
> +	int ret = 0;
> +
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		ret = -ENODEV;
> +		goto unlock;
> +	}
> +
> +	if (!rte_eth_is_valid_owner_id(owner_id)) {
> +		ret = -EINVAL;
> +		goto unlock;
> +	}
> +
> +	port_owner = &rte_eth_devices[port_id].data->owner;
> +	if (port_owner->id != owner_id) {
> +		RTE_LOG(ERR, EAL, "Cannot unset port %d owner (%s_%05d) by"
> +			" a different owner with id %5d.\n", port_id,
> +			port_owner->name, port_owner->id, owner_id);
> +		ret = -EPERM;
> +		goto unlock;
> +	}
> +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has removed.\n", port_id,
> +			    port_owner->name, port_owner->id);
> +
> +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> +
> +unlock:
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> +	return ret;
> +}
> +
> +void
> +rte_eth_dev_owner_delete(const uint16_t owner_id)
> +{
> +	uint16_t port_id;
> +
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
> +	if (rte_eth_is_valid_owner_id(owner_id)) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
> +			memset(&rte_eth_devices[port_id].data->owner, 0,
> +			       sizeof(struct rte_eth_dev_owner));
> +		RTE_PMD_DEBUG_TRACE("All port owners owned by %05d identifier"
> +				    " have removed.\n", owner_id);
> +	}
> +
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> +}
> +
> +int
> +rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
> +{
> +	int ret = 0;
> +
> +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> +
> +	if (!rte_eth_dev_is_valid_port(port_id)) {
> +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> +		ret = -ENODEV;
> +	} else {
> +		rte_memcpy(owner, &rte_eth_devices[port_id].data->owner,
> +			   sizeof(*owner));
> +	}
> +
> +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> +	return ret;
> +}
> +
>  int
>  rte_eth_dev_socket_id(uint16_t port_id)
>  {
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 57b61ed..88ad765 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> 
>  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> 
> +#define RTE_ETH_DEV_NO_OWNER 0
> +
> +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> +
> +struct rte_eth_dev_owner {
> +	uint16_t id; /**< The owner unique identifier. */

Why limit yourself to 16 bits here?
Why not uint32_t/uint64_t - or even uuid_t, and let a system library generate it for you?
You wouldn't need to worry about overflows then.

> +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
> +};
> +
>  /**
>   * @internal
>   * The data part, with no function pointers, associated with each ethernet device.
> @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
>  	int numa_node;  /**< NUMA node connection */
>  	struct rte_vlan_filter_conf vlan_filter_conf;
>  	/**< VLAN filter configuration. */
> +	struct rte_eth_dev_owner owner; /**< The port owner. */
>  };
> 
>  /** Device supports link state interrupt */
> @@ -1846,6 +1856,85 @@ struct rte_eth_dev_data {
> 
> 
>  /**
> + * Iterates over valid ethdev ports owned by a specific owner.
> + *
> + * @param port_id
> + *   The id of the next possible valid owned port.
> + * @param	owner_id
> + *  The owner identifier.
> + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
> + * @return
> + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
> + */
> +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Macro to iterate over all enabled ethdev ports owned by a specific owner.
> + */
> +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> +	for (p = rte_eth_find_next_owned_by(0, o); \
> +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> +	     p = rte_eth_find_next_owned_by(p + 1, o))
> +
> +/**
> + * Get a new unique owner identifier.
> + * An owner identifier is used to owns Ethernet devices by only one DPDK entity
> + * to avoid multiple management of device by different entities.
> + *
> + * @param	owner_id
> + *   Owner identifier pointer.
> + * @return
> + *   Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_new(uint16_t *owner_id);
> +
> +/**
> + * Set an Ethernet device owner.
> + *
> + * @param	port_id
> + *  The identifier of the port to own.
> + * @param	owner
> + *  The owner pointer.
> + * @return
> + *  Negative errno value on error, 0 on success.
> + */
> +int rte_eth_dev_owner_set(const uint16_t port_id,
> +			  const struct rte_eth_dev_owner *owner);
> +
> +/**
> + * Unset Ethernet device owner to make the device ownerless.
> + *
> + * @param	port_id
> + *  The identifier of port to make ownerless.
> + * @param	owner
> + *  The owner identifier.
> + * @return
> + *  0 on success, negative errno value on error.
> + */
> +int rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t owner_id);
> +
> +/**
> + * Remove owner from all Ethernet devices owned by a specific owner.
> + *
> + * @param	owner
> + *  The owner identifier.
> + */
> +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> +
> +/**
> + * Get the owner of an Ethernet device.
> + *
> + * @param	port_id
> + *  The port identifier.
> + * @param	owner
> + *  The owner structure pointer to fill.
> + * @return
> + *  0 on success, negative errno value on error..
> + */
> +int rte_eth_dev_owner_get(const uint16_t port_id,
> +			  struct rte_eth_dev_owner *owner);
> +
> +/**
>   * Get the total number of Ethernet devices that have been successfully
>   * initialized by the matching Ethernet driver during the PCI probing phase
>   * and that are available for applications to use. These devices must be
> diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..5d20b5f 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -198,6 +198,18 @@ DPDK_17.11 {
> 
>  } DPDK_17.08;
> 
> +DPDK_18.02 {
> +	global:
> +
> +	rte_eth_dev_owner_delete;
> +	rte_eth_dev_owner_get;
> +	rte_eth_dev_owner_new;
> +	rte_eth_dev_owner_set;
> +	rte_eth_dev_owner_unset;
> +	rte_eth_find_next_owned_by;
> +
> +} DPDK_17.11;
> +
>  EXPERIMENTAL {
>  	global:
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-10 13:36     ` Ananyev, Konstantin
@ 2018-01-10 16:58       ` Matan Azrad
  2018-01-11 12:40         ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-10 16:58 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> Hi Matan,
> 
> Few comments from me below.
> BTW, do you plan to add ownership mandatory check in control path
> functions that change port configuration?

No.


> Konstantin
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Sunday, January 7, 2018 9:46 AM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Subject: [PATCH v2 2/6] ethdev: add port ownership
> >
> > The ownership of a port is implicit in DPDK.
> > Making it explicit is better from the next reasons:
> > 1. It will define well who is in charge of the port usage synchronization.
> > 2. A library could work on top of a port.
> > 3. A port can work on top of another port.
> >
> > Also in the fail-safe case, an issue has been met in testpmd.
> > We need to check that the application is not trying to use a port
> > which is already managed by fail-safe.
> >
> > A port owner is built from owner id(number) and owner name(string)
> > while the owner id must be unique to distinguish between two identical
> > entity instances and the owner name can be any name.
> > The name helps to logically recognize the owner by different DPDK
> > entities and allows easy debug.
> > Each DPDK entity can allocate an owner unique identifier and can use
> > it and its preferred name to owns valid ethdev ports.
> > Each DPDK entity can get any port owner status to decide if it can
> > manage the port or not.
> >
> > The mechanism is synchronized for both the primary process threads and
> > the secondary processes threads to allow secondary process entity to
> > be a port owner.
> >
> > Add a synchronized ownership mechanism to DPDK Ethernet devices to
> > avoid multiple management of a device by different DPDK entities.
> >
> > The current ethdev internal port management is not affected by this
> > feature.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
> >  lib/librte_ether/rte_ethdev.c           | 206 ++++++++++++++++++++++++++++++--
> >  lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
> >  lib/librte_ether/rte_ethdev_version.map |  12 ++
> >  4 files changed, 311 insertions(+), 10 deletions(-)
> 
> 
> >
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 684e3e8..0e12452 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -70,7 +70,10 @@
> >
> >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> >  struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > +/* ports data array stored in shared memory */
> >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > +/* next owner identifier stored in shared memory */
> > +static uint16_t *rte_eth_next_owner_id;
> >  static uint8_t eth_dev_last_created_port;
> >
> >  /* spinlock for eth device callbacks */
> > @@ -82,6 +85,9 @@
> >  /* spinlock for add/remove tx callbacks */
> >  static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> >
> > +/* spinlock for eth device ownership management stored in shared memory */
> > +static rte_spinlock_t *rte_eth_dev_ownership_lock;
> > +
> >  /* store statistics names and its offset in stats structure  */
> >  struct rte_eth_xstats_name_off {
> >  	char name[RTE_ETH_XSTATS_NAME_SIZE];
> > @@ -153,14 +159,18 @@ enum {
> >  }
> >
> >  static void
> > -rte_eth_dev_data_alloc(void)
> > +rte_eth_dev_share_data_alloc(void)
> >  {
> >  	const unsigned flags = 0;
> >  	const struct rte_memzone *mz;
> > +	const unsigned int data_size = RTE_MAX_ETHPORTS *
> > +						sizeof(*rte_eth_dev_data);
> >
> >  	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		/* Allocate shared memory for port data and ownership */
> >  		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> > +				data_size + sizeof(*rte_eth_next_owner_id) +
> > +				sizeof(*rte_eth_dev_ownership_lock),
> >  				rte_socket_id(), flags);
> >  	} else
> >  		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > @@ -168,9 +178,17 @@ enum {
> >  		rte_panic("Cannot allocate memzone for ethernet port data\n");
> >
> >  	rte_eth_dev_data = mz->addr;
> > -	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > -		memset(rte_eth_dev_data, 0,
> > -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
> > +	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
> > +					     data_size);
> > +	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
> > +		((uintptr_t)rte_eth_next_owner_id +
> > +		 sizeof(*rte_eth_next_owner_id));
> 
> 
> I think that might make  rte_eth_dev_ownership_lock location not 4B
> aligned...

Where can I find the documentation about it?

> Why just not to put all data that you are trying to allocate as one chunck into
> the same struct:
> static struct {
>         uint16_t next_owner_id;
>         /* spinlock for eth device ownership management stored in shared
> memory */
>         rte_spinlock_t dev_ownership_lock;
>         rte_eth_dev_data *data;
> } rte_eth_dev_data;
> and allocate/use it everywhere?
> That would simplify allocation/management stuff.
>
I don't understand what exactly you mean.
If you mean to group everything in one struct like:

static struct {
        uint16_t next_owner_id;
        rte_spinlock_t dev_ownership_lock;
        rte_eth_dev_data  data[];
} rte_eth_dev_share_data;

just to simplify the address calculation above,
it will change more code in ethdev relative to the old rte_eth_dev_data global array and will be more intrusive.
Leaving it as is keeps the change focused here.

I can just move the spinlock to the beginning of the memzone (to be sure about the alignment).
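
A sketch of how such a "lock first" layout could be computed, assuming the
memzone base address is aligned at least as strictly as the largest member.
align_up(), struct zone_layout and layout_zone() are hypothetical helpers
for illustration, not existing ethdev code:

```c
#include <stddef.h>

/* Round 'off' up to the next multiple of 'align' (a power of two). */
static size_t align_up(size_t off, size_t align)
{
	return (off + align - 1) & ~(align - 1);
}

struct zone_layout { size_t lock_off, id_off, data_off, total; };

/* Lock first, then the owner-id counter, then the port data array.
 * Each member starts at the next offset that satisfies its alignment. */
static struct zone_layout
layout_zone(size_t lock_size, size_t lock_align,
	    size_t id_size, size_t id_align,
	    size_t data_size, size_t data_align)
{
	struct zone_layout l;

	l.lock_off = align_up(0, lock_align);
	l.id_off = align_up(l.lock_off + lock_size, id_align);
	l.data_off = align_up(l.id_off + id_size, data_align);
	l.total = l.data_off + data_size;
	return l;
}
```

Moving the lock to the front fixes its own alignment, but the port data
array then no longer starts at the memzone base, so its offset has to be
rounded up as well.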
 
> It is good to see that now scanning/updating rte_eth_dev_data[] is lock
> protected, but it might be not very plausible to protect both data[] and
> next_owner_id using the same lock.

I guess you mean the owner structure in rte_eth_dev_data[port_id].
The next_owner_id is read by the ownership APIs (for owner validation), so it makes sense to use the same lock.
Actually, why not?

> In fact, for next_owner_id, you don't need a lock - just rte_atomic_t should
> be enough.

I don't think so; it is problematic on next_owner_id wraparound and may complicate the code in other places which read it.
Why not just keep it simple and use the same lock?

> Another alternative would be to use 2 locks - one for next_owner_id second
> for actual data[] protection.
> 
> Another thing - you'll probably need to grab/release a lock inside
> rte_eth_dev_allocated() too.
> It is a public function used by drivers, so need to be protected too.
> 

Yes, I thought about it, but decided not to take the lock in the following functions:
rte_eth_dev_allocated
rte_eth_dev_count
rte_eth_dev_get_name_by_port
rte_eth_dev_get_port_by_name
maybe more...

Don't you think it is just timing dependent? (Ask a moment later and you may get a different answer.) I don't see a possible crash.

> > +
> > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +		memset(rte_eth_dev_data, 0, data_size);
> > +		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > +		rte_spinlock_init(rte_eth_dev_ownership_lock);
> > +	}
> >  }
> >
> >  struct rte_eth_dev *
> > @@ -225,7 +243,7 @@ struct rte_eth_dev *
> >  	}
> >
> >  	if (rte_eth_dev_data == NULL)
> > -		rte_eth_dev_data_alloc();
> > +		rte_eth_dev_share_data_alloc();
> >
> >  	if (rte_eth_dev_allocated(name) != NULL) {
> >  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
> > @@ -253,7 +271,7 @@ struct rte_eth_dev *
> >  	struct rte_eth_dev *eth_dev;
> >
> >  	if (rte_eth_dev_data == NULL)
> > -		rte_eth_dev_data_alloc();
> > +		rte_eth_dev_share_data_alloc();
> >
> >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> >  		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> > @@ -278,8 +296,12 @@ struct rte_eth_dev *
> >  	if (eth_dev == NULL)
> >  		return -EINVAL;
> >
> > -	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > +
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> >  	return 0;
> >  }
> >
> > @@ -294,6 +316,174 @@ struct rte_eth_dev *
> >  		return 1;
> >  }
> >
> > +static int
> > +rte_eth_is_valid_owner_id(uint16_t owner_id)
> > +{
> > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > +	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
> > +	     *rte_eth_next_owner_id <= owner_id)) {
> > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > +		return 0;
> > +	}
> > +	return 1;
> > +}
> > +
> > +uint16_t
> > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > +{
> > +	while (port_id < RTE_MAX_ETHPORTS &&
> > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > +		port_id++;
> > +
> > +	if (port_id >= RTE_MAX_ETHPORTS)
> > +		return RTE_MAX_ETHPORTS;
> > +
> > +	return port_id;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_new(uint16_t *owner_id)
> > +{
> > +	int ret = 0;
> > +
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> > +	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > +		/* Counter wrap around. */
> > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > +		ret = -EUSERS;
> > +	} else {
> > +		*owner_id = (*rte_eth_next_owner_id)++;
> > +	}
> > +
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > +	return ret;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_set(const uint16_t port_id,
> > +		      const struct rte_eth_dev_owner *owner)
> 
> As a nit - if you'll have rte_eth_dev_owner_set(port_id, old_owner,
> new_owner)
> - that might be more plausible for user, and would greatly simplify unset()
> part:
> just set(port_id, cur_owner, zero_owner);
> 

How should the user know the old owner?

> > +{
> > +	struct rte_eth_dev_owner *port_owner;
> > +	int ret = 0;
> > +	int sret;
> > +
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		ret = -ENODEV;
> > +		goto unlock;
> > +	}
> > +
> > +	if (!rte_eth_is_valid_owner_id(owner->id)) {
> > +		ret = -EINVAL;
> > +		goto unlock;
> > +	}
> > +
> > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > +	    port_owner->id != owner->id) {
> > +		RTE_LOG(ERR, EAL,
> > +			"Cannot set owner to port %d already owned by
> %s_%05d.\n",
> > +			port_id, port_owner->name, port_owner->id);
> > +		ret = -EPERM;
> > +		goto unlock;
> > +	}
> > +
> > +	sret = snprintf(port_owner->name,
> RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > +			owner->name);
> > +	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> 
> Personally, I don't see any reason to fail if description was truncated...
> Another alternative - just use rte_malloc() here to allocate big enough buffer
> to hold the description.
> 

But it is a static allocation, like the device name. Why allocate it differently?
 
> > +		memset(port_owner->name, 0,
> RTE_ETH_MAX_OWNER_NAME_LEN);
> > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > +		ret = -EINVAL;
> > +		goto unlock;
> > +	}
> > +
> > +	port_owner->id = owner->id;
> > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > +			    owner->name, owner->id);
> > +
> 
> As another nit - you can avoid all these gotos by restructuring code a bit:
> 
> rte_eth_dev_owner_set(const uint16_t port_id, const struct
> rte_eth_dev_owner *owner) {
>     rte_spinlock_lock(...);
>     ret = _eth_dev_owner_set_unlocked(port_id, owner);
>     rte_spinlock_unlock(...);
>     return ret;
> }
> 
Don't you like gotos? :)
I personally use them only in error/performance scenarios.
Do you think it is worth the effort?

> 
> > +unlock:
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > +	return ret;
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t
> > +owner_id) {
> > +	struct rte_eth_dev_owner *port_owner;
> > +	int ret = 0;
> > +
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		ret = -ENODEV;
> > +		goto unlock;
> > +	}
> > +
> > +	if (!rte_eth_is_valid_owner_id(owner_id)) {
> > +		ret = -EINVAL;
> > +		goto unlock;
> > +	}
> > +
> > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > +	if (port_owner->id != owner_id) {
> > +		RTE_LOG(ERR, EAL, "Cannot unset port %d owner (%s_%05d)
> by"
> > +			" a different owner with id %5d.\n", port_id,
> > +			port_owner->name, port_owner->id, owner_id);
> > +		ret = -EPERM;
> > +		goto unlock;
> > +	}
> > +	RTE_PMD_DEBUG_TRACE("Port %d owner %s_%05d has
> removed.\n", port_id,
> > +			    port_owner->name, port_owner->id);
> > +
> > +	memset(port_owner, 0, sizeof(struct rte_eth_dev_owner));
> > +
> > +unlock:
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > +	return ret;
> > +}
> > +
> > +void
> > +rte_eth_dev_owner_delete(const uint16_t owner_id) {
> > +	uint16_t port_id;
> > +
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> > +	if (rte_eth_is_valid_owner_id(owner_id)) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
> > +			memset(&rte_eth_devices[port_id].data->owner, 0,
> > +			       sizeof(struct rte_eth_dev_owner));
> > +		RTE_PMD_DEBUG_TRACE("All port owners owned by %05d
> identifier"
> > +				    " have removed.\n", owner_id);
> > +	}
> > +
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > +}
> > +
> > +int
> > +rte_eth_dev_owner_get(const uint16_t port_id, struct
> > +rte_eth_dev_owner *owner) {
> > +	int ret = 0;
> > +
> > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > +
> > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > +		ret = -ENODEV;
> > +	} else {
> > +		rte_memcpy(owner, &rte_eth_devices[port_id].data-
> >owner,
> > +			   sizeof(*owner));
> > +	}
> > +
> > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > +	return ret;
> > +}
> > +
> >  int
> >  rte_eth_dev_socket_id(uint16_t port_id)  { diff --git
> > a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index
> > 57b61ed..88ad765 100644
> > --- a/lib/librte_ether/rte_ethdev.h
> > +++ b/lib/librte_ether/rte_ethdev.h
> > @@ -1760,6 +1760,15 @@ struct rte_eth_dev_sriov {
> >
> >  #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
> >
> > +#define RTE_ETH_DEV_NO_OWNER 0
> > +
> > +#define RTE_ETH_MAX_OWNER_NAME_LEN 64
> > +
> > +struct rte_eth_dev_owner {
> > +	uint16_t id; /**< The owner unique identifier. */
> 
> Why limit yourself to 16bit here?
> Why not uint32_t/uint64_t - or even uuid_t and make system library to
> generate it for you?
> Wouldn't need to worry about overflows then.
> 

Interesting.
I will change it and remove the overflow code from next_id!
(I just didn't think of a realistic use case with a lot of owners, so I took the same type as the port ID.)

> > +	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner
> name. */ };
> > +
> >  /**
> >   * @internal
> >   * The data part, with no function pointers, associated with each ethernet
> device.
> > @@ -1810,6 +1819,7 @@ struct rte_eth_dev_data {
> >  	int numa_node;  /**< NUMA node connection */
> >  	struct rte_vlan_filter_conf vlan_filter_conf;
> >  	/**< VLAN filter configuration. */
> > +	struct rte_eth_dev_owner owner; /**< The port owner. */
> >  };
> >
> >  /** Device supports link state interrupt */ @@ -1846,6 +1856,85 @@
> > struct rte_eth_dev_data {
> >
> >
> >  /**
> > + * Iterates over valid ethdev ports owned by a specific owner.
> > + *
> > + * @param port_id
> > + *   The id of the next possible valid owned port.
> > + * @param	owner_id
> > + *  The owner identifier.
> > + *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless
> ports.
> > + * @return
> > + *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is
> none.
> > + */
> > +uint16_t rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > +owner_id);
> > +
> > +/**
> > + * Macro to iterate over all enabled ethdev ports owned by a specific
> owner.
> > + */
> > +#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
> > +	for (p = rte_eth_find_next_owned_by(0, o); \
> > +	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
> > +	     p = rte_eth_find_next_owned_by(p + 1, o))
> > +
> > +/**
> > + * Get a new unique owner identifier.
> > + * An owner identifier is used to owns Ethernet devices by only one
> > +DPDK entity
> > + * to avoid multiple management of device by different entities.
> > + *
> > + * @param	owner_id
> > + *   Owner identifier pointer.
> > + * @return
> > + *   Negative errno value on error, 0 on success.
> > + */
> > +int rte_eth_dev_owner_new(uint16_t *owner_id);
> > +
> > +/**
> > + * Set an Ethernet device owner.
> > + *
> > + * @param	port_id
> > + *  The identifier of the port to own.
> > + * @param	owner
> > + *  The owner pointer.
> > + * @return
> > + *  Negative errno value on error, 0 on success.
> > + */
> > +int rte_eth_dev_owner_set(const uint16_t port_id,
> > +			  const struct rte_eth_dev_owner *owner);
> > +
> > +/**
> > + * Unset Ethernet device owner to make the device ownerless.
> > + *
> > + * @param	port_id
> > + *  The identifier of port to make ownerless.
> > + * @param	owner
> > + *  The owner identifier.
> > + * @return
> > + *  0 on success, negative errno value on error.
> > + */
> > +int rte_eth_dev_owner_unset(const uint16_t port_id, const uint16_t
> > +owner_id);
> > +
> > +/**
> > + * Remove owner from all Ethernet devices owned by a specific owner.
> > + *
> > + * @param	owner
> > + *  The owner identifier.
> > + */
> > +void rte_eth_dev_owner_delete(const uint16_t owner_id);
> > +
> > +/**
> > + * Get the owner of an Ethernet device.
> > + *
> > + * @param	port_id
> > + *  The port identifier.
> > + * @param	owner
> > + *  The owner structure pointer to fill.
> > + * @return
> > + *  0 on success, negative errno value on error..
> > + */
> > +int rte_eth_dev_owner_get(const uint16_t port_id,
> > +			  struct rte_eth_dev_owner *owner);
> > +
> > +/**
> >   * Get the total number of Ethernet devices that have been successfully
> >   * initialized by the matching Ethernet driver during the PCI probing phase
> >   * and that are available for applications to use. These devices must
> > be diff --git a/lib/librte_ether/rte_ethdev_version.map
> > b/lib/librte_ether/rte_ethdev_version.map
> > index e9681ac..5d20b5f 100644
> > --- a/lib/librte_ether/rte_ethdev_version.map
> > +++ b/lib/librte_ether/rte_ethdev_version.map
> > @@ -198,6 +198,18 @@ DPDK_17.11 {
> >
> >  } DPDK_17.08;
> >
> > +DPDK_18.02 {
> > +	global:
> > +
> > +	rte_eth_dev_owner_delete;
> > +	rte_eth_dev_owner_get;
> > +	rte_eth_dev_owner_new;
> > +	rte_eth_dev_owner_set;
> > +	rte_eth_dev_owner_unset;
> > +	rte_eth_find_next_owned_by;
> > +
> > +} DPDK_17.11;
> > +
> >  EXPERIMENTAL {
> >  	global:
> >
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-10 16:58       ` Matan Azrad
@ 2018-01-11 12:40         ` Ananyev, Konstantin
  2018-01-11 14:51           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-11 12:40 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Matan,

> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > Hi Matan,
> >
> > Few comments from me below.
> > BTW, do you plan to add ownership mandatory check in control path
> > functions that change port configuration?
> 
> No.

So usage is still totally voluntary, and an application needs to be changed
to exploit it?
Apart from the RTE_FOR_EACH_DEV() change proposed by Gaetan?

> 
> 
> > Konstantin
> >
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Sunday, January 7, 2018 9:46 AM
> > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Subject: [PATCH v2 2/6] ethdev: add port ownership
> > >
> > > The ownership of a port is implicit in DPDK.
> > > Making it explicit is better from the next reasons:
> > > 1. It will define well who is in charge of the port usage synchronization.
> > > 2. A library could work on top of a port.
> > > 3. A port can work on top of another port.
> > >
> > > Also in the fail-safe case, an issue has been met in testpmd.
> > > We need to check that the application is not trying to use a port
> > > which is already managed by fail-safe.
> > >
> > > A port owner is built from owner id(number) and owner name(string)
> > > while the owner id must be unique to distinguish between two identical
> > > entity instances and the owner name can be any name.
> > > The name helps to logically recognize the owner by different DPDK
> > > entities and allows easy debug.
> > > Each DPDK entity can allocate an owner unique identifier and can use
> > > it and its preferred name to owns valid ethdev ports.
> > > Each DPDK entity can get any port owner status to decide if it can
> > > manage the port or not.
> > >
> > > The mechanism is synchronized for both the primary process threads and
> > > the secondary processes threads to allow secondary process entity to
> > > be a port owner.
> > >
> > > Add a sinchronized ownership mechanism to DPDK Ethernet devices to
> > > avoid multiple management of a device by different DPDK entities.
> > >
> > > The current ethdev internal port management is not affected by this
> > > feature.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
> > >  lib/librte_ether/rte_ethdev.c           | 206
> > ++++++++++++++++++++++++++++++--
> > >  lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
> > >  lib/librte_ether/rte_ethdev_version.map |  12 ++
> > >  4 files changed, 311 insertions(+), 10 deletions(-)
> >
> >
> > >
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index 684e3e8..0e12452 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -70,7 +70,10 @@
> > >
> > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";  struct
> > > rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > +/* ports data array stored in shared memory */
> > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > +/* next owner identifier stored in shared memory */ static uint16_t
> > > +*rte_eth_next_owner_id;
> > >  static uint8_t eth_dev_last_created_port;
> > >
> > >  /* spinlock for eth device callbacks */ @@ -82,6 +85,9 @@
> > >  /* spinlock for add/remove tx callbacks */  static rte_spinlock_t
> > > rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> > >
> > > +/* spinlock for eth device ownership management stored in shared
> > > +memory */ static rte_spinlock_t *rte_eth_dev_ownership_lock;
> > > +
> > >  /* store statistics names and its offset in stats structure  */
> > > struct rte_eth_xstats_name_off {
> > >  	char name[RTE_ETH_XSTATS_NAME_SIZE]; @@ -153,14 +159,18 @@
> > enum {  }
> > >
> > >  static void
> > > -rte_eth_dev_data_alloc(void)
> > > +rte_eth_dev_share_data_alloc(void)
> > >  {
> > >  	const unsigned flags = 0;
> > >  	const struct rte_memzone *mz;
> > > +	const unsigned int data_size = RTE_MAX_ETHPORTS *
> > > +						sizeof(*rte_eth_dev_data);
> > >
> > >  	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > +		/* Allocate shared memory for port data and ownership */
> > >  		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > > -				RTE_MAX_ETHPORTS *
> > sizeof(*rte_eth_dev_data),
> > > +				data_size + sizeof(*rte_eth_next_owner_id)
> > +
> > > +				sizeof(*rte_eth_dev_ownership_lock),
> > >  				rte_socket_id(), flags);
> > >  	} else
> > >  		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > > @@ -168,9 +178,17 @@ enum {
> > >  		rte_panic("Cannot allocate memzone for ethernet port
> > data\n");
> > >
> > >  	rte_eth_dev_data = mz->addr;
> > > -	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > > -		memset(rte_eth_dev_data, 0,
> > > -				RTE_MAX_ETHPORTS *
> > sizeof(*rte_eth_dev_data));
> > > +	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
> > > +					     data_size);
> > > +	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
> > > +		((uintptr_t)rte_eth_next_owner_id +
> > > +		 sizeof(*rte_eth_next_owner_id));
> >
> >
> > I think that might make  rte_eth_dev_ownership_lock location not 4B
> > aligned...
> 
> Where can I find the documentation about it?

That's in your code above - data_size and mz->addr are both at least 4B aligned, so
rte_eth_dev_ownership_lock = mz->addr + data_size + 2, which is only 2B aligned.
You can align it manually, but as discussed below it is probably easier to group related
fields into the same struct.
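
The alignment concern above can be reproduced outside DPDK. This is only a minimal sketch: sketch_spinlock_t is a stand-in that assumes the spinlock wraps a 32-bit int, and all sketch_* names and the example data_size are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the spinlock type (assumption: it wraps one int). */
typedef struct {
	volatile int locked;
} sketch_spinlock_t;

/*
 * Manual layout as computed in the patch:
 * [ port data | uint16_t next_owner_id | spinlock ]
 * Returns the byte offset of the spinlock from the memzone start.
 */
static size_t
manual_lock_offset(size_t data_size)
{
	return data_size + sizeof(uint16_t);
}

/* Grouped layout: the compiler inserts padding after next_owner_id. */
struct sketch_shared_data {
	uint16_t next_owner_id;
	sketch_spinlock_t ownership_lock;
};

/* Returns 1 if offset is a multiple of align, 0 otherwise. */
static int
is_aligned(size_t offset, size_t align)
{
	return (offset % align) == 0;
}
```

With a 4B-aligned data_size the manual offset ends in +2, while offsetof() on the grouped struct lands on a properly aligned boundary without any hand-written arithmetic.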

> 
> > Why just not to put all data that you are trying to allocate as one chunck into
> > the same struct:
> > static struct {
> >         uint16_t next_owner_id;
> >         /* spinlock for eth device ownership management stored in shared
> > memory */
> >         rte_spinlock_t dev_ownership_lock;
> >         rte_eth_dev_data *data;
> > } rte_eth_dev_data;
> > and allocate/use it everywhere?
> > That would simplify allocation/management stuff.
> >
> I don't understand what exactly do you mean. ?
> If you mean to group all in one struct like:
> 
> static struct {
>         uint16_t next_owner_id;
>         rte_spinlock_t dev_ownership_lock;
>         rte_eth_dev_data  data[];
> } rte_eth_dev_share_data;
> 
> Just to simplify the addresses calculation above,

Yep, that's exactly what I meant.
As you said, it would help with bulk allocation/alignment stuff, plus
IMO it is better and easier to group several related globals together -
it improves code quality and makes the code easier to read & maintain in future.

> It will change more code in ethdev relative to the old rte_eth_dev_data global array and will be more intrusive.
> Stay it as is, focuses the change only here.

Yes, it would require a few more changes, though I think it is worth it.

> 
> I can just move the spinlock memory allocation to be at the beginning of the memzone(to be sure about the alignment).
> 
> > It is good to see that now scanning/updating rte_eth_dev_data[] is lock
> > protected, but it might be not very plausible to protect both data[] and
> > next_owner_id using the same lock.
> 
> I guess you mean to the owner structure in rte_eth_dev_data[port_id].
> The next_owner_id is read by ownership APIs(for owner validation), so it makes sense to use the same lock.
> Actually, why not?

Well, to me next_owner_id and rte_eth_dev_data[] are not directly related.
You may create a new owner_id, but that doesn't mean you would update rte_eth_dev_data[] immediately.
And vice versa - you might just want to update rte_eth_dev_data[].name or .owner_id.
It is not very good coding practice to use the same lock for unrelated data structures.

> 
> > In fact, for next_owner_id, you don't need a lock - just rte_atomic_t should
> > be enough.
> 
> I don't think so, it is problematic in next_owner_id wraparound and may complicate the code in other places which read it.

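IMO it is not that complicated; something like this should work, I think.

/* points to a counter that is initialized to 0 at startup */
rte_atomic32_t *owner_id;

int new_owner_id(void)
{
    int32_t x;

    /* owner_id is already a pointer, so pass it as is */
    x = rte_atomic32_add_return(owner_id, 1);
    if (x > UINT16_MAX) {
        /* undo the increment so the counter does not keep growing */
        rte_atomic32_dec(owner_id);
        return -EOVERFLOW;
    }
    return x;
}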
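
For comparison, the same lock-free allocation idea can be sketched with plain C11 atomics instead of the DPDK atomic API. This is a hedged illustration only: the sketch_* names are hypothetical, and the compare-and-swap loop is a substitute for the add-then-decrement approach, chosen so the counter never moves past the limit.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0u

/* Next identifier to hand out; 0 is reserved for "no owner". */
static _Atomic uint32_t sketch_next_owner_id = SKETCH_NO_OWNER + 1;

/*
 * Allocate a new owner identifier without taking a lock.
 * Returns 0 and stores the identifier on success,
 * -EOVERFLOW once all 16-bit identifiers have been handed out.
 */
static int
sketch_owner_new(uint16_t *owner_id)
{
	uint32_t cur = atomic_load(&sketch_next_owner_id);

	do {
		if (cur > UINT16_MAX)
			return -EOVERFLOW;
		/* On failure, cur is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&sketch_next_owner_id,
					       &cur, cur + 1));

	*owner_id = (uint16_t)cur;
	return 0;
}
```

A caller would do something like `uint16_t id; if (sketch_owner_new(&id) == 0) ...`; on overflow the output argument is left untouched.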


> Why not just to keep it simple and using the same lock?

A lock is also fine, I just think it had better be a separate one that would protect just next_owner_id.
Though if you are going to use uuid here, all that is probably not relevant any more.

> 
> > Another alternative would be to use 2 locks - one for next_owner_id second
> > for actual data[] protection.
> >
> > Another thing - you'll probably need to grab/release a lock inside
> > rte_eth_dev_allocated() too.
> > It is a public function used by drivers, so need to be protected too.
> >
> 
> Yes, I thought about it, but decided not to use lock in next:
> rte_eth_dev_allocated
> rte_eth_dev_count
> rte_eth_dev_get_name_by_port
> rte_eth_dev_get_port_by_name
> maybe more...

As I can see, in patch #3 you protect access to rte_eth_dev_data[].name by a lock
(which seems like a good thing).
So I think any other public function that accesses rte_eth_dev_data[].name should be
protected by the same lock.

> 
> Don't you think it is just timing depended?(ask in the next moment and you may get another answer) I don't see optional crash.
> 
> > > +
> > > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > +		memset(rte_eth_dev_data, 0, data_size);
> > > +		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > > +		rte_spinlock_init(rte_eth_dev_ownership_lock);
> > > +	}
> > >  }
> > >
> > >  struct rte_eth_dev *
> > > @@ -225,7 +243,7 @@ struct rte_eth_dev *
> > >  	}
> > >
> > >  	if (rte_eth_dev_data == NULL)
> > > -		rte_eth_dev_data_alloc();
> > > +		rte_eth_dev_share_data_alloc();
> > >
> > >  	if (rte_eth_dev_allocated(name) != NULL) {
> > >  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s
> > already
> > > allocated!\n", @@ -253,7 +271,7 @@ struct rte_eth_dev *
> > >  	struct rte_eth_dev *eth_dev;
> > >
> > >  	if (rte_eth_dev_data == NULL)
> > > -		rte_eth_dev_data_alloc();
> > > +		rte_eth_dev_share_data_alloc();
> > >
> > >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > >  		if (strcmp(rte_eth_dev_data[i].name, name) == 0) @@ -
> > 278,8 +296,12
> > > @@ struct rte_eth_dev *
> > >  	if (eth_dev == NULL)
> > >  		return -EINVAL;
> > >
> > > -	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > +
> > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > +
> > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > >  	return 0;
> > >  }
> > >
> > > @@ -294,6 +316,174 @@ struct rte_eth_dev *
> > >  		return 1;
> > >  }
> > >
> > > +static int
> > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > +	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
> > > +	     *rte_eth_next_owner_id <= owner_id)) {
> > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > +		return 0;
> > > +	}
> > > +	return 1;
> > > +}
> > > +
> > > +uint16_t
> > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t
> > owner_id)
> > > +{
> > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > +		port_id++;
> > > +
> > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > +		return RTE_MAX_ETHPORTS;
> > > +
> > > +	return port_id;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > +	int ret = 0;
> > > +
> > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > +
> > > +	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > +		/* Counter wrap around. */
> > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of
> > Ethernet port owners.\n");
> > > +		ret = -EUSERS;
> > > +	} else {
> > > +		*owner_id = (*rte_eth_next_owner_id)++;
> > > +	}
> > > +
> > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > > +	return ret;
> > > +}
> > > +
> > > +int
> > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > +		      const struct rte_eth_dev_owner *owner)
> >
> > As a nit - if you'll have rte_eth_dev_owner_set(port_id, old_owner,
> > new_owner)
> > - that might be more plausible for user, and would greatly simplify unset()
> > part:
> > just set(port_id, cur_owner, zero_owner);
> >
> 
> How the user should know the old owner?

By dev_owner_get() or it might have it stored somewhere already
(or constructed on the fly in case of NO_OWNER).
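
A minimal sketch of what such a set(port_id, old_owner, new_owner) call could look like, assuming compare-and-set semantics. The sketch_* names and the simplified owner struct are hypothetical, and the caller is assumed to hold the ownership lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NO_OWNER 0
#define SKETCH_NAME_LEN 64

/* Simplified copy of the owner structure from the patch. */
struct sketch_owner {
	uint16_t id;
	char name[SKETCH_NAME_LEN];
};

/*
 * Replace the port owner with new_owner only if the port is
 * currently owned by old_owner. Passing a zeroed new_owner
 * releases the port, so no separate unset() is needed.
 * Assumption: the caller holds the ownership lock.
 */
static int
sketch_owner_set(struct sketch_owner *port_owner,
		 const struct sketch_owner *old_owner,
		 const struct sketch_owner *new_owner)
{
	if (port_owner->id != old_owner->id)
		return -EPERM;

	*port_owner = *new_owner;
	return 0;
}
```

Taking ownership is then set(port, no_owner, me) and releasing it is set(port, me, no_owner); both fail with -EPERM if somebody else owns the port in the meantime.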

> 
> > > +{
> > > +	struct rte_eth_dev_owner *port_owner;
> > > +	int ret = 0;
> > > +	int sret;
> > > +
> > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > +
> > > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > +		ret = -ENODEV;
> > > +		goto unlock;
> > > +	}
> > > +
> > > +	if (!rte_eth_is_valid_owner_id(owner->id)) {
> > > +		ret = -EINVAL;
> > > +		goto unlock;
> > > +	}
> > > +
> > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > +	    port_owner->id != owner->id) {
> > > +		RTE_LOG(ERR, EAL,
> > > +			"Cannot set owner to port %d already owned by
> > %s_%05d.\n",
> > > +			port_id, port_owner->name, port_owner->id);
> > > +		ret = -EPERM;
> > > +		goto unlock;
> > > +	}
> > > +
> > > +	sret = snprintf(port_owner->name,
> > RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > +			owner->name);
> > > +	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> >
> > Personally, I don't see any reason to fail if description was truncated...
> > Another alternative - just use rte_malloc() here to allocate big enough buffer
> > to hold the description.
> >
> 
> But it is static allocation like in the device name, why to allocate it differently?

Static allocation is fine by me - I just said there is probably no need to fail
if the description provided by the user gets truncated in that case.
Though if the user description is *that* important - rte_malloc() can help here.
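
As a hedged sketch of the "truncate instead of fail" option: snprintf() always NUL-terminates within the given size, so the copy can simply report truncation and carry on. The helper name and the deliberately small buffer length are hypothetical, chosen only to make truncation easy to see.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Deliberately small so that truncation is easy to trigger. */
#define SKETCH_NAME_LEN 8

/*
 * Copy an owner name into a fixed-size buffer.
 * Returns 0 if the name fit, 1 if it was truncated (the buffer
 * still holds a valid NUL-terminated prefix), -1 on output error.
 */
static int
sketch_copy_owner_name(char dst[SKETCH_NAME_LEN], const char *src)
{
	int n = snprintf(dst, SKETCH_NAME_LEN, "%s", src);

	if (n < 0) {
		dst[0] = '\0';
		return -1;
	}
	return n >= SKETCH_NAME_LEN;
}
```

The caller can log a warning when 1 is returned and still proceed with setting the owner.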

> 
> > > +		memset(port_owner->name, 0,
> > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > +		ret = -EINVAL;
> > > +		goto unlock;
> > > +	}
> > > +
> > > +	port_owner->id = owner->id;
> > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > +			    owner->name, owner->id);
> > > +
> >
> > As another nit - you can avoid all these gotos by restructuring code a bit:
> >
> > rte_eth_dev_owner_set(const uint16_t port_id, const struct
> > rte_eth_dev_owner *owner) {
> >     rte_spinlock_lock(...);
> >     ret = _eth_dev_owner_set_unlocked(port_id, owner);
> >     rte_spinlock_unlock(...);
> >     return ret;
> > }
> >
> Don't you like gotos? :)

Not really :)

> I personally use it only in error\performance scenarios.

Same here - prefer to avoid them if possible.

> Do you think it worth the effort?

IMO - yes, well-structured code is much easier to understand and maintain.
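
To make the restructuring concrete, here is a minimal self-contained sketch of the lock wrapper pattern. It uses a C11 atomic_flag as a stand-in for the DPDK spinlock and tracks a single port; every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the ownership spinlock. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;
/* Owner of the single port in this sketch; 0 means no owner. */
static uint16_t sketch_port_owner_id;

static void
sketch_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void
sketch_lock_release(void)
{
	atomic_flag_clear_explicit(&sketch_lock, memory_order_release);
}

/* All checks live here; early returns are safe, no unlock needed. */
static int
sketch_owner_set_unlocked(uint16_t owner_id)
{
	if (owner_id == 0)
		return -EINVAL;
	if (sketch_port_owner_id != 0 && sketch_port_owner_id != owner_id)
		return -EPERM;

	sketch_port_owner_id = owner_id;
	return 0;
}

/* Public entry point: lock, delegate, unlock. No gotos. */
static int
sketch_owner_set(uint16_t owner_id)
{
	int ret;

	sketch_lock_acquire();
	ret = sketch_owner_set_unlocked(owner_id);
	sketch_lock_release();
	return ret;
}
```

The lock has exactly one acquire and one release site, so an error path cannot leave it held.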
Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-11 12:40         ` Ananyev, Konstantin
@ 2018-01-11 14:51           ` Matan Azrad
  2018-01-12  0:02             ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-11 14:51 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> Hi Matan,
> 
> >
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > Hi Matan,
> > >
> > > Few comments from me below.
> > > BTW, do you plan to add ownership mandatory check in control path
> > > functions that change port configuration?
> >
> > No.
> 
> So it still totally voluntary usage and application nneds to be changed to
> exploit it?
> Apart from RTE_FOR_EACH_DEV() change proposed by Gaetan?
> 

Also, the RTE_FOR_EACH_DEV() change proposed by Gaetan does not give protection, because 2 DPDK entities can get the same port while using it.
As I wrote in the log/docs, and as discussed a lot for the first version,
the new synchronization rules are:
1. The port allocation and port release synchronization will be
   managed by ethdev.
2. The port usage synchronization will be managed by the port owner.
3. The port ownership API synchronization (also with port creation) will be managed by ethdev.
4. A DPDK entity which wants to use a port must take ownership first.

Ethdev should not protect 2 and 4 according to these rules.
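
The point that an iterator alone gives no protection can be sketched with a mock port table. This is an illustration under assumptions, not the ethdev code: the sketch_* names and the table contents are made up, and the iterator only mirrors the shape of rte_eth_find_next_owned_by() from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_OWNER 0

/* Mock owner table: ports 1 and 3 belong to owner 7. */
static uint16_t sketch_port_owner[SKETCH_MAX_PORTS] = { 0, 7, 0, 7 };

/*
 * Return the first port at or after port_id that is owned by
 * owner_id, or SKETCH_MAX_PORTS if there is none.
 * It only filters; it does not claim the port for the caller.
 */
static uint16_t
sketch_find_next_owned_by(uint16_t port_id, uint16_t owner_id)
{
	while (port_id < SKETCH_MAX_PORTS &&
	       sketch_port_owner[port_id] != owner_id)
		port_id++;

	return port_id < SKETCH_MAX_PORTS ? port_id : SKETCH_MAX_PORTS;
}

/* Iterate over all ports owned by owner o. */
#define SKETCH_FOREACH_OWNED_BY(p, o) \
	for ((p) = sketch_find_next_owned_by(0, (o)); \
	     (p) < SKETCH_MAX_PORTS; \
	     (p) = sketch_find_next_owned_by((uint16_t)((p) + 1), (o)))
```

Two entities iterating over ownerless ports both see port 0 first, which is why an entity has to take ownership explicitly before using a port.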

> > > Konstantin
> > >
> > > > -----Original Message-----
> > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > Sent: Sunday, January 7, 2018 9:46 AM
> > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> Richardson,
> > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Subject: [PATCH v2 2/6] ethdev: add port ownership
> > > >
> > > > The ownership of a port is implicit in DPDK.
> > > > Making it explicit is better from the next reasons:
> > > > 1. It will define well who is in charge of the port usage synchronization.
> > > > 2. A library could work on top of a port.
> > > > 3. A port can work on top of another port.
> > > >
> > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > We need to check that the application is not trying to use a port
> > > > which is already managed by fail-safe.
> > > >
> > > > A port owner is built from owner id(number) and owner name(string)
> > > > while the owner id must be unique to distinguish between two
> > > > identical entity instances and the owner name can be any name.
> > > > The name helps to logically recognize the owner by different DPDK
> > > > entities and allows easy debug.
> > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > use it and its preferred name to owns valid ethdev ports.
> > > > Each DPDK entity can get any port owner status to decide if it can
> > > > manage the port or not.
> > > >
> > > > The mechanism is synchronized for both the primary process threads
> > > > and the secondary processes threads to allow secondary process
> > > > entity to be a port owner.
> > > >
> > > > Add a sinchronized ownership mechanism to DPDK Ethernet devices to
> > > > avoid multiple management of a device by different DPDK entities.
> > > >
> > > > The current ethdev internal port management is not affected by
> > > > this feature.
> > > >
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > ---
> > > >  doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
> > > >  lib/librte_ether/rte_ethdev.c           | 206
> > > ++++++++++++++++++++++++++++++--
> > > >  lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
> > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++
> > > >  4 files changed, 311 insertions(+), 10 deletions(-)
> > >
> > >
> > > >
> > > >
> > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > b/lib/librte_ether/rte_ethdev.c index 684e3e8..0e12452 100644
> > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > @@ -70,7 +70,10 @@
> > > >
> > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > +/* ports data array stored in shared memory */
> > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > +/* next owner identifier stored in shared memory */ static
> > > > +uint16_t *rte_eth_next_owner_id;
> > > >  static uint8_t eth_dev_last_created_port;
> > > >
> > > >  /* spinlock for eth device callbacks */ @@ -82,6 +85,9 @@
> > > >  /* spinlock for add/remove tx callbacks */  static rte_spinlock_t
> > > > rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> > > >
> > > > +/* spinlock for eth device ownership management stored in shared
> > > > +memory */ static rte_spinlock_t *rte_eth_dev_ownership_lock;
> > > > +
> > > >  /* store statistics names and its offset in stats structure  */
> > > > struct rte_eth_xstats_name_off {
> > > >  	char name[RTE_ETH_XSTATS_NAME_SIZE]; @@ -153,14 +159,18 @@
> > > enum {  }
> > > >
> > > >  static void
> > > > -rte_eth_dev_data_alloc(void)
> > > > +rte_eth_dev_share_data_alloc(void)
> > > >  {
> > > >  	const unsigned flags = 0;
> > > >  	const struct rte_memzone *mz;
> > > > +	const unsigned int data_size = RTE_MAX_ETHPORTS *
> > > > +						sizeof(*rte_eth_dev_data);
> > > >
> > > >  	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > > +		/* Allocate shared memory for port data and ownership */
> > > >  		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > > > -				RTE_MAX_ETHPORTS *
> > > sizeof(*rte_eth_dev_data),
> > > > +				data_size + sizeof(*rte_eth_next_owner_id)
> > > +
> > > > +				sizeof(*rte_eth_dev_ownership_lock),
> > > >  				rte_socket_id(), flags);
> > > >  	} else
> > > >  		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > > > @@ -168,9 +178,17 @@ enum {
> > > >  		rte_panic("Cannot allocate memzone for ethernet port
> > > data\n");
> > > >
> > > >  	rte_eth_dev_data = mz->addr;
> > > > -	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > > > -		memset(rte_eth_dev_data, 0,
> > > > -				RTE_MAX_ETHPORTS *
> > > sizeof(*rte_eth_dev_data));
> > > > +	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
> > > > +					     data_size);
> > > > +	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
> > > > +		((uintptr_t)rte_eth_next_owner_id +
> > > > +		 sizeof(*rte_eth_next_owner_id));
> > >
> > >
> > > I think that might make  rte_eth_dev_ownership_lock location not 4B
> > > aligned...
> >
> > Where can I find the documentation about it?
> 
> That's in your code above - data_size and mz->addr are both at least 4B
> aligned - rte_eth_dev_ownership_lock = mz->addr + data_size + 2; You can
> align it manually, but as discussed below it is probably easier to group related
> fields into the same struct.
> 
I mean the documentation about the alignment required for a spinlock. Where is it?

> >
> > > Why not just put all the data that you are trying to allocate as one
> > > chunk into the same struct:
> > > static struct {
> > >         uint16_t next_owner_id;
> > >         /* spinlock for eth device ownership management stored in
> > > shared memory */
> > >         rte_spinlock_t dev_ownership_lock;
> > >         rte_eth_dev_data *data;
> > > } rte_eth_dev_data;
> > > and allocate/use it everywhere?
> > > That would simplify allocation/management stuff.
> > >
> > I don't understand what exactly do you mean. ?
> > If you mean to group all in one struct like:
> >
> > static struct {
> >         uint16_t next_owner_id;
> >         rte_spinlock_t dev_ownership_lock;
> >         rte_eth_dev_data  data[];
> > } rte_eth_dev_share_data;
> >
> > Just to simplify the addresses calculation above,
> 
> Yep, that's exactly what I meant.
> As you said it would help with bulk allocation/alignment stuff, plus IMO it is
> better and easier to group several related global together - Improve code
> quality, will make it easier to read & maintain in future.
> 
> > It will change more code in ethdev relative to the old rte_eth_dev_data
> > global array and will be more intrusive.
> > Leaving it as is keeps the change focused here.
> 
> Yes, it would require a few more changes, though I think it is worth it.
> 

OK, got it, and I agree.

> >
> > I can just move the spinlock memory allocation to be at the beginning of
> > the memzone (to be sure about the alignment).
> >
> > > It is good to see that now scanning/updating rte_eth_dev_data[] is
> > > lock protected, but it might be not very plausible to protect both
> > > data[] and next_owner_id using the same lock.
> >
> > I guess you mean to the owner structure in rte_eth_dev_data[port_id].
> > The next_owner_id is read by ownership APIs (for owner validation), so it
> > makes sense to use the same lock.
> > Actually, why not?
> 
> Well to me next_owner_id and rte_eth_dev_data[] are not directly related.
> You may create new owner_id but it doesn't mean you would update
> rte_eth_dev_data[] immediately.
> And vice versa - you might just want to update rte_eth_dev_data[].name or
> .owner_id.
> It is not very good coding practice to use the same lock for unrelated data
> structures.
>
I see the relation as follows:
Since synchronizing the ownership mechanism is ethdev's responsibility,
we must protect against user mistakes as much as we can by using the same lock.
So, if a user tries to set an invalid owner (exactly the ID which is currently being allocated), we can protect against it.
 
> >
> > > In fact, for next_owner_id, you don't need a lock - just
> > > rte_atomic_t should be enough.
> >
> > I don't think so, it is problematic in next_owner_id wraparound and may
> > complicate the code in other places which read it.
> 
> IMO it is not that complicated, something like that should work I think.
> 
> /* init to 0 at startup */
> rte_atomic32_t *owner_id;
> 
> int new_owner_id(void)
> {
>     int32_t x;
>     x = rte_atomic32_add_return(owner_id, 1);
>     if (x > UINT16_MAX) {
>        rte_atomic32_dec(owner_id);
>        return -EOVERFLOW;
>     } else
>         return x;
> }
> 
> 
> > Why not just to keep it simple and using the same lock?
> 
> A lock is also fine, I just think it had better be a separate one - that would protect
> just next_owner_id.
> Though if you are going to use uuid here - all that is probably not relevant any
> more.
> 

I agree about the uuid but still think the same lock should be used for both.

> >
> > > Another alternative would be to use 2 locks - one for next_owner_id
> > > second for actual data[] protection.
> > >
> > > Another thing - you'll probably need to grab/release a lock inside
> > > rte_eth_dev_allocated() too.
> > > It is a public function used by drivers, so need to be protected too.
> > >
> >
> > Yes, I thought about it, but decided not to use lock in next:
> > rte_eth_dev_allocated
> > rte_eth_dev_count
> > rte_eth_dev_get_name_by_port
> > rte_eth_dev_get_port_by_name
> > maybe more...
> 
> As I can see in patch #3 you protect by lock access to
> rte_eth_dev_data[].name (which seems like a good  thing).
> So I think any other public function that access rte_eth_dev_data[].name
> should be protected by the same lock.
> 

I don't think so. I can understand using the ownership lock here (as in port creation), but I don't think it is necessary.
What exactly are we protecting here?
Don't you think it is just a matter of timing (ask a moment later and you
may get a different answer)? I don't see a possible crash.
 
> > > > +
> > > > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > > +		memset(rte_eth_dev_data, 0, data_size);
> > > > +		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > > > +		rte_spinlock_init(rte_eth_dev_ownership_lock);
> > > > +	}
> > > >  }
> > > >
> > > >  struct rte_eth_dev *
> > > > @@ -225,7 +243,7 @@ struct rte_eth_dev *
> > > >  	}
> > > >
> > > >  	if (rte_eth_dev_data == NULL)
> > > > -		rte_eth_dev_data_alloc();
> > > > +		rte_eth_dev_share_data_alloc();
> > > >
> > > >  	if (rte_eth_dev_allocated(name) != NULL) {
> > > >  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
> > > > @@ -253,7 +271,7 @@ struct rte_eth_dev *
> > > >  	struct rte_eth_dev *eth_dev;
> > > >
> > > >  	if (rte_eth_dev_data == NULL)
> > > > -		rte_eth_dev_data_alloc();
> > > > +		rte_eth_dev_share_data_alloc();
> > > >
> > > >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > >  		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> > > > @@ -278,8 +296,12 @@ struct rte_eth_dev *
> > > >  	if (eth_dev == NULL)
> > > >  		return -EINVAL;
> > > >
> > > > -	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > +
> > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > > +
> > > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > > >  	return 0;
> > > >  }
> > > >
> > > > @@ -294,6 +316,174 @@ struct rte_eth_dev *
> > > >  		return 1;
> > > >  }
> > > >
> > > > +static int
> > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > +	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
> > > > +	     *rte_eth_next_owner_id <= owner_id)) {
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > +		return 0;
> > > > +	}
> > > > +	return 1;
> > > > +}
> > > > +
> > > > +uint16_t
> > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > > > +{
> > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > +		port_id++;
> > > > +
> > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > +		return RTE_MAX_ETHPORTS;
> > > > +
> > > > +	return port_id;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > +	int ret = 0;
> > > > +
> > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > +
> > > > +	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > +		/* Counter wrap around. */
> > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > > > +		ret = -EUSERS;
> > > > +	} else {
> > > > +		*owner_id = (*rte_eth_next_owner_id)++;
> > > > +	}
> > > > +
> > > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > > > +	return ret;
> > > > +}
> > > > +
> > > > +int
> > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > +		      const struct rte_eth_dev_owner *owner)
> > >
> > > As a nit - if you'll have rte_eth_dev_owner_set(port_id, old_owner,
> > > new_owner)
> > > - that might be more plausible for user, and would greatly simplify
> > > unset()
> > > part:
> > > just set(port_id, cur_owner, zero_owner);
> > >
> >
> > How the user should know the old owner?
> 
> By dev_owner_get() or it might have it stored somewhere already (or
> constructed on the fly in case of NO_OWNER).
> 
That complicates the usage.
What about creating an internal API, _rte_eth_dev_owner_set(port_id, old_owner,
new_owner), and having the currently exposed set/unset APIs use it?

> >
> > > > +{
> > > > +	struct rte_eth_dev_owner *port_owner;
> > > > +	int ret = 0;
> > > > +	int sret;
> > > > +
> > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > +
> > > > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > > > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > > +		ret = -ENODEV;
> > > > +		goto unlock;
> > > > +	}
> > > > +
> > > > +	if (!rte_eth_is_valid_owner_id(owner->id)) {
> > > > +		ret = -EINVAL;
> > > > +		goto unlock;
> > > > +	}
> > > > +
> > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > +	    port_owner->id != owner->id) {
> > > > +		RTE_LOG(ERR, EAL,
> > > > +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> > > > +			port_id, port_owner->name, port_owner->id);
> > > > +		ret = -EPERM;
> > > > +		goto unlock;
> > > > +	}
> > > > +
> > > > +	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > +			owner->name);
> > > > +	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > >
> > > Personally, I don't see any reason to fail if description was truncated...
> > > Another alternative - just use rte_malloc() here to allocate big
> > > enough buffer to hold the description.
> > >
> >
> > But it is static allocation like in the device name, why allocate it
> > differently?
> 
> Static allocation is fine by me - I just said there is probably no need to fail if
> the description provided by the user gets truncated in that case.
> Though if the user description is *that* important - rte_malloc() can help here.
> 
Again, what is the difference between the port name and the owner name as far as allocation goes?
The advantages of static allocation:
1. No lock-protected malloc/free calls inside other lock-protected code.
2. Easier for the user.

> >
> > > > +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > +		ret = -EINVAL;
> > > > +		goto unlock;
> > > > +	}
> > > > +
> > > > +	port_owner->id = owner->id;
> > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > +			    owner->name, owner->id);
> > > > +
> > >
> > > As another nit - you can avoid all these gotos by restructuring code a bit:
> > >
> > > rte_eth_dev_owner_set(const uint16_t port_id, const struct
> > > rte_eth_dev_owner *owner) {
> > >     rte_spinlock_lock(...);
> > >     ret = _eth_dev_owner_set_unlocked(port_id, owner);
> > >     rte_spinlock_unlock(...);
> > >     return ret;
> > > }
> > >
> > Don't you like gotos? :)
> 
> Not really :)
> 
> > I personally use it only in error\performance scenarios.
> 
> Same here - prefer to avoid them if possible.
> 
> > Do you think it worth the effort?
> 
> IMO - yes, well structured code is much easier to understand and maintain.
I don't think so for error cases (and performance). It is really clear here, but if you insist, I will change it.
Do you?
(If the community agrees with you, I think a "goto" check should be added to checkpatch.)

Thanks a lot,
Matan.

> Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-11 14:51           ` Matan Azrad
@ 2018-01-12  0:02             ` Ananyev, Konstantin
  2018-01-12  7:24               ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-12  0:02 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > Hi Matan,
> >
> > >
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > > Hi Matan,
> > > >
> > > > Few comments from me below.
> > > > BTW, do you plan to add ownership mandatory check in control path
> > > > functions that change port configuration?
> > >
> > > No.
> >
> > So it is still totally voluntary usage, and the application needs to be changed to
> > exploit it?
> > Apart from RTE_FOR_EACH_DEV() change proposed by Gaetan?
> >
> 
> Also RTE_FOR_EACH_DEV() change proposed by Gaetan is not protected because 2 DPDK entities can get the same port while using it.

I am not talking about a race condition here.
Right now, even from the same thread, I can call dev_configure()
for a port which I don't own (let's say it belongs to a failsafe port),
and that would remain possible, correct?
 
> As I wrote in the log\docs and as discussed a lot in the first version:
> The new synchronization rules are:
> 1. The port allocation and port release synchronization will be
>    managed by ethdev.
> 2. The port usage synchronization will be managed by the port owner.
> 3. The port ownership API synchronization (also with port creation) will be managed by ethdev.
> 4. A DPDK entity which wants to use a port must take ownership first.
> 
> Ethdev should not protect 2 and 4 according to these rules.
> 
> > > > Konstantin
> > > >
> > > > > -----Original Message-----
> > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > Sent: Sunday, January 7, 2018 9:46 AM
> > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > Richardson,
> > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > <konstantin.ananyev@intel.com>
> > > > > Subject: [PATCH v2 2/6] ethdev: add port ownership
> > > > >
> > > > > The ownership of a port is implicit in DPDK.
> > > > > Making it explicit is better for the following reasons:
> > > > > 1. It will define well who is in charge of the port usage synchronization.
> > > > > 2. A library could work on top of a port.
> > > > > 3. A port can work on top of another port.
> > > > >
> > > > > Also in the fail-safe case, an issue has been met in testpmd.
> > > > > We need to check that the application is not trying to use a port
> > > > > which is already managed by fail-safe.
> > > > >
> > > > > A port owner is built from owner id(number) and owner name(string)
> > > > > while the owner id must be unique to distinguish between two
> > > > > identical entity instances and the owner name can be any name.
> > > > > The name helps to logically recognize the owner by different DPDK
> > > > > entities and allows easy debug.
> > > > > Each DPDK entity can allocate an owner unique identifier and can
> > > > > use it and its preferred name to own valid ethdev ports.
> > > > > Each DPDK entity can get any port owner status to decide if it can
> > > > > manage the port or not.
> > > > >
> > > > > The mechanism is synchronized for both the primary process threads
> > > > > and the secondary processes threads to allow secondary process
> > > > > entity to be a port owner.
> > > > >
> > > > > Add a synchronized ownership mechanism to DPDK Ethernet devices to
> > > > > avoid multiple management of a device by different DPDK entities.
> > > > >
> > > > > The current ethdev internal port management is not affected by
> > > > > this feature.
> > > > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > ---
> > > > >  doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
> > > > >  lib/librte_ether/rte_ethdev.c           | 206
> > > > ++++++++++++++++++++++++++++++--
> > > > >  lib/librte_ether/rte_ethdev.h           |  89 ++++++++++++++
> > > > >  lib/librte_ether/rte_ethdev_version.map |  12 ++
> > > > >  4 files changed, 311 insertions(+), 10 deletions(-)
> > > >
> > > >
> > > > >
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index 684e3e8..0e12452 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -70,7 +70,10 @@
> > > > >
> > > > >  static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
> > > > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
> > > > > +/* ports data array stored in shared memory */
> > > > >  static struct rte_eth_dev_data *rte_eth_dev_data;
> > > > > +/* next owner identifier stored in shared memory */
> > > > > +static uint16_t *rte_eth_next_owner_id;
> > > > >  static uint8_t eth_dev_last_created_port;
> > > > >
> > > > >  /* spinlock for eth device callbacks */
> > > > > @@ -82,6 +85,9 @@
> > > > >  /* spinlock for add/remove tx callbacks */
> > > > >  static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
> > > > >
> > > > > +/* spinlock for eth device ownership management stored in shared memory */
> > > > > +static rte_spinlock_t *rte_eth_dev_ownership_lock;
> > > > > +
> > > > >  /* store statistics names and its offset in stats structure  */
> > > > >  struct rte_eth_xstats_name_off {
> > > > >  	char name[RTE_ETH_XSTATS_NAME_SIZE];
> > > > > @@ -153,14 +159,18 @@ enum {
> > > > >  }
> > > > >
> > > > >  static void
> > > > > -rte_eth_dev_data_alloc(void)
> > > > > +rte_eth_dev_share_data_alloc(void)
> > > > >  {
> > > > >  	const unsigned flags = 0;
> > > > >  	const struct rte_memzone *mz;
> > > > > +	const unsigned int data_size = RTE_MAX_ETHPORTS *
> > > > > +						sizeof(*rte_eth_dev_data);
> > > > >
> > > > >  	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > > > +		/* Allocate shared memory for port data and ownership */
> > > > >  		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
> > > > > -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
> > > > > +				data_size + sizeof(*rte_eth_next_owner_id) +
> > > > > +				sizeof(*rte_eth_dev_ownership_lock),
> > > > >  				rte_socket_id(), flags);
> > > > >  	} else
> > > > >  		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
> > > > > @@ -168,9 +178,17 @@ enum {
> > > > >  		rte_panic("Cannot allocate memzone for ethernet port data\n");
> > > > >
> > > > >  	rte_eth_dev_data = mz->addr;
> > > > > -	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > > > > -		memset(rte_eth_dev_data, 0,
> > > > > -				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
> > > > > +	rte_eth_next_owner_id = (uint16_t *)((uintptr_t)mz->addr +
> > > > > +					     data_size);
> > > > > +	rte_eth_dev_ownership_lock = (rte_spinlock_t *)
> > > > > +		((uintptr_t)rte_eth_next_owner_id +
> > > > > +		 sizeof(*rte_eth_next_owner_id));
> > > >
> > > >
> > > > I think that might make  rte_eth_dev_ownership_lock location not 4B
> > > > aligned...
> > >
> > > Where can I find the documentation about it?
> >
> > That's in your code above - data_size and mz->addr are both at least 4B
> > aligned - rte_eth_dev_ownership_lock = mz->addr + data_size + 2; You can
> > align it manually, but as discussed below it is probably easier to group related
> > fields into the same struct.
> >
> I mean the documentation about the alignment required for a spinlock. Where is it?

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/CJAGCFAF.html

Maybe the ARM and PPC guys can provide you with more complete/recent docs.
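
To make the alignment concern concrete, here is a minimal sketch in plain C11.
It does not use the real DPDK types: demo_layout_spinlock_t,
demo_layout_port_data and demo_layout_shared are hypothetical stand-ins, and
the only assumption is that the lock is an int-sized object. It compares the
hand-computed offset from the patch (data array, then a uint16_t, then the
lock) with the offset the compiler picks when the fields live in one struct.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_LAYOUT_MAX_PORTS 32

/* Hypothetical stand-in for a spinlock: a single int-sized word. */
typedef struct { volatile int locked; } demo_layout_spinlock_t;

/* Hypothetical stand-in for the per-port data; 8-byte aligned. */
struct demo_layout_port_data {
	char name[64];
	uint64_t rx_count;
};

/* Grouped layout: the compiler pads after next_owner_id as needed. */
struct demo_layout_shared {
	uint16_t next_owner_id;
	demo_layout_spinlock_t ownership_lock;
	struct demo_layout_port_data data[DEMO_LAYOUT_MAX_PORTS];
};

/* Offset of the lock when it is placed by hand right after a uint16_t
 * that itself follows the data array, as in the patch. */
static size_t demo_layout_manual_lock_offset(void)
{
	return DEMO_LAYOUT_MAX_PORTS * sizeof(struct demo_layout_port_data) +
	       sizeof(uint16_t);
}

/* Offset of the lock inside the grouped struct. */
static size_t demo_layout_struct_lock_offset(void)
{
	return offsetof(struct demo_layout_shared, ownership_lock);
}
```

On an ABI where int needs 4-byte alignment, the manual offset is only 2-byte
aligned, while the offsetof() result is always a multiple of
_Alignof(demo_layout_spinlock_t).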


> 
> > >
> > > > Why not just put all the data that you are trying to allocate as one
> > > > chunk into the same struct:
> > > > static struct {
> > > >         uint16_t next_owner_id;
> > > >         /* spinlock for eth device ownership management stored in
> > > > shared memory */
> > > >         rte_spinlock_t dev_ownership_lock;
> > > >         rte_eth_dev_data *data;
> > > > } rte_eth_dev_data;
> > > > and allocate/use it everywhere?
> > > > That would simplify allocation/management stuff.
> > > >
> > > I don't understand what exactly do you mean. ?
> > > If you mean to group all in one struct like:
> > >
> > > static struct {
> > >         uint16_t next_owner_id;
> > >         rte_spinlock_t dev_ownership_lock;
> > >         rte_eth_dev_data  data[];
> > > } rte_eth_dev_share_data;
> > >
> > > Just to simplify the addresses calculation above,
> >
> > Yep, that's exactly what I meant.
> > As you said it would help with bulk allocation/alignment stuff, plus IMO it is
> > better and easier to group several related global together - Improve code
> > quality, will make it easier to read & maintain in future.
> >
> > > It will change more code in ethdev relative to the old rte_eth_dev_data
> > > global array and will be more intrusive.
> > > Leaving it as is keeps the change focused here.
> >
> > Yes, it would require a few more changes, though I think it is worth it.
> >
> 
> OK, got it, and I agree.
> 
> > >
> > > I can just move the spinlock memory allocation to be at the beginning of
> > > the memzone (to be sure about the alignment).
> > >
> > > > It is good to see that now scanning/updating rte_eth_dev_data[] is
> > > > lock protected, but it might be not very plausible to protect both
> > > > data[] and next_owner_id using the same lock.
> > >
> > > I guess you mean to the owner structure in rte_eth_dev_data[port_id].
> > > The next_owner_id is read by ownership APIs (for owner validation), so it
> > > makes sense to use the same lock.
> > > Actually, why not?
> >
> > Well to me next_owner_id and rte_eth_dev_data[] are not directly related.
> > You may create new owner_id but it doesn't mean you would update
> > rte_eth_dev_data[] immediately.
> > And vice versa - you might just want to update rte_eth_dev_data[].name or
> > .owner_id.
> > It is not very good coding practice to use the same lock for unrelated data
> > structures.
> >
> I see the relation as follows:
> Since synchronizing the ownership mechanism is ethdev's responsibility,
> we must protect against user mistakes as much as we can by using the same lock.
> So, if a user tries to set an invalid owner (exactly the ID which is currently being allocated), we can protect against it.

Hmm, not sure why you can't do the same check with a different lock or an atomic variable?
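
As a rough illustration of the atomic-variable option, here is a minimal
sketch in portable C11 (<stdatomic.h>) rather than the DPDK rte_atomic API.
All demo_idgen_* names are hypothetical and not part of the patch. The
counter is kept 32 bits wide and is advanced with a compare-and-swap loop,
so it never moves past the 16-bit id range and the validity check can read
it without taking a lock.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_IDGEN_NO_OWNER 0

/* Next id to hand out; ids 1..UINT16_MAX are valid. */
static _Atomic uint32_t demo_idgen_next = DEMO_IDGEN_NO_OWNER + 1;

/* Allocate a fresh owner id without a lock.
 * Returns 0 on success, -EOVERFLOW once all ids are used. */
static int demo_idgen_new(uint16_t *owner_id)
{
	uint32_t cur = atomic_load(&demo_idgen_next);

	do {
		if (cur > UINT16_MAX)
			return -EOVERFLOW;
		/* On failure, cur is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&demo_idgen_next, &cur, cur + 1));

	*owner_id = (uint16_t)cur;
	return 0;
}

/* An id is valid if it is not NO_OWNER and has already been handed out. */
static int demo_idgen_is_valid(uint16_t owner_id)
{
	return owner_id != DEMO_IDGEN_NO_OWNER &&
	       owner_id < atomic_load(&demo_idgen_next);
}
```

Because the counter stops at UINT16_MAX + 1 instead of wrapping, the
validity check stays a single comparison.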

> 
> > >
> > > > In fact, for next_owner_id, you don't need a lock - just
> > > > rte_atomic_t should be enough.
> > >
> > > I don't think so, it is problematic in next_owner_id wraparound and may
> > > complicate the code in other places which read it.
> >
> > IMO it is not that complicated, something like that should work I think.
> >
> > /* init to 0 at startup */
> > rte_atomic32_t *owner_id;
> >
> > int new_owner_id(void)
> > {
> >     int32_t x;
> >     x = rte_atomic32_add_return(owner_id, 1);
> >     if (x > UINT16_MAX) {
> >        rte_atomic32_dec(owner_id);
> >        return -EOVERFLOW;
> >     } else
> >         return x;
> > }
> >
> >
> > > Why not just to keep it simple and using the same lock?
> >
> > A lock is also fine, I just think it had better be a separate one - that would protect
> > just next_owner_id.
> > Though if you are going to use uuid here - all that is probably not relevant any
> > more.
> >
> 
> I agree about the uuid but still think the same lock should be used for both.

But with uuid you don't need next_owner_id at all, right?
So the lock will only be used for rte_eth_dev_data[] fields anyway.

> 
> > >
> > > > Another alternative would be to use 2 locks - one for next_owner_id
> > > > second for actual data[] protection.
> > > >
> > > > Another thing - you'll probably need to grab/release a lock inside
> > > > rte_eth_dev_allocated() too.
> > > > It is a public function used by drivers, so need to be protected too.
> > > >
> > >
> > > Yes, I thought about it, but decided not to use lock in next:
> > > rte_eth_dev_allocated
> > > rte_eth_dev_count
> > > rte_eth_dev_get_name_by_port
> > > rte_eth_dev_get_port_by_name
> > > maybe more...
> >
> > As I can see in patch #3 you protect by lock access to
> > rte_eth_dev_data[].name (which seems like a good  thing).
> > So I think any other public function that access rte_eth_dev_data[].name
> > should be protected by the same lock.
> >
> 
> I don't think so. I can understand using the ownership lock here (as in port creation), but I don't think it is necessary.
> What exactly are we protecting here?
> Don't you think it is just a matter of timing (ask a moment later and you
> may get a different answer)? I don't see a possible crash.

Not sure what you mean here by timing...
As I understand it, rte_eth_dev_data[].name uniquely identifies a device and is used
by the port allocation/release/find functions.
As you stated above:
"1. The port allocation and port release synchronization will be  managed by ethdev."
To me it means that the ethdev layer has to make sure that all accesses to
rte_eth_dev_data[].name are atomic.
Otherwise what would prevent the situation where one process does
rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...)
while a second one does rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...)
?
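
To illustrate what I mean, here is a minimal self-contained sketch, not the
ethdev code. The demo_names_* helpers and the small C11 atomic_flag spinlock
are hypothetical stand-ins for rte_eth_dev_data[].name and the ownership
lock. The point is only that the reader and the writer of the name table go
through the same lock, and that the writer does its "is it already there?"
check and the name copy in one critical section.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAMES_MAX_PORTS 4
#define DEMO_NAMES_LEN 16

static atomic_flag demo_names_lock = ATOMIC_FLAG_INIT;
static char demo_names_table[DEMO_NAMES_MAX_PORTS][DEMO_NAMES_LEN];

static void demo_names_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_names_lock,
						 memory_order_acquire))
		; /* spin until the flag was clear */
}

static void demo_names_release(void)
{
	atomic_flag_clear_explicit(&demo_names_lock, memory_order_release);
}

/* Caller must hold the lock. Returns the slot index or -1. */
static int demo_names_find_unlocked(const char *name)
{
	for (int i = 0; i < DEMO_NAMES_MAX_PORTS; i++)
		if (demo_names_table[i][0] != '\0' &&
		    strcmp(demo_names_table[i], name) == 0)
			return i;
	return -1;
}

/* Reader: never observes a half-written name. */
static int demo_names_allocated(const char *name)
{
	demo_names_acquire();
	int idx = demo_names_find_unlocked(name);
	demo_names_release();
	return idx;
}

/* Writer: duplicate check and name copy happen under one lock hold. */
static int demo_names_allocate(const char *name)
{
	int idx = -1;

	if (name[0] == '\0')
		return -1; /* an empty name would look like a free slot */

	demo_names_acquire();
	if (demo_names_find_unlocked(name) < 0) {
		for (int i = 0; i < DEMO_NAMES_MAX_PORTS; i++) {
			if (demo_names_table[i][0] == '\0') {
				snprintf(demo_names_table[i], DEMO_NAMES_LEN,
					 "%s", name);
				idx = i;
				break;
			}
		}
	}
	demo_names_release();
	return idx;
}
```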

> 
> > > > > +
> > > > > +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > > > > +		memset(rte_eth_dev_data, 0, data_size);
> > > > > +		*rte_eth_next_owner_id = RTE_ETH_DEV_NO_OWNER + 1;
> > > > > +		rte_spinlock_init(rte_eth_dev_ownership_lock);
> > > > > +	}
> > > > >  }
> > > > >
> > > > >  struct rte_eth_dev *
> > > > > @@ -225,7 +243,7 @@ struct rte_eth_dev *
> > > > >  	}
> > > > >
> > > > >  	if (rte_eth_dev_data == NULL)
> > > > > -		rte_eth_dev_data_alloc();
> > > > > +		rte_eth_dev_share_data_alloc();
> > > > >
> > > > >  	if (rte_eth_dev_allocated(name) != NULL) {
> > > > >  		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
> > > > > @@ -253,7 +271,7 @@ struct rte_eth_dev *
> > > > >  	struct rte_eth_dev *eth_dev;
> > > > >
> > > > >  	if (rte_eth_dev_data == NULL)
> > > > > -		rte_eth_dev_data_alloc();
> > > > > +		rte_eth_dev_share_data_alloc();
> > > > >
> > > > >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > >  		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
> > > > > @@ -278,8 +296,12 @@ struct rte_eth_dev *
> > > > >  	if (eth_dev == NULL)
> > > > >  		return -EINVAL;
> > > > >
> > > > > -	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > > +
> > > > >  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> > > > > +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> > > > > +
> > > > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > @@ -294,6 +316,174 @@ struct rte_eth_dev *
> > > > >  		return 1;
> > > > >  }
> > > > >
> > > > > +static int
> > > > > +rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > +	    (*rte_eth_next_owner_id > RTE_ETH_DEV_NO_OWNER &&
> > > > > +	     *rte_eth_next_owner_id <= owner_id)) {
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > +		return 0;
> > > > > +	}
> > > > > +	return 1;
> > > > > +}
> > > > > +
> > > > > +uint16_t
> > > > > +rte_eth_find_next_owned_by(uint16_t port_id, const uint16_t owner_id)
> > > > > +{
> > > > > +	while (port_id < RTE_MAX_ETHPORTS &&
> > > > > +	       (rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED ||
> > > > > +	       rte_eth_devices[port_id].data->owner.id != owner_id))
> > > > > +		port_id++;
> > > > > +
> > > > > +	if (port_id >= RTE_MAX_ETHPORTS)
> > > > > +		return RTE_MAX_ETHPORTS;
> > > > > +
> > > > > +	return port_id;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_new(uint16_t *owner_id) {
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > > +
> > > > > +	if (*rte_eth_next_owner_id == RTE_ETH_DEV_NO_OWNER) {
> > > > > +		/* Counter wrap around. */
> > > > > +		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet port owners.\n");
> > > > > +		ret = -EUSERS;
> > > > > +	} else {
> > > > > +		*owner_id = (*rte_eth_next_owner_id)++;
> > > > > +	}
> > > > > +
> > > > > +	rte_spinlock_unlock(rte_eth_dev_ownership_lock);
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +int
> > > > > +rte_eth_dev_owner_set(const uint16_t port_id,
> > > > > +		      const struct rte_eth_dev_owner *owner)
> > > >
> > > > As a nit - if you'll have rte_eth_dev_owner_set(port_id, old_owner,
> > > > new_owner)
> > > > - that might be more plausible for user, and would greatly simplify
> > > > unset()
> > > > part:
> > > > just set(port_id, cur_owner, zero_owner);
> > > >
> > >
> > > How the user should know the old owner?
> >
> > By dev_owner_get() or it might have it stored somewhere already (or
> > constructed on the fly in case of NO_OWNER).
> >
> That complicates the usage.
> What about creating an internal API, _rte_eth_dev_owner_set(port_id, old_owner,
> new_owner), and having the currently exposed set/unset APIs use it?

Sounds good to me.
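
Roughly along these lines, as a sketch only. The demo_swap_* names and the
plain owner-id array are hypothetical, and the caller is assumed to already
hold the ownership lock. One internal helper changes the owner only when the
current owner matches old_owner, and both public entry points become
one-line calls into it. Unlike the patch, this simplified version rejects a
repeated set by the same owner.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_SWAP_NO_OWNER 0
#define DEMO_SWAP_MAX_PORTS 4

/* Hypothetical per-port owner ids; zero-initialized means unowned. */
static uint16_t demo_swap_owner[DEMO_SWAP_MAX_PORTS];

/* Internal helper: replace old_owner with new_owner, or fail. */
static int demo_swap_owner_set(uint16_t port_id, uint16_t old_owner,
			       uint16_t new_owner)
{
	if (port_id >= DEMO_SWAP_MAX_PORTS)
		return -ENODEV;
	if (demo_swap_owner[port_id] != old_owner)
		return -EPERM;
	demo_swap_owner[port_id] = new_owner;
	return 0;
}

/* Public set: only an unowned port can be taken. */
static int demo_swap_set(uint16_t port_id, uint16_t owner_id)
{
	if (owner_id == DEMO_SWAP_NO_OWNER)
		return -EINVAL;
	return demo_swap_owner_set(port_id, DEMO_SWAP_NO_OWNER, owner_id);
}

/* Public unset: only the current owner can release the port. */
static int demo_swap_unset(uint16_t port_id, uint16_t owner_id)
{
	if (owner_id == DEMO_SWAP_NO_OWNER)
		return -EINVAL;
	return demo_swap_owner_set(port_id, owner_id, DEMO_SWAP_NO_OWNER);
}
```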

> 
> > >
> > > > > +{
> > > > > +	struct rte_eth_dev_owner *port_owner;
> > > > > +	int ret = 0;
> > > > > +	int sret;
> > > > > +
> > > > > +	rte_spinlock_lock(rte_eth_dev_ownership_lock);
> > > > > +
> > > > > +	if (!rte_eth_dev_is_valid_port(port_id)) {
> > > > > +		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
> > > > > +		ret = -ENODEV;
> > > > > +		goto unlock;
> > > > > +	}
> > > > > +
> > > > > +	if (!rte_eth_is_valid_owner_id(owner->id)) {
> > > > > +		ret = -EINVAL;
> > > > > +		goto unlock;
> > > > > +	}
> > > > > +
> > > > > +	port_owner = &rte_eth_devices[port_id].data->owner;
> > > > > +	if (port_owner->id != RTE_ETH_DEV_NO_OWNER &&
> > > > > +	    port_owner->id != owner->id) {
> > > > > +		RTE_LOG(ERR, EAL,
> > > > > +			"Cannot set owner to port %d already owned by %s_%05d.\n",
> > > > > +			port_id, port_owner->name, port_owner->id);
> > > > > +		ret = -EPERM;
> > > > > +		goto unlock;
> > > > > +	}
> > > > > +
> > > > > +	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
> > > > > +			owner->name);
> > > > > +	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN) {
> > > >
> > > > Personally, I don't see any reason to fail if description was truncated...
> > > > Another alternative - just use rte_malloc() here to allocate big
> > > > enough buffer to hold the description.
> > > >
> > >
> > > But it is static allocation like in the device name, why allocate it
> > > differently?
> >
> > Static allocation is fine by me - I just said there is probably no need to fail if
> > the description provided by the user gets truncated in that case.
> > Though if the user description is *that* important - rte_malloc() can help here.
> >
> Again, what is the difference between the port name and the owner name as far as allocation goes?

As I understand it, rte_eth_dev_data[].name uniquely identifies a device and always has to be consistent.
owner.name is not critical for system operation, and I don't see it as a big deal if it gets truncated.

> The advantages of static allocation:
> 1. No lock-protected malloc/free calls inside other lock-protected code.

You can call malloc/free before/after grabbing the lock.
But as I said - I am fine with a static array here too - I just don't think
truncating the user's description should cause a failure.
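
Something like the sketch below is what I have in mind for the static-array
case. The names (demo_trunc_set_name, DEMO_TRUNC_NAME_LEN) are hypothetical
and the buffer is deliberately tiny for the example. snprintf() always
NUL-terminates within the given size, so a too-long description is simply
cut, and the helper only reports the truncation instead of failing.

```c
#include <stdio.h>

#define DEMO_TRUNC_NAME_LEN 8

/* Copy src into a fixed-size owner name buffer.
 * Returns 0 if it fit, 1 if it was truncated (still usable),
 * -1 only if snprintf() itself reported an error. */
static int demo_trunc_set_name(char dst[DEMO_TRUNC_NAME_LEN], const char *src)
{
	int n = snprintf(dst, DEMO_TRUNC_NAME_LEN, "%s", src);

	if (n < 0) {
		dst[0] = '\0';
		return -1;
	}
	return n >= DEMO_TRUNC_NAME_LEN;
}
```

The caller can log the truncated case at debug level if it cares, and carry
on with setting the owner either way.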

> 2. Easier for the user.
> 
> > >
> > > > > +		memset(port_owner->name, 0, RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > +		ret = -EINVAL;
> > > > > +		goto unlock;
> > > > > +	}
> > > > > +
> > > > > +	port_owner->id = owner->id;
> > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n", port_id,
> > > > > +			    owner->name, owner->id);
> > > > > +
> > > >
> > > > As another nit - you can avoid all these gotos by restructuring code a bit:
> > > >
> > > > rte_eth_dev_owner_set(const uint16_t port_id, const struct
> > > > rte_eth_dev_owner *owner) {
> > > >     rte_spinlock_lock(...);
> > > >     ret = _eth_dev_owner_set_unlocked(port_id, owner);
> > > >     rte_spinlock_unlock(...);
> > > >     return ret;
> > > > }
> > > >
> > > Don't you like gotos? :)
> >
> > Not really :)
> >
> > > I personally use it only in error\performance scenarios.
> >
> > Same here - prefer to avoid them if possible.
> >
> > > Do you think it worth the effort?
> >
> > IMO - yes, well structured code is much easier to understand and maintain.
> I don't think so in error cases(and performance), It is really clear here, but if you are insisting, I will change it.
> Are you?

Yes, that would be my preference.
Why else would I bother to write all this? :)

> (If the community thinks like you I think "goto" check should be added to checkpatch).

There may be pieces of code where gotos are really hard to avoid,
and/or where using goto would provide some performance benefit.
But this case definitely doesn't look like one of them.
Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-12  0:02             ` Ananyev, Konstantin
@ 2018-01-12  7:24               ` Matan Azrad
  2018-01-15 11:45                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-12  7:24 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Konstantin

From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> Hi Matan,
> 
> >
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > Hi Matan,
> > >
> > > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > > > Hi Matan,
<snip>
> > > > > Few comments from me below.
> > > > > BTW, do you plan to add ownership mandatory check in control
> > > > > path functions that change port configuration?
> > > >
> > > > No.
> > >
> > > So it still totally voluntary usage and application needs to be
> > > changed to exploit it?
> > > Apart from RTE_FOR_EACH_DEV() change proposed by Gaetan?
> > >
> >
> > Also RTE_FOR_EACH_DEV() change proposed by Gaetan is not protected
> because 2 DPDK entities can get the same port while using it.
> 
> I am not talking about racing condition here.
> Right now even from the same thread - I can call dev_configure() for the port
> which I don't own (let say it belongs to failsafe port), and that would remain,
> correct?
> 
Yes.

> > As I wrote in the log\docs and as discussed a lot in the first version:
> > The new synchronization rules are:
> > 1. The port allocation and port release synchronization will be
> >    managed by ethdev.
> > 2. The port usage synchronization will be managed by the port owner.
> > 3. The port ownership API synchronization(also with port creation) will be
> managed by ethdev.
> > 4. DPDK entity which want to use a port must take ownership before.
> >
> > Ethdev should not protect 2 and 4 according these rules.
> >
> > > > > Konstantin
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
<snip>
> > I mean the documentation about the needed alignment for spinlock.
> Where is it?
> 
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/CJAGCFAF.html
> 
> Might be ARM and PPC guys can provide you some more complete/recent
> docs.
Thanks.
<snip> 
> > > > > It is good to see that now scanning/updating rte_eth_dev_data[]
> > > > > is lock protected, but it might be not very plausible to protect
> > > > > both data[] and next_owner_id using the same lock.
> > > >
> > > > I guess you mean to the owner structure in rte_eth_dev_data[port_id].
> > > > The next_owner_id is read by ownership APIs(for owner validation),
> > > > so it
> > > makes sense to use the same lock.
> > > > Actually, why not?
> > >
> > > Well to me next_owner_id and rte_eth_dev_data[] are not directly
> related.
> > > You may create new owner_id but it doesn't mean you would update
> > > rte_eth_dev_data[] immediately.
> > > And visa-versa - you might just want to update
> > > rte_eth_dev_data[].name or .owner_id.
> > > It is not very good coding practice to use same lock for non-related
> > > data structures.
> > >
> > I see the relation like next:
> > Since the ownership mechanism synchronization is in ethdev
> > responsibility, we must protect against user mistakes as much as we can by
> using the same lock.
> > So, if user try to set by invalid owner (exactly the ID which currently is
> allocated) we can protect on it.
> 
> Hmm, not sure why you can't do same checking with different lock or atomic
> variable?
> 
The set-ownership API is protected by the ownership lock and checks the owner ID's validity
by reading the next owner ID.
So the owner ID allocation and the set API should use the same atomic mechanism.
The set (and other) ownership APIs already use the ownership lock, so I think it makes sense to use the same lock for ID allocation too.
 
> > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > rte_atomic_t should be enough.
> > > >
> > > > I don't think so, it is problematic in next_owner_id wraparound
> > > > and may
> > > complicate the code in other places which read it.
> > >
> > > IMO it is not that complicated, something like that should work I think.
> > >
> > > > /* init to 0 at startup */
> > > > rte_atomic32_t owner_id;
> > > >
> > > > int new_owner_id(void)
> > > > {
> > > >     int32_t x;
> > > >
> > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > >     if (x > UINT16_MAX) {
> > > >         /* out of IDs: undo the increment and report overflow */
> > > >         rte_atomic32_dec(&owner_id);
> > > >         return -EOVERFLOW;
> > > >     }
> > > >     return x;
> > > > }
> > >
> > >
> > > > Why not just to keep it simple and using the same lock?
> > >
> > > Lock is also fine, I just think it better be a separate one - that
> > > would protect just next_owner_id.
> > > Though if you are going to use uuid here - all that probably not
> > > relevant any more.
> > >
> >
> > I agree about the uuid but still think the same lock should be used for both.
> 
> But with uuid you don't need next_owner_id at all, right?
> So lock will only be used for rte_eth_dev_data[] fields anyway.
>
Sorry, I meant uint64_t, not uuid.

> > > > > Another alternative would be to use 2 locks - one for
> > > > > next_owner_id second for actual data[] protection.
> > > > >
> > > > > Another thing - you'll probably need to grab/release a lock
> > > > > inside
> > > > > rte_eth_dev_allocated() too.
> > > > > It is a public function used by drivers, so need to be protected too.
> > > > >
> > > >
> > > > Yes, I thought about it, but decided not to use lock in next:
> > > > rte_eth_dev_allocated
> > > > rte_eth_dev_count
> > > > rte_eth_dev_get_name_by_port
> > > > rte_eth_dev_get_port_by_name
> > > > maybe more...
> > >
> > > As I can see in patch #3 you protect by lock access to
> > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > So I think any other public function that access
> > > rte_eth_dev_data[].name should be protected by the same lock.
> > >
> >
> > I don't think so, I can understand to use the ownership lock here(as in port
> creation) but I don't think it is necessary too.
> > What are we exactly protecting here?
> > Don't you think it is just timing?(ask in the next moment and you  may
> > get another answer) I don't see optional crash.
> 
> Not sure what you mean here by timing...
> As I understand rte_eth_dev_data[].name unique identifies device and is
> used by  port allocation/release/find functions.
> As you stated above:
> "1. The port allocation and port release synchronization will be  managed by
> ethdev."
> To me it means that ethdev layer has to make sure that all accesses to
> rte_eth_dev_data[].name are atomic.
> Otherwise what would prevent the situation when one process does
> rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...) while
> second one does rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> 
The second will get true or false, and that is it.
Had it been called a moment later, it might have gotten a different answer.
Because these APIs only read the ethdev structure and don't change it, that can be OK.
But again, I can understand using the ownership lock here as well.

<snip>
> > > Static allocation is fine by me - I just said there is probably no
> > > need to fail if description provide by use will be truncated in that case.
> > > Though if used description is *that* important - rte_malloc() can help
> here.
> > >
> > Again, what is the difference between port name and owner name
> regarding the allocations?
> 
> As I understand rte_eth_dev_data[].name unique identifies device and
> always has to be consistent.
> owner.name is not critical for system operation, and I don't see a big deal if it
> would be truncated.
> 
> > The advantage of static allocation:
> > 1. Not use protected malloc\free functions in other protected code.
> 
> You can call malloc/free before/after grabbing the lock.
> But as I said - I am fine with static array here too - I just don't think truncating
> user description should cause a failure.
> 

OK, I will just print a warning in the truncation case.

> > 2.  Easier to the user.
> >
> > > >
> > > > > > +		memset(port_owner->name, 0,
> > > > > RTE_ETH_MAX_OWNER_NAME_LEN);
> > > > > > +		RTE_LOG(ERR, EAL, "Invalid owner name.\n");
> > > > > > +		ret = -EINVAL;
> > > > > > +		goto unlock;
> > > > > > +	}
> > > > > > +
> > > > > > +	port_owner->id = owner->id;
> > > > > > +	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%05d.\n",
> port_id,
> > > > > > +			    owner->name, owner->id);
> > > > > > +
> > > > >
> > > > > As another nit - you can avoid all these gotos by restructuring code a
> bit:
> > > > >
> > > > > rte_eth_dev_owner_set(const uint16_t port_id, const struct
> > > > > rte_eth_dev_owner *owner) {
> > > > >     rte_spinlock_lock(...);
> > > > >     ret = _eth_dev_owner_set_unlocked(port_id, owner);
> > > > >     rte_spinlock_unlock(...);
> > > > >     return ret;
> > > > > }
> > > > >
> > > > Don't you like gotos? :)
> > >
> > > Not really :)
> > >
> > > > I personally use it only in error\performance scenarios.
> > >
> > > Same here - prefer to avoid them if possible.
> > >
> > > > Do you think it worth the effort?
> > >
> > > IMO - yes, well structured code is much easier to understand and
> maintain.
> > I don't think so in error cases(and performance), It is really clear here, but if
> you are insisting, I will change it.
> > Are you?
> 
> Yes, that would be my preference.
> Why otherwise I would bother to write all this? :)
> 
> > (If the community thinks like you I think "goto" check should be added to
> checkpatch).
> 
> Might be there are pieces of code there goto are really hard to avoid, and/or
> using goto would provide some performance benefit or so...
> But that case definitely doesn't look like that.

Let's stop the "goto" discussion here. Although I don't agree with you in general, in this case I have no problem changing it.
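
For illustration, a minimal sketch of how the two changes agreed in this message could look: truncation only prints a warning, and the lock is taken in a thin wrapper so the checks can use plain returns instead of gotos. This is not the patch code. A C11 atomic_flag spinlock stands in for rte_spinlock, and MAX_PORTS, MAX_OWNER_NAME_LEN, NO_OWNER, port_owners and owner_set() are hypothetical names chosen for the example.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PORTS 4            /* hypothetical, stands in for RTE_MAX_ETHPORTS */
#define MAX_OWNER_NAME_LEN 8   /* hypothetical, kept small to show truncation */
#define NO_OWNER 0             /* stands in for RTE_ETH_DEV_NO_OWNER */

struct owner {
	uint16_t id;
	char name[MAX_OWNER_NAME_LEN];
};

static struct owner port_owners[MAX_PORTS];
static atomic_flag owner_lock = ATOMIC_FLAG_INIT;

/* All checks live here with early returns; the caller holds the lock. */
static int owner_set_unlocked(uint16_t port_id, uint16_t owner_id,
			      const char *name)
{
	struct owner *cur;
	int len;

	if (port_id >= MAX_PORTS || owner_id == NO_OWNER || name == NULL)
		return -EINVAL;

	cur = &port_owners[port_id];
	if (cur->id != NO_OWNER && cur->id != owner_id)
		return -EPERM;

	len = snprintf(cur->name, sizeof(cur->name), "%s", name);
	if (len < 0)
		return -EINVAL;
	/* A truncated description is reported but is not an error. */
	if ((size_t)len >= sizeof(cur->name))
		fprintf(stderr, "warning: owner name truncated to \"%s\"\n",
			cur->name);

	cur->id = owner_id;
	return 0;
}

/* Single lock/unlock pair, so no error path can skip the unlock. */
int owner_set(uint16_t port_id, uint16_t owner_id, const char *name)
{
	int ret;

	while (atomic_flag_test_and_set_explicit(&owner_lock,
						 memory_order_acquire))
		; /* spin */
	ret = owner_set_unlocked(port_id, owner_id, name);
	atomic_flag_clear_explicit(&owner_lock, memory_order_release);
	return ret;
}
```

With this shape, adding a new check means adding one more early return in the unlocked helper, and the wrapper never needs to change.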

Thanks,
Matan.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-12  7:24               ` Matan Azrad
@ 2018-01-15 11:45                 ` Ananyev, Konstantin
  2018-01-15 13:09                   ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-15 11:45 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Matan,

> 
> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > Hi Matan,
> >
> > >
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > > Hi Matan,
> > > >
> > > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > > > > Hi Matan,
> <snip>
> > > > > > Few comments from me below.
> > > > > > BTW, do you plan to add ownership mandatory check in control
> > > > > > path functions that change port configuration?
> > > > >
> > > > > No.
> > > >
> > > > So it still totally voluntary usage and application needs to be
> > > > changed to exploit it?
> > > > Apart from RTE_FOR_EACH_DEV() change proposed by Gaetan?
> > > >
> > >
> > > Also RTE_FOR_EACH_DEV() change proposed by Gaetan is not protected
> > because 2 DPDK entities can get the same port while using it.
> >
> > I am not talking about racing condition here.
> > Right now even from the same thread - I can call dev_configure() for the port
> > which I don't own (let say it belongs to failsafe port), and that would remain,
> > correct?
> >
> Yes.

OK, thanks for the clarification.
I think that makes the current approach somewhat incomplete, but that may be a
subject for a separate discussion.

> 
> > > As I wrote in the log\docs and as discussed a lot in the first version:
> > > The new synchronization rules are:
> > > 1. The port allocation and port release synchronization will be
> > >    managed by ethdev.
> > > 2. The port usage synchronization will be managed by the port owner.
> > > 3. The port ownership API synchronization(also with port creation) will be
> > managed by ethdev.
> > > 4. DPDK entity which want to use a port must take ownership before.
> > >
> > > Ethdev should not protect 2 and 4 according these rules.
> > >
> > > > > > Konstantin
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> <snip>
> > > I mean the documentation about the needed alignment for spinlock.
> > Where is it?
> >
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
> > http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/CJAGCFAF.html
> >
> > Might be ARM and PPC guys can provide you some more complete/recent
> > docs.
> Thanks.
> <snip>
> > > > > > It is good to see that now scanning/updating rte_eth_dev_data[]
> > > > > > is lock protected, but it might be not very plausible to protect
> > > > > > both data[] and next_owner_id using the same lock.
> > > > >
> > > > > I guess you mean to the owner structure in rte_eth_dev_data[port_id].
> > > > > The next_owner_id is read by ownership APIs(for owner validation),
> > > > > so it
> > > > makes sense to use the same lock.
> > > > > Actually, why not?
> > > >
> > > > Well to me next_owner_id and rte_eth_dev_data[] are not directly
> > related.
> > > > You may create new owner_id but it doesn't mean you would update
> > > > rte_eth_dev_data[] immediately.
> > > > And visa-versa - you might just want to update
> > > > rte_eth_dev_data[].name or .owner_id.
> > > > It is not very good coding practice to use same lock for non-related
> > > > data structures.
> > > >
> > > I see the relation like next:
> > > Since the ownership mechanism synchronization is in ethdev
> > > responsibility, we must protect against user mistakes as much as we can by
> > using the same lock.
> > > So, if user try to set by invalid owner (exactly the ID which currently is
> > allocated) we can protect on it.
> >
> > Hmm, not sure why you can't do same checking with different lock or atomic
> > variable?
> >
> The set ownership API is protected by ownership lock and checks the owner ID validity
> By reading the next owner ID.
> So, the owner ID allocation and set API should use the same atomic mechanism.

Sure, but all you are doing to check validity is checking that
owner_id > 0 && owner_id < next_owner_id, right?
As you don't allow owner_id overlap (16/3248 bits) you can safely do the same check
with just atomic_get(&next_owner_id).
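
A minimal sketch of that idea, assuming C11 atomics in place of rte_atomic32_t: the ID counter is allocated and validated without any lock. The names next_owner_id, owner_id_new() and owner_id_is_valid() are hypothetical, and a compare-and-swap loop is used here instead of the add-then-decrement from the earlier snippet, so the counter never moves past the 16-bit limit even briefly.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Next ID to hand out; 0 is reserved for "no owner". */
static _Atomic uint32_t next_owner_id = 1;

/* Returns a fresh ID in 1..UINT16_MAX, or -EOVERFLOW when none are left. */
int owner_id_new(void)
{
	uint32_t cur = atomic_load(&next_owner_id);

	do {
		if (cur > UINT16_MAX)
			return -EOVERFLOW;
		/* On failure, cur is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&next_owner_id, &cur, cur + 1));

	return (int)cur;
}

/* Valid means "has already been handed out": 0 < id < next_owner_id. */
bool owner_id_is_valid(uint32_t id)
{
	return id > 0 && id < atomic_load(&next_owner_id);
}
```

Because the counter only grows, an ID that passes the check once keeps passing it, which is why a single atomic read is enough for validation.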

> The set(and others) ownership APIs already uses the ownership lock so I think it makes sense to use the same lock also in ID allocation.
> 
> > > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > > rte_atomic_t should be enough.
> > > > >
> > > > > I don't think so, it is problematic in next_owner_id wraparound
> > > > > and may
> > > > complicate the code in other places which read it.
> > > >
> > > > IMO it is not that complicated, something like that should work I think.
> > > >
> > > > > /* init to 0 at startup */
> > > > > rte_atomic32_t owner_id;
> > > > >
> > > > > int new_owner_id(void)
> > > > > {
> > > > >     int32_t x;
> > > > >
> > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > >     if (x > UINT16_MAX) {
> > > > >         /* out of IDs: undo the increment and report overflow */
> > > > >         rte_atomic32_dec(&owner_id);
> > > > >         return -EOVERFLOW;
> > > > >     }
> > > > >     return x;
> > > > > }
> > > >
> > > >
> > > > > Why not just to keep it simple and using the same lock?
> > > >
> > > > Lock is also fine, I just think it better be a separate one - that
> > > > would protect just next_owner_id.
> > > > Though if you are going to use uuid here - all that probably not
> > > > relevant any more.
> > > >
> > >
> > > I agree about the uuid but still think the same lock should be used for both.
> >
> > But with uuid you don't need next_owner_id at all, right?
> > So lock will only be used for rte_eth_dev_data[] fields anyway.
> >
> Sorry, I meant uint64_t, not uuid.

Ah OK. My thought was that uuid_t is better, since with it you don't need to maintain your own code
to allocate a new owner_id and can rely on system libs instead.
But I won't insist on that.

> 
> > > > > > Another alternative would be to use 2 locks - one for
> > > > > > next_owner_id second for actual data[] protection.
> > > > > >
> > > > > > Another thing - you'll probably need to grab/release a lock
> > > > > > inside
> > > > > > rte_eth_dev_allocated() too.
> > > > > > It is a public function used by drivers, so need to be protected too.
> > > > > >
> > > > >
> > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > rte_eth_dev_allocated
> > > > > rte_eth_dev_count
> > > > > rte_eth_dev_get_name_by_port
> > > > > rte_eth_dev_get_port_by_name
> > > > > maybe more...
> > > >
> > > > As I can see in patch #3 you protect by lock access to
> > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > So I think any other public function that access
> > > > rte_eth_dev_data[].name should be protected by the same lock.
> > > >
> > >
> > > I don't think so, I can understand to use the ownership lock here(as in port
> > creation) but I don't think it is necessary too.
> > > What are we exactly protecting here?
> > > Don't you think it is just timing?(ask in the next moment and you  may
> > > get another answer) I don't see optional crash.
> >
> > Not sure what you mean here by timing...
> > As I understand rte_eth_dev_data[].name unique identifies device and is
> > used by  port allocation/release/find functions.
> > As you stated above:
> > "1. The port allocation and port release synchronization will be  managed by
> > ethdev."
> > To me it means that ethdev layer has to make sure that all accesses to
> > rte_eth_dev_data[].name are atomic.
> > Otherwise what would prevent the situation when one process does
> > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...) while
> > second one does rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> >
> The second will get True or False and that is it.

Under a race condition, in the worst case it might crash, though you'd have to be really unlucky for that.
In most cases, as you said, it would just not operate correctly.
I think that if we start to protect dev->name with a lock, we need to do it for all accesses
(both reads and writes).
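
As a rough sketch of what "all accesses" means here, assuming a simplified name table rather than the real rte_eth_dev_data[]: the writer fills a name under the lock, and the lookup takes the same lock so it can never compare against a half-written string. port_names, port_allocate() and port_find() are hypothetical names, a C11 atomic_flag spinlock stands in for rte_spinlock, and duplicate-name checks are left out for brevity.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4   /* hypothetical table size */
#define NAME_LEN 64   /* hypothetical name buffer size */

static char port_names[MAX_PORTS][NAME_LEN];
static atomic_flag names_flag = ATOMIC_FLAG_INIT;

static void names_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&names_flag,
						 memory_order_acquire))
		; /* spin */
}

static void names_unlock(void)
{
	atomic_flag_clear_explicit(&names_flag, memory_order_release);
}

/* Writer: claim the first free slot; returns the port id or -1. */
int port_allocate(const char *name)
{
	int id = -1;

	if (name == NULL || name[0] == '\0')
		return -1;

	names_lock();
	for (int i = 0; i < MAX_PORTS; i++) {
		if (port_names[i][0] == '\0') {
			snprintf(port_names[i], NAME_LEN, "%s", name);
			id = i;
			break;
		}
	}
	names_unlock();
	return id;
}

/* Reader: same lock as the writer; returns the port id or -1. */
int port_find(const char *name)
{
	int id = -1;

	if (name == NULL || name[0] == '\0')
		return -1;

	names_lock();
	for (int i = 0; i < MAX_PORTS; i++) {
		if (strcmp(port_names[i], name) == 0) {
			id = i;
			break;
		}
	}
	names_unlock();
	return id;
}
```

The answer can still be stale by the time the caller uses it, but the lookup itself always sees either the old name or the complete new one.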

> Maybe if it had been called just a moment after, It might get different answer.
> Because these APIs don't change ethdev structure(just read), it can be OK.
> But again, I can understand to use ownership lock also here.
> 

Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-15 11:45                 ` Ananyev, Konstantin
@ 2018-01-15 13:09                   ` Matan Azrad
  2018-01-15 18:43                     ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-15 13:09 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> Hi Matan,
> 
> >
> >
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > Hi Matan,
> > >
> > > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > > > Hi Matan,
> > > > >
> > > > > >
> > > > > > Hi Konstantin
> > > > > >
> > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > > > > > Hi Matan,
<snip>
> > > > > > > It is good to see that now scanning/updating
> > > > > > > rte_eth_dev_data[] is lock protected, but it might be not
> > > > > > > very plausible to protect both data[] and next_owner_id using the
> same lock.
> > > > > >
> > > > > > I guess you mean to the owner structure in
> rte_eth_dev_data[port_id].
> > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > validation), so it
> > > > > makes sense to use the same lock.
> > > > > > Actually, why not?
> > > > >
> > > > > Well to me next_owner_id and rte_eth_dev_data[] are not directly
> > > related.
> > > > > You may create new owner_id but it doesn't mean you would update
> > > > > rte_eth_dev_data[] immediately.
> > > > > And visa-versa - you might just want to update
> > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > It is not very good coding practice to use same lock for
> > > > > non-related data structures.
> > > > >
> > > > I see the relation like next:
> > > > Since the ownership mechanism synchronization is in ethdev
> > > > responsibility, we must protect against user mistakes as much as
> > > > we can by
> > > using the same lock.
> > > > So, if user try to set by invalid owner (exactly the ID which
> > > > currently is
> > > allocated) we can protect on it.
> > >
> > > Hmm, not sure why you can't do same checking with different lock or
> > > atomic variable?
> > >
> > The set ownership API is protected by ownership lock and checks the
> > owner ID validity By reading the next owner ID.
> > So, the owner ID allocation and set API should use the same atomic
> mechanism.
> 
> Sure but all you are doing for checking validity, is  check that owner_id > 0
> && owner_id < next_owner_id, right?
> As you don't allow owner_id overlap (16/3248 bits) you can safely do same
> check with just atomic_get(&next_owner_id).
> 
It will not protect against this scenario:
- The current next_id is X.
- Thread 0 calls set ownership of port A with owner ID X (by user mistake).
- Context switch.
- Thread 1 allocates a new ID, gets X, and atomically changes next_id to X+1.
- Context switch.
- Thread 0 validates X with atomic_read and succeeds in taking ownership.
- The system has lost the port (or it will be managed by two entities) - crash.


> > The set(and others) ownership APIs already uses the ownership lock so I
> think it makes sense to use the same lock also in ID allocation.
> >
> > > > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > > > rte_atomic_t should be enough.
> > > > > >
> > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > wraparound and may
> > > > > complicate the code in other places which read it.
> > > > >
> > > > > IMO it is not that complicated, something like that should work I think.
> > > > >
> > > > > > /* init to 0 at startup */
> > > > > > rte_atomic32_t owner_id;
> > > > > >
> > > > > > int new_owner_id(void)
> > > > > > {
> > > > > >     int32_t x;
> > > > > >
> > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > >     if (x > UINT16_MAX) {
> > > > > >         /* out of IDs: undo the increment and report overflow */
> > > > > >         rte_atomic32_dec(&owner_id);
> > > > > >         return -EOVERFLOW;
> > > > > >     }
> > > > > >     return x;
> > > > > > }
> > > > >
> > > > >
> > > > > > Why not just to keep it simple and using the same lock?
> > > > >
> > > > > Lock is also fine, I just think it better be a separate one -
> > > > > that would protect just next_owner_id.
> > > > > Though if you are going to use uuid here - all that probably not
> > > > > relevant any more.
> > > > >
> > > >
> > > > I agree about the uuid but still think the same lock should be used for
> both.
> > >
> > > But with uuid you don't need next_owner_id at all, right?
> > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > >
> > Sorry, I meant uint64_t, not uuid.
> 
> Ah ok, my thought uuid_t is better as with it you don't need to support your
> own code to allocate new owner_id, but rely on system libs instead.
> But wouldn't insist here.
> 
> >
> > > > > > > Another alternative would be to use 2 locks - one for
> > > > > > > next_owner_id second for actual data[] protection.
> > > > > > >
> > > > > > > Another thing - you'll probably need to grab/release a lock
> > > > > > > inside
> > > > > > > rte_eth_dev_allocated() too.
> > > > > > > It is a public function used by drivers, so need to be protected too.
> > > > > > >
> > > > > >
> > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > rte_eth_dev_allocated
> > > > > > rte_eth_dev_count
> > > > > > rte_eth_dev_get_name_by_port
> > > > > > rte_eth_dev_get_port_by_name
> > > > > > maybe more...
> > > > >
> > > > > As I can see in patch #3 you protect by lock access to
> > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > So I think any other public function that access
> > > > > rte_eth_dev_data[].name should be protected by the same lock.
> > > > >
> > > >
> > > > I don't think so, I can understand to use the ownership lock
> > > > here(as in port
> > > creation) but I don't think it is necessary too.
> > > > What are we exactly protecting here?
> > > > Don't you think it is just timing?(ask in the next moment and you
> > > > may get another answer) I don't see optional crash.
> > >
> > > Not sure what you mean here by timing...
> > > As I understand rte_eth_dev_data[].name unique identifies device and
> > > is used by  port allocation/release/find functions.
> > > As you stated above:
> > > "1. The port allocation and port release synchronization will be
> > > managed by ethdev."
> > > To me it means that ethdev layer has to make sure that all accesses
> > > to rte_eth_dev_data[].name are atomic.
> > > Otherwise what would prevent the situation when one process does
> > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...)
> > > while second one does
> rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > >
> > The second will get True or False and that is it.
> 
> Under race condition - in the worst case it might crash, though for that you'll
> have to be really unlucky.
> Though in most cases as you said it would just not operate correctly.
> I think if we start to protect dev->name by lock we need to do it for all
> instances (both read and write).
> 
Since, under the ownership rules, the user must take ownership of a port before using it, I still don't see a problem here.
Could you please describe a specific crash scenario and explain how locking would fix it?

> > Maybe if it had been called just a moment after, It might get different
> answer.
> > Because these APIs don't change ethdev structure(just read), it can be OK.
> > But again, I can understand to use ownership lock also here.
> >
> 
> Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-15 13:09                   ` Matan Azrad
@ 2018-01-15 18:43                     ` Ananyev, Konstantin
  2018-01-16  8:04                       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-15 18:43 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > Hi Matan,
> >
> > >
> > >
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > > Hi Matan,
> > > >
> > > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > >
> > > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018 3:36 PM
> > > > > > > > Hi Matan,
> <snip>
> > > > > > > > It is good to see that now scanning/updating
> > > > > > > > rte_eth_dev_data[] is lock protected, but it might be not
> > > > > > > > very plausible to protect both data[] and next_owner_id using the
> > same lock.
> > > > > > >
> > > > > > > I guess you mean to the owner structure in
> > rte_eth_dev_data[port_id].
> > > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > > validation), so it
> > > > > > makes sense to use the same lock.
> > > > > > > Actually, why not?
> > > > > >
> > > > > > Well to me next_owner_id and rte_eth_dev_data[] are not directly
> > > > related.
> > > > > > You may create new owner_id but it doesn't mean you would update
> > > > > > rte_eth_dev_data[] immediately.
> > > > > > And visa-versa - you might just want to update
> > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > It is not very good coding practice to use same lock for
> > > > > > non-related data structures.
> > > > > >
> > > > > I see the relation like next:
> > > > > Since the ownership mechanism synchronization is in ethdev
> > > > > responsibility, we must protect against user mistakes as much as
> > > > > we can by
> > > > using the same lock.
> > > > > So, if user try to set by invalid owner (exactly the ID which
> > > > > currently is
> > > > allocated) we can protect on it.
> > > >
> > > > Hmm, not sure why you can't do same checking with different lock or
> > > > atomic variable?
> > > >
> > > The set ownership API is protected by ownership lock and checks the
> > > owner ID validity By reading the next owner ID.
> > > So, the owner ID allocation and set API should use the same atomic
> > mechanism.
> >
> > Sure but all you are doing for checking validity, is  check that owner_id > 0
> > && owner_id < next_owner_id, right?
> > As you don't allow owner_id overlap (16/3248 bits) you can safely do same
> > check with just atomic_get(&next_owner_id).
> >
> It will not protect it, scenario:
> - current next_id is X.
> - call set ownership of port A with owner id X by thread 0(by user mistake).
> - context switch
> - allocate new id by thread 1 and get X and change next_id to X+1 atomically.
> -  context switch
> - Thread 0 validate X by atomic_read and succeed to take ownership.
> - The system loosed the port(or will be managed by two entities) - crash.


OK, and how would using a lock protect you in such a scenario?
I don't think you can protect yourself against such a scenario with or without locking,
unless you make it harder for the misbehaving thread to guess a valid owner_id
or add some extra logic here.
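
For reference, the lock-free allocator discussed in this thread (the quoted new_owner_id() snippet) could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses C11 <stdatomic.h> instead of DPDK's rte_atomic32 API so that it compiles standalone, it swaps the add-then-decrement of the quoted snippet for a compare-and-swap loop so the counter never goes past the limit, and the variable and function names are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counter; zero-initialized, so the first ID handed out is 1. */
static atomic_uint_fast32_t next_owner_id;

/* Returns a fresh ID in [1, UINT16_MAX], or -EOVERFLOW once IDs run out. */
static int new_owner_id(void)
{
    uint_fast32_t cur = atomic_load(&next_owner_id);

    do {
        /* Refuse before incrementing, so the counter never overshoots. */
        if (cur >= UINT16_MAX)
            return -EOVERFLOW;
        /* On failure the CAS reloads cur and the limit is checked again. */
    } while (!atomic_compare_exchange_weak(&next_owner_id, &cur, cur + 1));

    return (int)(cur + 1);
}
```

This only makes allocation itself atomic. It does nothing about a caller that passes an ID it never allocated, which is the case argued about above.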

> 
> 
> > > The set(and others) ownership APIs already uses the ownership lock so I
> > think it makes sense to use the same lock also in ID allocation.
> > >
> > > > > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > > > > rte_atomic_t should be enough.
> > > > > > >
> > > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > > wraparound and may
> > > > > > complicate the code in other places which read it.
> > > > > >
> > > > > > IMO it is not that complicated, something like that should work I think.
> > > > > >
> > > > > > /* init to 0 at startup*/
> > > > > > rte_atomic32_t *owner_id;
> > > > > >
> > > > > > int new_owner_id(void)
> > > > > > {
> > > > > >     int32_t x;
> > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > >     if (x > UINT16_MAX) {
> > > > > >        rte_atomic32_dec(&owner_id);
> > > > > >        return -EOVERWLOW;
> > > > > >     } else
> > > > > >         return x;
> > > > > > }
> > > > > >
> > > > > >
> > > > > > > Why not just to keep it simple and using the same lock?
> > > > > >
> > > > > > Lock is also fine, I just think it better be a separate one -
> > > > > > that would protext just next_owner_id.
> > > > > > Though if you are going to use uuid here - all that probably not
> > > > > > relevant any more.
> > > > > >
> > > > >
> > > > > I agree about the uuid but still think the same lock should be used for
> > both.
> > > >
> > > > But with uuid you don't need next_owner_id at all, right?
> > > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > > >
> > > Sorry, I meant uint64_t, not uuid.
> >
> > Ah ok, my thought uuid_t is better as with it you don't need to support your
> > own code to allocate new owner_id, but rely on system libs instead.
> > But wouldn't insist here.
> >
> > >
> > > > > > > > Another alternative would be to use 2 locks - one for
> > > > > > > > next_owner_id second for actual data[] protection.
> > > > > > > >
> > > > > > > > Another thing - you'll probably need to grab/release a lock
> > > > > > > > inside
> > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > It is a public function used by drivers, so need to be protected too.
> > > > > > > >
> > > > > > >
> > > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > > rte_eth_dev_allocated
> > > > > > > rte_eth_dev_count
> > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > maybe more...
> > > > > >
> > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > > So I think any other public function that access
> > > > > > rte_eth_dev_data[].name should be protected by the same lock.
> > > > > >
> > > > >
> > > > > I don't think so, I can understand to use the ownership lock
> > > > > here(as in port
> > > > creation) but I don't think it is necessary too.
> > > > > What are we exactly protecting here?
> > > > > Don't you think it is just timing?(ask in the next moment and you
> > > > > may get another answer) I don't see optional crash.
> > > >
> > > > Not sure what you mean here by timing...
> > > > As I understand rte_eth_dev_data[].name unique identifies device and
> > > > is used by  port allocation/release/find functions.
> > > > As you stated above:
> > > > "1. The port allocation and port release synchronization will be
> > > > managed by ethdev."
> > > > To me it means that ethdev layer has to make sure that all accesses
> > > > to rte_eth_dev_data[].name are atomic.
> > > > Otherwise what would prevent the situation when one process does
> > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...)
> > > > while second one does
> > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > >
> > > The second will get True or False and that is it.
> >
> > Under race condition - in the worst case it might crash, though for that you'll
> > have to be really unlucky.
> > Though in most cases as you said it would just not operate correctly.
> > I think if we start to protect dev->name by lock we need to do it for all
> > instances (both read and write).
> >
> Since under the ownership rules, the user must take ownership of a port before using it, I still don't see a problem here.

I am not talking about owner id or name here.
I am talking about dev->name.

> Please, Can you describe specific crash scenario and explain how could the locking fix it?

Let's say thread 0 is doing rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...),
while thread 1 is doing rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
Because of the race condition, rte_eth_dev_allocated() may return the rte_eth_dev *
for the wrong device.
Then rte_pmd_ring_remove() will call rte_free() on the related resources while
they may still be in use by someone else.
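
A minimal sketch of what "protect every read and write of the name with the same lock" means, assuming a simplified stand-in for rte_eth_dev_data[]. The array, the lock, and the functions port_allocate() and port_find() are hypothetical and use a pthread mutex instead of any DPDK lock; they are not the ethdev implementation.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4
#define NAME_LEN  64

/* Hypothetical stand-in for rte_eth_dev_data[]: an empty name marks a free slot. */
static char port_name[MAX_PORTS][NAME_LEN];
static pthread_mutex_t name_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: claim a free slot and fill in its name while holding the lock. */
static int port_allocate(const char *name)
{
    int id = -1;

    pthread_mutex_lock(&name_lock);
    for (int i = 0; i < MAX_PORTS; i++) {
        if (port_name[i][0] == '\0') {
            snprintf(port_name[i], NAME_LEN, "%s", name);
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&name_lock);
    return id;
}

/* Reader: compare names under the same lock, so it never sees a half-written one. */
static int port_find(const char *name)
{
    int id = -1;

    pthread_mutex_lock(&name_lock);
    for (int i = 0; i < MAX_PORTS; i++) {
        if (port_name[i][0] != '\0' && strcmp(port_name[i], name) == 0) {
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&name_lock);
    return id;
}
```

With both paths serialized, the strcmp() in the lookup cannot run while snprintf() is in the middle of writing the same slot.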
Konstantin

> 
> > > Maybe if it had been called just a moment after, It might get different
> > answer.
> > > Because these APIs don't change ethdev structure(just read), it can be OK.
> > > But again, I can understand to use ownership lock also here.
> > >
> >
> > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership Matan Azrad
  2018-01-08 11:39     ` Gaëtan Rivet
@ 2018-01-16  5:53     ` Lu, Wenzhuo
  2018-01-16  8:15       ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Lu, Wenzhuo @ 2018-01-16  5:53 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Matan,


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Sunday, January 7, 2018 5:46 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> ownership
> 
> Testpmd should not use ethdev ports which are managed by other DPDK
> entities.
> 
> Set Testpmd ownership to each port which is not used by other entity and
> prevent any usage of ethdev ports which are not owned by Testpmd.
Sorry, I haven't followed the whole discussion as there is too much of it, so this may be a silly question.
Testpmd already has the parameter "--pci-whitelist" to use only the assigned devices. When this parameter is used, all the devices are owned by the current app, so I don't see why we need to set/check the ownership.
BTW, in this patch all the changes seem to be for ownership checking. I can't find the code that sets the ownership. Am I missing something?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-15 18:43                     ` Ananyev, Konstantin
@ 2018-01-16  8:04                       ` Matan Azrad
  2018-01-16 19:11                         ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-16  8:04 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Konstantin
From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> Hi Matan,
> > Hi Konstantin
> > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > Hi Matan,
> > > > Hi Konstantin
> > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > > > Hi Matan,
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018
> > > > > > > > 3:36 PM
> > > > > > > > > Hi Matan,
 <snip>
> > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > rte_eth_dev_data[] is lock protected, but it might be
> > > > > > > > > not very plausible to protect both data[] and
> > > > > > > > > next_owner_id using the
> > > same lock.
> > > > > > > >
> > > > > > > > I guess you mean to the owner structure in
> > > rte_eth_dev_data[port_id].
> > > > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > > > validation), so it
> > > > > > > makes sense to use the same lock.
> > > > > > > > Actually, why not?
> > > > > > >
> > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are not
> > > > > > > directly
> > > > > related.
> > > > > > > You may create new owner_id but it doesn't mean you would
> > > > > > > update rte_eth_dev_data[] immediately.
> > > > > > > And visa-versa - you might just want to update
> > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > It is not very good coding practice to use same lock for
> > > > > > > non-related data structures.
> > > > > > >
> > > > > > I see the relation like next:
> > > > > > Since the ownership mechanism synchronization is in ethdev
> > > > > > responsibility, we must protect against user mistakes as much
> > > > > > as we can by
> > > > > using the same lock.
> > > > > > So, if user try to set by invalid owner (exactly the ID which
> > > > > > currently is
> > > > > allocated) we can protect on it.
> > > > >
> > > > > Hmm, not sure why you can't do same checking with different lock
> > > > > or atomic variable?
> > > > >
> > > > The set ownership API is protected by ownership lock and checks
> > > > the owner ID validity By reading the next owner ID.
> > > > So, the owner ID allocation and set API should use the same atomic
> > > mechanism.
> > >
> > > Sure but all you are doing for checking validity, is  check that
> > > owner_id > 0 &&& owner_id < next_ownwe_id, right?
> > > As you don't allow owner_id overlap (16/3248 bits) you can safely do
> > > same check with just atomic_get(&next_owner_id).
> > >
> > It will not protect it, scenario:
> > - current next_id is X.
> > - call set ownership of port A with owner id X by thread 0(by user mistake).
> > - context switch 
> > - allocate new id by thread 1 and get X and change next_id to X+1
> atomically.
> > -  context switch
> > - Thread 0 validate X by atomic_read and succeed to take ownership.
> > - The system loosed the port(or will be managed by two entities) - crash.
> 
> 
> Ok, and how using lock will protect you with such scenario?

The owner set API validation by thread 0 should fail because the owner validation is included in the protected section.

> I don't think you can protect yourself against such scenario with or without
> locking.
> Unless you'll make it harder for the mis-behaving thread to guess valid
> owner_id, or add some extra logic here.
> 
> >
> >
> > > > The set(and others) ownership APIs already uses the ownership lock
> > > > so I
> > > think it makes sense to use the same lock also in ID allocation.
> > > >
> > > > > > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > > > > > rte_atomic_t should be enough.
> > > > > > > >
> > > > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > > > wraparound and may
> > > > > > > complicate the code in other places which read it.
> > > > > > >
> > > > > > > IMO it is not that complicated, something like that should work I
> think.
> > > > > > >
> > > > > > > /* init to 0 at startup*/
> > > > > > > rte_atomic32_t *owner_id;
> > > > > > >
> > > > > > > int new_owner_id(void)
> > > > > > > {
> > > > > > >     int32_t x;
> > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > >     if (x > UINT16_MAX) {
> > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > >        return -EOVERWLOW;
> > > > > > >     } else
> > > > > > >         return x;
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > >
> > > > > > > Lock is also fine, I just think it better be a separate one
> > > > > > > - that would protext just next_owner_id.
> > > > > > > Though if you are going to use uuid here - all that probably
> > > > > > > not relevant any more.
> > > > > > >
> > > > > >
> > > > > > I agree about the uuid but still think the same lock should be
> > > > > > used for
> > > both.
> > > > >
> > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > > > >
> > > > Sorry, I meant uint64_t, not uuid.
> > >
> > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > support your own code to allocate new owner_id, but rely on system libs
> instead.
> > > But wouldn't insist here.
> > >
> > > >
> > > > > > > > > Another alternative would be to use 2 locks - one for
> > > > > > > > > next_owner_id second for actual data[] protection.
> > > > > > > > >
> > > > > > > > > Another thing - you'll probably need to grab/release a
> > > > > > > > > lock inside
> > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > It is a public function used by drivers, so need to be protected
> too.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > > > rte_eth_dev_allocated
> > > > > > > > rte_eth_dev_count
> > > > > > > > rte_eth_dev_get_name_by_port
> rte_eth_dev_get_port_by_name
> > > > > > > > maybe more...
> > > > > > >
> > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > > > So I think any other public function that access
> > > > > > > rte_eth_dev_data[].name should be protected by the same lock.
> > > > > > >
> > > > > >
> > > > > > I don't think so, I can understand to use the ownership lock
> > > > > > here(as in port
> > > > > creation) but I don't think it is necessary too.
> > > > > > What are we exactly protecting here?
> > > > > > Don't you think it is just timing?(ask in the next moment and
> > > > > > you may get another answer) I don't see optional crash.
> > > > >
> > > > > Not sure what you mean here by timing...
> > > > > As I understand rte_eth_dev_data[].name unique identifies device
> > > > > and is used by  port allocation/release/find functions.
> > > > > As you stated above:
> > > > > "1. The port allocation and port release synchronization will be
> > > > > managed by ethdev."
> > > > > To me it means that ethdev layer has to make sure that all
> > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > Otherwise what would prevent the situation when one process does
> > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...)
> > > > > while second one does
> > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > >
> > > > The second will get True or False and that is it.
> > >
> > > Under race condition - in the worst case it might crash, though for
> > > that you'll have to be really unlucky.
> > > Though in most cases as you said it would just not operate correctly.
> > > I think if we start to protect dev->name by lock we need to do it
> > > for all instances (both read and write).
> > >
> > Since under the ownership rules, the user must take ownership of a port
> before using it, I still don't see a problem here.
> 
> I am not talking about owner id or name here.
> I am talking about dev->name.
> 
So? The user should still take ownership of a device before using it (by name or by port id).
It can read the device without owning it, but it cannot manage it.

> > Please, Can you describe specific crash scenario and explain how could the
> locking fix it?
> 
> Let say thread 0 doing rte_eth_dev_allocate()-
> >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> And because of race condition - rte_eth_dev_allocated() will return
> rte_eth_dev * for the wrong device.
Which wrong device do you mean? I guess it is the device which currently is being created by thread 0.
> Then rte_pmd_ring_remove() will call rte_free() for related resources, while
> It can still be in use by someone else.
The caller of rte_pmd_ring_remove() (some DPDK entity) must take ownership of the port (or validate that it is the owner) before freeing or releasing it, so there is no issue here.


Also, I'm not sure I fully understand your scenario. It looks like moving the device state setting in allocation to after the name setting would be good.
What do you think? 
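
A rough sketch of that ordering idea, assuming a simplified port table. The struct, the state values, and the functions port_publish() and port_lookup() are hypothetical, and C11 atomics stand in for whatever barrier ethdev would actually use.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

enum { DEV_UNUSED = 0, DEV_ATTACHED = 1 };   /* hypothetical states */

struct port {
    char name[64];
    atomic_int state;
};

static struct port ports[4];

/* Writer: fill in the name first, then publish the slot with a release store. */
static void port_publish(struct port *p, const char *name)
{
    snprintf(p->name, sizeof(p->name), "%s", name);
    atomic_store_explicit(&p->state, DEV_ATTACHED, memory_order_release);
}

/* Reader: look at a name only after an acquire load has seen the slot attached. */
static struct port *port_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(ports) / sizeof(ports[0]); i++) {
        struct port *p = &ports[i];

        if (atomic_load_explicit(&p->state, memory_order_acquire) != DEV_ATTACHED)
            continue;
        if (strcmp(p->name, name) == 0)
            return p;
    }
    return NULL;
}
```

This covers only the allocation side. A slot that is released and reused while a reader is still comparing its name can still race, so the ordering alone is not a replacement for a lock.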

> Konstantin
> 
> >
> > > > Maybe if it had been called just a moment after, It might get
> > > > different
> > > answer.
> > > > Because these APIs don't change ethdev structure(just read), it can be
> OK.
> > > > But again, I can understand to use ownership lock also here.
> > > >
> > >
> > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-16  5:53     ` Lu, Wenzhuo
@ 2018-01-16  8:15       ` Matan Azrad
  2018-01-17  0:46         ` Lu, Wenzhuo
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-16  8:15 UTC (permalink / raw)
  To: Lu, Wenzhuo, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Lu
From: Lu, Wenzhuo, Tuesday, January 16, 2018 7:54 AM
> Hi Matan,
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > Sent: Sunday, January 7, 2018 5:46 PM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Subject: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> > ownership
> >
> > Testpmd should not use ethdev ports which are managed by other DPDK
> > entities.
> >
> > Set Testpmd ownership to each port which is not used by other entity
> > and prevent any usage of ethdev ports which are not owned by Testpmd.
> Sorry I don't follow all the discussion as there's too much. So it may be a silly
> question.

No problem, I'm here for any question :)

> Testpmd already has the parameter " --pci-whitelist" to only use the assigned
> devices.

It is an EAL parameter, no? It just tells EAL which devices to create.

> When using this parameter, all the devices are owned by the current
> APP.

No, what about vdevs? A vdev may manage devices (even whitelisted PCI devices) by itself and want to prevent any application from using them (see the fail-safe PMD).

 > So I don't know why need to set/check the ownership.
> BTW, in this patch, seem all the change is for ownership checking. I don't find
> the setting code. Do I miss something?

Yes, see in main function (the first FOREACH).
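
In outline, the pattern is "claim every port nobody owns at startup, then refuse to touch anything not owned". The sketch below models it with a plain array. The names port_owner, claim_unowned_ports() and port_usable() are hypothetical and are not the testpmd or ethdev code.

```c
#include <stdint.h>

#define NB_PORTS 4
#define NO_OWNER 0

/* Hypothetical per-port owner table; NO_OWNER means the port is free. */
static uint64_t port_owner[NB_PORTS];

/* At startup: claim every port that nobody owns yet; returns how many were claimed. */
static int claim_unowned_ports(uint64_t app_id)
{
    int claimed = 0;

    for (int p = 0; p < NB_PORTS; p++) {
        if (port_owner[p] == NO_OWNER) {
            port_owner[p] = app_id;
            claimed++;
        }
    }
    return claimed;
}

/* Before every later operation: only ports owned by the application are usable. */
static int port_usable(int p, uint64_t app_id)
{
    return p >= 0 && p < NB_PORTS && port_owner[p] == app_id;
}
```

A port that a vdev such as fail-safe has already taken keeps its owner, so the application skips it even if it was whitelisted.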

Thanks, Matan.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-16  8:04                       ` Matan Azrad
@ 2018-01-16 19:11                         ` Ananyev, Konstantin
  2018-01-16 20:32                           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-16 19:11 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Matan,

> 
> Hi Konstantin
> From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > Hi Matan,
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > Hi Matan,
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > > > > Hi Matan,
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018
> > > > > > > > > 3:36 PM
> > > > > > > > > > Hi Matan,
>  <snip>
> > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > rte_eth_dev_data[] is lock protected, but it might be
> > > > > > > > > > not very plausible to protect both data[] and
> > > > > > > > > > next_owner_id using the
> > > > same lock.
> > > > > > > > >
> > > > > > > > > I guess you mean to the owner structure in
> > > > rte_eth_dev_data[port_id].
> > > > > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > > > > validation), so it
> > > > > > > > makes sense to use the same lock.
> > > > > > > > > Actually, why not?
> > > > > > > >
> > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are not
> > > > > > > > directly
> > > > > > related.
> > > > > > > > You may create new owner_id but it doesn't mean you would
> > > > > > > > update rte_eth_dev_data[] immediately.
> > > > > > > > And visa-versa - you might just want to update
> > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > It is not very good coding practice to use same lock for
> > > > > > > > non-related data structures.
> > > > > > > >
> > > > > > > I see the relation like next:
> > > > > > > Since the ownership mechanism synchronization is in ethdev
> > > > > > > responsibility, we must protect against user mistakes as much
> > > > > > > as we can by
> > > > > > using the same lock.
> > > > > > > So, if user try to set by invalid owner (exactly the ID which
> > > > > > > currently is
> > > > > > allocated) we can protect on it.
> > > > > >
> > > > > > Hmm, not sure why you can't do same checking with different lock
> > > > > > or atomic variable?
> > > > > >
> > > > > The set ownership API is protected by ownership lock and checks
> > > > > the owner ID validity By reading the next owner ID.
> > > > > So, the owner ID allocation and set API should use the same atomic
> > > > mechanism.
> > > >
> > > > Sure but all you are doing for checking validity, is  check that
> > > > owner_id > 0 &&& owner_id < next_ownwe_id, right?
> > > > As you don't allow owner_id overlap (16/3248 bits) you can safely do
> > > > same check with just atomic_get(&next_owner_id).
> > > >
> > > It will not protect it, scenario:
> > > - current next_id is X.
> > > - call set ownership of port A with owner id X by thread 0(by user mistake).
> > > - context switch
> > > - allocate new id by thread 1 and get X and change next_id to X+1
> > atomically.
> > > -  context switch
> > > - Thread 0 validate X by atomic_read and succeed to take ownership.
> > > - The system loosed the port(or will be managed by two entities) - crash.
> >
> >
> > Ok, and how using lock will protect you with such scenario?
> 
> The owner set API validation by thread 0 should fail because the owner validation is included in the protected section.

Then your validation function would fail even if you use atomic ops instead of a lock.
But in fact your code is not protected against that scenario, no matter whether you use a lock or atomic ops.
Let's consider your current code with the following scenario:

next_owner_id == 1
1) Process 0:
     rte_eth_dev_owner_new(&owner_id);
   Now owner_id == 1 and next_owner_id == 2.
2) Process 1 (by mistake):
     rte_eth_dev_owner_set(port_id=1, owner->id=1);
   It will complete successfully, as owner_id == 1 is considered valid.
3) Process 0:
     rte_eth_dev_owner_set(port_id=1, owner->id=1);
   It will also complete successfully, as owner->id is valid and equal to the current port owner_id.
So you end up with 2 processes, each assuming that it exclusively owns the same port.

Honestly, in that situation locking around next_owner_id wouldn't give you any advantage
over atomic ops.
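
The three steps above can be replayed against a small model. This is a sketch under assumptions: owner_new() and owner_set() are hypothetical stand-ins for the patch's ownership calls, a pthread mutex stands in for the ownership lock, and ID wraparound is ignored.

```c
#include <pthread.h>
#include <stdint.h>

#define NO_OWNER 0

static pthread_mutex_t owner_lock = PTHREAD_MUTEX_INITIALIZER;
static uint16_t next_owner_id = 1;
static uint16_t port_owner[4];   /* NO_OWNER means the port is free */

/* Allocate a new owner ID under the lock. */
static uint16_t owner_new(void)
{
    pthread_mutex_lock(&owner_lock);
    uint16_t id = next_owner_id++;
    pthread_mutex_unlock(&owner_lock);
    return id;
}

/* Take a port: the validity check and the update share one critical section. */
static int owner_set(unsigned int port, uint16_t id)
{
    int ret = -1;

    pthread_mutex_lock(&owner_lock);
    if (id != NO_OWNER && id < next_owner_id &&
        (port_owner[port] == NO_OWNER || port_owner[port] == id)) {
        port_owner[port] = id;
        ret = 0;
    }
    pthread_mutex_unlock(&owner_lock);
    return ret;
}
```

Calling owner_new() once and then owner_set(1, 1) from two different callers succeeds both times, even though every step is lock protected. The check can only tell whether an ID has been allocated, not who allocated it.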

> 
> > I don't think you can protect yourself against such scenario with or without
> > locking.
> > Unless you'll make it harder for the mis-behaving thread to guess valid
> > owner_id, or add some extra logic here.
> >
> > >
> > >
> > > > > The set(and others) ownership APIs already uses the ownership lock
> > > > > so I
> > > > think it makes sense to use the same lock also in ID allocation.
> > > > >
> > > > > > > > > > In fact, for next_owner_id, you don't need a lock - just
> > > > > > > > > > rte_atomic_t should be enough.
> > > > > > > > >
> > > > > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > > > > wraparound and may
> > > > > > > > complicate the code in other places which read it.
> > > > > > > >
> > > > > > > > IMO it is not that complicated, something like that should work I
> > think.
> > > > > > > >
> > > > > > > > /* init to 0 at startup*/
> > > > > > > > rte_atomic32_t *owner_id;
> > > > > > > >
> > > > > > > > int new_owner_id(void)
> > > > > > > > {
> > > > > > > >     int32_t x;
> > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > >        return -EOVERWLOW;
> > > > > > > >     } else
> > > > > > > >         return x;
> > > > > > > > }
> > > > > > > >
> > > > > > > >
> > > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > > >
> > > > > > > > Lock is also fine, I just think it better be a separate one
> > > > > > > > - that would protext just next_owner_id.
> > > > > > > > Though if you are going to use uuid here - all that probably
> > > > > > > > not relevant any more.
> > > > > > > >
> > > > > > >
> > > > > > > I agree about the uuid but still think the same lock should be
> > > > > > > used for
> > > > both.
> > > > > >
> > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > > > > >
> > > > > Sorry, I meant uint64_t, not uuid.
> > > >
> > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > support your own code to allocate new owner_id, but rely on system libs
> > instead.
> > > > But wouldn't insist here.
> > > >
> > > > >
> > > > > > > > > > Another alternative would be to use 2 locks - one for
> > > > > > > > > > next_owner_id second for actual data[] protection.
> > > > > > > > > >
> > > > > > > > > > Another thing - you'll probably need to grab/release a
> > > > > > > > > > lock inside
> > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > It is a public function used by drivers, so need to be protected
> > too.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > rte_eth_dev_count
> > > > > > > > > rte_eth_dev_get_name_by_port
> > rte_eth_dev_get_port_by_name
> > > > > > > > > maybe more...
> > > > > > > >
> > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > > > > So I think any other public function that access
> > > > > > > > rte_eth_dev_data[].name should be protected by the same lock.
> > > > > > > >
> > > > > > >
> > > > > > > I don't think so, I can understand to use the ownership lock
> > > > > > > here(as in port
> > > > > > creation) but I don't think it is necessary too.
> > > > > > > What are we exactly protecting here?
> > > > > > > Don't you think it is just timing?(ask in the next moment and
> > > > > > > you may get another answer) I don't see optional crash.
> > > > > >
> > > > > > Not sure what you mean here by timing...
> > > > > > As I understand rte_eth_dev_data[].name unique identifies device
> > > > > > and is used by  port allocation/release/find functions.
> > > > > > As you stated above:
> > > > > > "1. The port allocation and port release synchronization will be
> > > > > > managed by ethdev."
> > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > Otherwise what would prevent the situation when one process does
> > > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name, ...)
> > > > > > while second one does
> > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > >
> > > > > The second will get True or False and that is it.
> > > >
> > > > Under race condition - in the worst case it might crash, though for
> > > > that you'll have to be really unlucky.
> > > > Though in most cases as you said it would just not operate correctly.
> > > > I think if we start to protect dev->name by lock we need to do it
> > > > for all instances (both read and write).
> > > >
> > > Since under the ownership rules, the user must take ownership of a port
> > before using it, I still don't see a problem here.
> >
> > I am not talking about owner id or name here.
> > I am talking about dev->name.
> >
> So? The user still should take ownership of a device before using it (by name or by port id).
> It can just read it without owning it, but no managing it.
> 
> > > Please, Can you describe specific crash scenario and explain how could the
> > locking fix it?
> >
> > Let say thread 0 doing rte_eth_dev_allocate()-
> > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > And because of race condition - rte_eth_dev_allocated() will return
> > rte_eth_dev * for the wrong device.
> Which wrong device do you mean? I guess it is the device which currently is being created by thread 0.
> > Then rte_pmd_ring_remove() will call rte_free() for related resources, while
> > It can still be in use by someone else.
> The rte_pmd_ring_remove caller(some DPDK entity) must take ownership (or validate that he is the owner) of a port before doing it(free,
> release), so no issue here.

Forget about ownership for a second.
Suppose we have a process that created a ring port for itself (without setting any ownership) and used it for some time.
Then it decides to remove it, so it calls rte_pmd_ring_remove() for it.
At the same time a second process decides to call rte_eth_dev_allocate() (let's say for another ring port).
They could collide trying to read (process 0) and modify (process 1) the same string, rte_eth_dev_data[].name.

Konstantin 

> 
> 
> Also I'm not sure I fully understand your scenario looks like moving the device state setting in allocation to be after the name setting will be
> good.
> What do you think?
> 
> > Konstantin
> >
> > >
> > > > > Maybe if it had been called just a moment after, It might get
> > > > > different
> > > > answer.
> > > > > Because these APIs don't change ethdev structure(just read), it can be
> > OK.
> > > > > But again, I can understand to use ownership lock also here.
> > > > >
> > > >
> > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-16 19:11                         ` Ananyev, Konstantin
@ 2018-01-16 20:32                           ` Matan Azrad
  2018-01-17 11:24                             ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-16 20:32 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



Hi Konstantin

From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> Hi Matan,
> 
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > Hi Matan,
> > > > Hi Konstantin
> > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > Hi Matan,
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018
> > > > > > > > > > 3:36 PM
> > > > > > > > > > > Hi Matan,
> >  <snip>
> > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it might
> > > > > > > > > > > be not very plausible to protect both data[] and
> > > > > > > > > > > next_owner_id using the
> > > > > same lock.
> > > > > > > > > >
> > > > > > > > > > I guess you mean to the owner structure in
> > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > > > > > validation), so it
> > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > Actually, why not?
> > > > > > > > >
> > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are not
> > > > > > > > > directly
> > > > > > > related.
> > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > It is not very good coding practice to use same lock for
> > > > > > > > > non-related data structures.
> > > > > > > > >
> > > > > > > > I see the relation like next:
> > > > > > > > Since the ownership mechanism synchronization is in ethdev
> > > > > > > > responsibility, we must protect against user mistakes as
> > > > > > > > much as we can by
> > > > > > > using the same lock.
> > > > > > > > So, if user try to set by invalid owner (exactly the ID
> > > > > > > > which currently is
> > > > > > > allocated) we can protect on it.
> > > > > > >
> > > > > > > Hmm, not sure why you can't do same checking with different
> > > > > > > lock or atomic variable?
> > > > > > >
> > > > > > The set ownership API is protected by ownership lock and
> > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > So, the owner ID allocation and set API should use the same
> > > > > > atomic
> > > > > mechanism.
> > > > >
> > > > > > Sure, but all you are doing to check validity is checking that
> > > > > > owner_id > 0 && owner_id < next_owner_id, right?
> > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > >
> > > > It will not protect it, scenario:
> > > > - current next_id is X.
> > > > - call set ownership of port A with owner id X by thread 0(by user
> mistake).
> > > > - context switch
> > > > - allocate new id by thread 1 and get X and change next_id to X+1
> > > atomically.
> > > > -  context switch
> > > > - Thread 0 validate X by atomic_read and succeed to take ownership.
> > > > - The system loosed the port(or will be managed by two entities) -
> crash.
> > >
> > >
> > > Ok, and how using lock will protect you with such scenario?
> >
> > The owner set API validation by thread 0 should fail because the owner
> validation is included in the protected section.
> 
> Then your validation function would fail even if you'll use atomic ops instead
> of lock.
No.
With an atomic, this specific scenario lets the validation pass.
With a lock, next_id cannot change while the thread is inside the set API.
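
A minimal sketch of the locked variant, assuming a single ownership lock that covers both ID allocation and the validate-then-set step. The names owner_new, owner_set and port_owner are hypothetical, not the ethdev API, and a pthread mutex stands in for the ownership lock:

```c
#include <pthread.h>
#include <stdint.h>

#define NO_OWNER 0
#define MAX_PORTS 4

static pthread_mutex_t ownership_lock = PTHREAD_MUTEX_INITIALIZER;
static uint16_t next_owner_id = 1;     /* IDs below this were handed out */
static uint16_t port_owner[MAX_PORTS]; /* NO_OWNER means the port is free */

/* Allocate a new owner ID; returns NO_OWNER when the IDs are exhausted. */
static uint16_t owner_new(void)
{
	uint16_t id = NO_OWNER;

	pthread_mutex_lock(&ownership_lock);
	if (next_owner_id != UINT16_MAX)
		id = next_owner_id++;
	pthread_mutex_unlock(&ownership_lock);
	return id;
}

/* Validate the ID and take the port inside one critical section, so
 * next_owner_id cannot move between the check and the assignment. */
static int owner_set(int port, uint16_t id)
{
	int ret = -1;

	if (port < 0 || port >= MAX_PORTS)
		return -1;
	pthread_mutex_lock(&ownership_lock);
	if (id != NO_OWNER && id < next_owner_id &&
	    port_owner[port] == NO_OWNER) {
		port_owner[port] = id;
		ret = 0;
	}
	pthread_mutex_unlock(&ownership_lock);
	return ret;
}
```

This sketch only rejects IDs that have not been allocated yet at the moment of the call; it does nothing about a caller that passes an ID already allocated to someone else.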

> But in fact your code is not protected against that scenario, no matter whether
> you use a lock or atomic ops.
> Let's consider your current code with the following scenario:
> 
> next_owner_id  == 1
> 1) Process 0:
>      rte_eth_dev_owner_new(&owner_id);
>      now owner_id == 1 and next_owner_id == 2
> 2) Process 1 (by mistake):
>     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will complete
> successfully, as owner_id ==1 is considered as valid.
> 3) Process 0:
>       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will also complete
> with success, as owner->id is valid and is equal to the current port owner_id.
> So you end up with 2 processes each assuming that it exclusively owns the
> same port.
> 
> Honestly, in that situation locking around next_owner_id wouldn't give you
> any advantage over atomic ops.
> 

This is a different scenario, one that we can't protect against with either atomics or locks.
But I think we can protect against the first scenario I described.
Please read it again; I described it step by step.

> >
> > > I don't think you can protect yourself against such scenario with or
> > > without locking.
> > > Unless you'll make it harder for the mis-behaving thread to guess
> > > valid owner_id, or add some extra logic here.
> > >
> > > >
> > > >
> > > > > > The set(and others) ownership APIs already uses the ownership
> > > > > > lock so I
> > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > >
> > > > > > > > > > > In fact, for next_owner_id, you don't need a lock -
> > > > > > > > > > > just rte_atomic_t should be enough.
> > > > > > > > > >
> > > > > > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > > > > > wraparound and may
> > > > > > > > > complicate the code in other places which read it.
> > > > > > > > >
> > > > > > > > > IMO it is not that complicated, something like that
> > > > > > > > > should work I
> > > think.
> > > > > > > > >
> > > > > > > > > > /* init to 0 at startup */
> > > > > > > > > > static rte_atomic32_t owner_id;
> > > > > > > > > >
> > > > > > > > > > int new_owner_id(void)
> > > > > > > > > > {
> > > > > > > > > >     int32_t x;
> > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > >     } else
> > > > > > > > > >         return x;
> > > > > > > > > > }
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > > > >
> > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > one
> > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > probably not relevant any more.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > should be used for
> > > > > both.
> > > > > > >
> > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > > > > > >
> > > > > > Sorry, I meant uint64_t, not uuid.
> > > > >
> > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > support your own code to allocate new owner_id, but rely on
> > > > > system libs
> > > instead.
> > > > > But wouldn't insist here.
> > > > >
> > > > > >
> > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > for next_owner_id second for actual data[] protection.
> > > > > > > > > > >
> > > > > > > > > > > Another thing - you'll probably need to grab/release
> > > > > > > > > > > a lock inside
> > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > be protected
> > > too.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > rte_eth_dev_get_name_by_port
> > > rte_eth_dev_get_port_by_name
> > > > > > > > > > maybe more...
> > > > > > > > >
> > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > > > > > So I think any other public function that access
> > > > > > > > > rte_eth_dev_data[].name should be protected by the same
> lock.
> > > > > > > > >
> > > > > > > >
> > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > lock here(as in port
> > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > What are we exactly protecting here?
> > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > and you may get another answer) I don't see optional crash.
> > > > > > >
> > > > > > > Not sure what you mean here by timing...
> > > > > > > As I understand rte_eth_dev_data[].name unique identifies
> > > > > > > device and is used by  port allocation/release/find functions.
> > > > > > > As you stated above:
> > > > > > > "1. The port allocation and port release synchronization
> > > > > > > will be managed by ethdev."
> > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > Otherwise what would prevent the situation when one process
> > > > > > > does
> > > > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name,
> > > > > > > ...) while second one does
> > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > >
> > > > > > The second will get True or False and that is it.
> > > > >
> > > > > Under race condition - in the worst case it might crash, though
> > > > > for that you'll have to be really unlucky.
> > > > > Though in most cases as you said it would just not operate correctly.
> > > > > I think if we start to protect dev->name by lock we need to do
> > > > > it for all instances (both read and write).
> > > > >
> > > > Since under the ownership rules, the user must take ownership of a
> > > > port
> > > before using it, I still don't see a problem here.
> > >
> > > I am not talking about owner id or name here.
> > > I am talking about dev->name.
> > >
> > So? The user still should take ownership of a device before using it (by
> name or by port id).
> > It can just read it without owning it, but no managing it.
> >
> > > > Please, Can you describe specific crash scenario and explain how
> > > > could the
> > > locking fix it?
> > >
> > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > And because of race condition - rte_eth_dev_allocated() will return
> > > rte_eth_dev * for the wrong device.
> > Which wrong device do you mean? I guess it is the device which currently is
> being created by thread 0.
> > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > resources, while It can still be in use by someone else.
> > The rte_pmd_ring_remove caller(some DPDK entity) must take ownership
> > (or validate that he is the owner) of a port before doing it(free, release), so
> no issue here.
> 
> Forget about ownership for a second.
> Suppose a process created a ring port for itself (without setting any
> ownership) and used it for some time.
> Then it decides to remove the port, so it calls rte_pmd_ring_remove() for it.
> At the same time a second process decides to call rte_eth_dev_allocate() (let's
> say for another ring port).
> They could collide, with process 0 reading and process 1 modifying the same
> string, rte_eth_dev_data[].name.
>
Do you mean that process 0 will successfully match the name of process 1's new port?
The device states are kept in process-local memory, so process 0 will not compare against process 1's port: from its point of view that port is in the UNUSED state.

> Konstantin
> 
> >
> >
> > Also I'm not sure I fully understand your scenario looks like moving
> > the device state setting in allocation to be after the name setting will be
> good.
> > What do you think?
> >
> > > Konstantin
> > >
> > > >
> > > > > > Maybe if it had been called just a moment after, It might get
> > > > > > different
> > > > > answer.
> > > > > > Because these APIs don't change ethdev structure(just read),
> > > > > > it can be
> > > OK.
> > > > > > But again, I can understand to use ownership lock also here.
> > > > > >
> > > > >
> > > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-16  8:15       ` Matan Azrad
@ 2018-01-17  0:46         ` Lu, Wenzhuo
  2018-01-17  8:51           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Lu, Wenzhuo @ 2018-01-17  0:46 UTC (permalink / raw)
  To: 'Matan Azrad', Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Tuesday, January 16, 2018 4:16 PM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> ownership
> 
> Hi Lu
> From: Lu, Wenzhuo, Tuesday, January 16, 2018 7:54 AM
> > Hi Matan,
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Sunday, January 7, 2018 5:46 PM
> > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Subject: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> > > ownership
> > >
> > > Testpmd should not use ethdev ports which are managed by other DPDK
> > > entities.
> > >
> > > Set Testpmd ownership to each port which is not used by other entity
> > > and prevent any usage of ethdev ports which are not owned by Testpmd.
> > Sorry I don't follow all the discussion as there's too much. So it may
> > be a silly question.
> 
> No problem, I'm here for any question :)
> 
> > Testpmd already has the parameter " --pci-whitelist" to only use the
> > assigned devices.
> 
> It is an EAL parameter. No? just say to EAL which devices to create..
> 
> > When using this parameter, all the devices are owned by the current
> > APP.
> 
> No, what's about vdev? vdevs may manage devices(even whitlist PCI devices)
> by themselves and want to prevent any app to use these devices(see fail-
> safe PMD).
I'm not an expert on EAL and vdev. I suppose this will be discussed in the other patches.
I don't want to bother you again here, as testpmd is only used to show the result.
So I think whether this patch is needed just depends on whether the other patches are accepted :)

> 
>  > So I don't know why need to set/check the ownership.
> > BTW, in this patch, seem all the change is for ownership checking. I
> > don't find the setting code. Do I miss something?
> 
> Yes, see in main function (the first FOREACH).
I think you mean this change,

@@ -2394,7 +2406,12 @@  uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, RTE_ETH_DEV_NO_OWNER)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
But I was thinking of code that explicitly assigns a specific device to a specific APP.
This code looks like it just occupies the devices that have no owner. So does it mean the first APP will occupy all the devices? That confuses me, as I don't see the benefit or how it differs from before.

> 
> Thanks, Matan.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-17  0:46         ` Lu, Wenzhuo
@ 2018-01-17  8:51           ` Matan Azrad
  2018-01-18  0:53             ` Lu, Wenzhuo
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-17  8:51 UTC (permalink / raw)
  To: Lu, Wenzhuo, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Lu

From: Lu, Wenzhuo, Wednesday, January 17, 2018 2:47 AM
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Tuesday, January 16, 2018 4:16 PM
> > To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> > Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> > ownership
> >
> > Hi Lu
> > From: Lu, Wenzhuo, Tuesday, January 16, 2018 7:54 AM
> > > Hi Matan,
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > Sent: Sunday, January 7, 2018 5:46 PM
> > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> Richardson,
> > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Subject: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> > > > ownership
> > > >
> > > > Testpmd should not use ethdev ports which are managed by other
> > > > DPDK entities.
> > > >
> > > > Set Testpmd ownership to each port which is not used by other
> > > > entity and prevent any usage of ethdev ports which are not owned by
> Testpmd.
> > > Sorry I don't follow all the discussion as there's too much. So it
> > > may be a silly question.
> >
> > No problem, I'm here for any question :)
> >
> > > Testpmd already has the parameter " --pci-whitelist" to only use the
> > > assigned devices.
> >
> > It is an EAL parameter. No? just say to EAL which devices to create..
> >
> > > When using this parameter, all the devices are owned by the current
> > > APP.
> >
> > No, what's about vdev? vdevs may manage devices(even whitlist PCI
> > devices) by themselves and want to prevent any app to use these
> > devices(see fail- safe PMD).
> I'm not an expert of EAL and vdev. Suppose this would be discussed in other
> patches.
> I don't want to bother you again here as testpmd is only used to show the
> result.
> So I think if this patch is needed just depends on if other patches are
> accepted :)
> 
Sure!

> >
> >  > So I don't know why need to set/check the ownership.
> > > BTW, in this patch, seem all the change is for ownership checking. I
> > > don't find the setting code. Do I miss something?
> >
> > Yes, see in main function (the first FOREACH).
> I think you mean this change,
> 
> @@ -2394,7 +2406,12 @@  uint8_t port_is_bonding_slave(portid_t
> slave_pid)
>  	rte_pdump_init(NULL);
>  #endif
> 
> -	nb_ports = (portid_t) rte_eth_dev_count();
> +	if (rte_eth_dev_owner_new(&my_owner.id))
> +		rte_panic("Failed to get unique owner identifier\n");
> +	snprintf(my_owner.name, sizeof(my_owner.name),
> TESTPMD_OWNER_NAME);
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id,
> RTE_ETH_DEV_NO_OWNER)
> +		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
> +			nb_ports++;
>  	if (nb_ports == 0)
>  		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");

Yes.

> But I thought about some code to assign a specific device to a specific APP
> explicitly.
> This code looks like just occupying the devices with no owner. So, it means
> the first APP will occupy all the devices? It makes me confused as I don't see
> the benefit or the difference than before.

Remember that during EAL init some drivers may create ports and take ownership of them, so this code takes ownership only of the ports that are still ownerless.
So, taking ownership here will not succeed for ports which are already owned by another DPDK entity.
Since Testpmd wanted to manage all the ports before this feature (by mistake), I guess this is the right behavior now: just use the ports which are not used.
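
A minimal sketch of that behavior, assuming a plain array of per-port owner IDs. The array and claim_free_ports() are hypothetical illustrations, not the testpmd or ethdev code:

```c
#include <stdint.h>

#define NO_OWNER 0
#define MAX_PORTS 4

/* Hypothetical per-port owner table; NO_OWNER means nobody took the port. */
static uint16_t port_owner[MAX_PORTS];

/* Take every port that is still ownerless and return how many were taken.
 * Ports already owned (for example by a vdev during init) are skipped. */
static int claim_free_ports(uint16_t my_id)
{
	int taken = 0;

	for (int p = 0; p < MAX_PORTS; p++) {
		if (port_owner[p] == NO_OWNER) {
			port_owner[p] = my_id;
			taken++;
		}
	}
	return taken;
}
```

If a driver had already taken port 1 during init, the application would end up with the other three ports and port 1 would stay with the driver.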
 
> 
> >
> > Thanks, Matan.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-16 20:32                           ` Matan Azrad
@ 2018-01-17 11:24                             ` Ananyev, Konstantin
  2018-01-17 12:05                               ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-17 11:24 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

> Hi Konstantin
> 
> From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > Hi Matan,
> >
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > Hi Matan,
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > Hi Matan,
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02 AM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018 2:40
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10, 2018
> > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > Hi Matan,
> > >  <snip>
> > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it might
> > > > > > > > > > > > be not very plausible to protect both data[] and
> > > > > > > > > > > > next_owner_id using the
> > > > > > same lock.
> > > > > > > > > > >
> > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > The next_owner_id is read by ownership APIs(for owner
> > > > > > > > > > > validation), so it
> > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > Actually, why not?
> > > > > > > > > >
> > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are not
> > > > > > > > > > directly
> > > > > > > > related.
> > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > It is not very good coding practice to use same lock for
> > > > > > > > > > non-related data structures.
> > > > > > > > > >
> > > > > > > > > I see the relation like next:
> > > > > > > > > Since the ownership mechanism synchronization is in ethdev
> > > > > > > > > responsibility, we must protect against user mistakes as
> > > > > > > > > much as we can by
> > > > > > > > using the same lock.
> > > > > > > > > So, if user try to set by invalid owner (exactly the ID
> > > > > > > > > which currently is
> > > > > > > > allocated) we can protect on it.
> > > > > > > >
> > > > > > > > Hmm, not sure why you can't do same checking with different
> > > > > > > > lock or atomic variable?
> > > > > > > >
> > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > So, the owner ID allocation and set API should use the same
> > > > > > > atomic
> > > > > > mechanism.
> > > > > >
> > > > > > > Sure, but all you are doing to check validity is checking that
> > > > > > > owner_id > 0 && owner_id < next_owner_id, right?
> > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > >
> > > > > It will not protect it, scenario:
> > > > > - current next_id is X.
> > > > > - call set ownership of port A with owner id X by thread 0(by user
> > mistake).
> > > > > - context switch
> > > > > - allocate new id by thread 1 and get X and change next_id to X+1
> > > > atomically.
> > > > > -  context switch
> > > > > - Thread 0 validate X by atomic_read and succeed to take ownership.
> > > > > - The system loosed the port(or will be managed by two entities) -
> > crash.
> > > >
> > > >
> > > > Ok, and how using lock will protect you with such scenario?
> > >
> > > The owner set API validation by thread 0 should fail because the owner
> > validation is included in the protected section.
> >
> > Then your validation function would fail even if you'll use atomic ops instead
> > of lock.
> No.
> With atomic this specific scenario will cause the validation to pass.

Can you explain to me how?

static int
rte_eth_is_valid_owner_id(uint16_t owner_id)
{
	/* Clamp to UINT16_MAX, the largest owner ID that can be handed out. */
	int32_t cur_owner_id = RTE_MIN(rte_atomic32_read(&next_owner_id), UINT16_MAX);

	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
		return 0;
	}
	return 1;
}

Let's say next_owner_id == X and you invoke rte_eth_is_valid_owner_id(owner_id=X+1):
it would fail.
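
A self-contained sketch of the same idea with C11 atomics instead of the rte_atomic32 API, assuming the counter holds the highest ID handed out so far. The names last_owner_id, new_owner_id and is_valid_owner_id are hypothetical:

```c
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0

/* Zero-initialized: highest owner ID handed out so far. */
static atomic_int last_owner_id;

/* Hand out the next ID, or -1 once the 16-bit range is used up. */
static int new_owner_id(void)
{
	int x = atomic_fetch_add(&last_owner_id, 1) + 1;

	if (x > UINT16_MAX) {
		/* Undo the increment so the counter does not keep growing. */
		atomic_fetch_sub(&last_owner_id, 1);
		return -1;
	}
	return x;
}

/* An ID is valid if it is not NO_OWNER and was already handed out. */
static int is_valid_owner_id(uint16_t owner_id)
{
	int cur = atomic_load(&last_owner_id);

	if (cur > UINT16_MAX)
		cur = UINT16_MAX;
	return owner_id != NO_OWNER && owner_id <= cur;
}
```

With the counter at X, is_valid_owner_id(X + 1) returns 0 until some caller has actually been given X + 1.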

> With lock no next_id changes can be done while the thread is in the set API.
> 
> > > But in fact your code is not protected against that scenario, no matter whether
> > > you use a lock or atomic ops.
> > > Let's consider your current code with the following scenario:
> >
> > next_owner_id  == 1
> > 1) Process 0:
> >      rte_eth_dev_owner_new(&owner_id);
> >      now owner_id == 1 and next_owner_id == 2
> > 2) Process 1 (by mistake):
> >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will complete
> > successfully, as owner_id ==1 is considered as valid.
> > 3) Process 0:
> > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will also complete
> > > with success, as owner->id is valid and is equal to the current port owner_id.
> > > So you end up with 2 processes each assuming that it exclusively owns the
> > > same port.
> > >
> > > Honestly, in that situation locking around next_owner_id wouldn't give you
> > > any advantage over atomic ops.
> >
> 
> This is a different scenario that we can't protect on it with atomic or locks.
> But for the first scenario I described I think we can.
> Please read it again, I described it step by step.
> 
> > >
> > > > I don't think you can protect yourself against such scenario with or
> > > > without locking.
> > > > Unless you'll make it harder for the mis-behaving thread to guess
> > > > valid owner_id, or add some extra logic here.
> > > >
> > > > >
> > > > >
> > > > > > > The set(and others) ownership APIs already uses the ownership
> > > > > > > lock so I
> > > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > > >
> > > > > > > > > > > > In fact, for next_owner_id, you don't need a lock -
> > > > > > > > > > > > just rte_atomic_t should be enough.
> > > > > > > > > > >
> > > > > > > > > > > I don't think so, it is problematic in next_owner_id
> > > > > > > > > > > wraparound and may
> > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > >
> > > > > > > > > > IMO it is not that complicated, something like that
> > > > > > > > > > should work I
> > > > think.
> > > > > > > > > >
> > > > > > > > > > > /* init to 0 at startup */
> > > > > > > > > > > static rte_atomic32_t owner_id;
> > > > > > > > > > >
> > > > > > > > > > > int new_owner_id(void)
> > > > > > > > > > > {
> > > > > > > > > > >     int32_t x;
> > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > > >     } else
> > > > > > > > > > >         return x;
> > > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > > > > >
> > > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > > one
> > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > > probably not relevant any more.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > > should be used for
> > > > > > both.
> > > > > > > >
> > > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > > So lock will only be used for rte_eth_dev_data[] fields anyway.
> > > > > > > >
> > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > >
> > > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > > support your own code to allocate new owner_id, but rely on
> > > > > > system libs
> > > > instead.
> > > > > > But wouldn't insist here.
> > > > > >
> > > > > > >
> > > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > > for next_owner_id second for actual data[] protection.
> > > > > > > > > > > >
> > > > > > > > > > > > Another thing - you'll probably need to grab/release
> > > > > > > > > > > > a lock inside
> > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > > be protected
> > > > too.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yes, I thought about it, but decided not to use lock in next:
> > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > maybe more...
> > > > > > > > > >
> > > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > > rte_eth_dev_data[].name (which seems like a good  thing).
> > > > > > > > > > So I think any other public function that access
> > > > > > > > > > rte_eth_dev_data[].name should be protected by the same
> > lock.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > > lock here(as in port
> > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > What are we exactly protecting here?
> > > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > > and you may get another answer) I don't see optional crash.
> > > > > > > >
> > > > > > > > Not sure what you mean here by timing...
> > > > > > > > As I understand rte_eth_dev_data[].name unique identifies
> > > > > > > > device and is used by  port allocation/release/find functions.
> > > > > > > > As you stated above:
> > > > > > > > "1. The port allocation and port release synchronization
> > > > > > > > will be managed by ethdev."
> > > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > > Otherwise what would prevent the situation when one process
> > > > > > > > does
> > > > > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name,
> > > > > > > > ...) while second one does
> > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > >
> > > > > > > The second will get True or False and that is it.
> > > > > >
> > > > > > Under race condition - in the worst case it might crash, though
> > > > > > for that you'll have to be really unlucky.
> > > > > > Though in most cases as you said it would just not operate correctly.
> > > > > > I think if we start to protect dev->name by lock we need to do
> > > > > > it for all instances (both read and write).
> > > > > >
> > > > > Since under the ownership rules, the user must take ownership of a
> > > > > port
> > > > before using it, I still don't see a problem here.
> > > >
> > > > I am not talking about owner id or name here.
> > > > I am talking about dev->name.
> > > >
> > > So? The user still should take ownership of a device before using it (by
> > name or by port id).
> > > It can just read it without owning it, but no managing it.
> > >
> > > > > Please, Can you describe specific crash scenario and explain how
> > > > > could the
> > > > locking fix it?
> > > >
> > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > > And because of race condition - rte_eth_dev_allocated() will return
> > > > rte_eth_dev * for the wrong device.
> > > Which wrong device do you mean? I guess it is the device which currently is
> > being created by thread 0.
> > > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > > resources, while It can still be in use by someone else.
> > > The rte_pmd_ring_remove caller(some DPDK entity) must take ownership
> > > (or validate that he is the owner) of a port before doing it(free, release), so
> > no issue here.
> >
> > Forget about ownership for a second.
> > Suppose we have a process it created ring port for itself (without setting any
> > ownership)  and used it for some time.
> > Then it decided to remove it, so it calls rte_pmd_ring_remove() for it.
> > At the same time second process decides to call rte_eth_dev_allocate() (let
> > say for anither ring port).
> > They could collide trying to read (process 0) and modify (process 1) same
> > string rte_eth_dev_data[].name.
> >
> Do you mean that process 0 will compare successfully the process 1 new port name?

Yes.

> The state are in local process memory - so process 0 will not compare the process 1 port, from its point of view this port is in UNUSED
> state.
>

OK, and why can't it be in the attached state in process 0 too?
Konstantin
 
> > Konstantin
> >
> > >
> > >
> > > Also I'm not sure I fully understand your scenario looks like moving
> > > the device state setting in allocation to be after the name setting will be
> > good.
> > > What do you think?
> > >
> > > > Konstantin
> > > >
> > > > >
> > > > > > > Maybe if it had been called just a moment after, It might get
> > > > > > > different
> > > > > > answer.
> > > > > > > Because these APIs don't change ethdev structure(just read),
> > > > > > > it can be
> > > > OK.
> > > > > > > But again, I can understand to use ownership lock also here.
> > > > > > >
> > > > > >
> > > > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 11:24                             ` Ananyev, Konstantin
@ 2018-01-17 12:05                               ` Matan Azrad
  2018-01-17 12:54                                 ` Ananyev, Konstantin
  2018-01-17 14:00                                 ` Neil Horman
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-17 12:05 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Konstantin
From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> Hi Matan,
> 
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > Hi Matan,
> > >
> > > >
> > > > Hi Konstantin
> > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > Hi Matan,
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > AM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > 2:40 PM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > 2018
> > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > Hi Matan,
> > > >  <snip>
> > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > same lock.
> > > > > > > > > > > >
> > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > Actually, why not?
> > > > > > > > > > >
> > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > not directly
> > > > > > > > > related.
> > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > for non-related data structures.
> > > > > > > > > > >
> > > > > > > > > > I see the relation like next:
> > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > mistakes as much as we can by
> > > > > > > > > using the same lock.
> > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > ID which currently is
> > > > > > > > > allocated) we can protect on it.
> > > > > > > > >
> > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > different lock or atomic variable?
> > > > > > > > >
> > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > same atomic
> > > > > > > mechanism.
> > > > > > >
> > > > > > > Sure but all you are doing for checking validity, is  check
> > > > > > > > that owner_id > 0 && owner_id < next_owner_id, right?
> > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > > >
> > > > > > It will not protect it, scenario:
> > > > > > - current next_id is X.
> > > > > > - call set ownership of port A with owner id X by thread 0(by
> > > > > > user
> > > mistake).
> > > > > > - context switch
> > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > X+1
> > > > > atomically.
> > > > > > -  context switch
> > > > > > - Thread 0 validate X by atomic_read and succeed to take
> ownership.
> > > > > > - The system loosed the port(or will be managed by two
> > > > > > entities) -
> > > crash.
> > > > >
> > > > >
> > > > > Ok, and how using lock will protect you with such scenario?
> > > >
> > > > The owner set API validation by thread 0 should fail because the
> > > > owner
> > > validation is included in the protected section.
> > >
> > > Then your validation function would fail even if you'll use atomic
> > > ops instead of lock.
> > No.
> > With atomic this specific scenario will cause the validation to pass.
> 
> Can you explain to me how?
> 
> int rte_eth_is_valid_owner_id(uint16_t owner_id) {
> 	int32_t cur_owner_id = RTE_MIN(rte_atomic32_read(&next_owner_id),
> 				       UINT16_MAX);
> 
> 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
> 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> 		return 0;
> 	}
> 	return 1;
> }
> 
> Let's say your next_owner_id==X, and you invoke
> rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.

Explanation:
The scenario with locks:
next_owner_id = X.
Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
Context switch.
Thread 1 calls owner_new and blocks on the lock.
Context switch.
Thread 0 does the owner ID validation and fails (Y>=X); it unlocks the lock and returns failure to the user.
Context switch.
Thread 1 takes the lock, updates X to X+1, then unlocks the lock.
Everything is OK!

The same scenario with atomics:
next_owner_id = X.
Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
Context switch.
Thread 1 calls owner_new and changes X to X+1 (atomically).
Context switch.
Thread 0 does the owner ID validation and succeeds (Y<(atomic)X+1); it unlocks the lock and returns success to the user.
Problem!
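
A minimal single-threaded sketch of the two interleavings above, replayed in
program order rather than with real threads. All names here (owner_state,
owner_id_is_valid, owner_new, set_with_shared_lock, set_with_atomic_counter)
are hypothetical stand-ins, not the ethdev API, and the validity rule
0 < id < next_owner_id is assumed from the discussion.

```c
#include <stdint.h>

#define NO_OWNER 0u

/* Hypothetical stand-in for the ethdev owner ID counter. */
struct owner_state {
	uint32_t next_owner_id;
};

/* An ID is valid only if it was already handed out: 0 < id < next_owner_id. */
static int owner_id_is_valid(const struct owner_state *s, uint32_t id)
{
	return id != NO_OWNER && id < s->next_owner_id;
}

/* Allocate a new owner ID (what thread 1 does in the scenario). */
static uint32_t owner_new(struct owner_state *s)
{
	return s->next_owner_id++;
}

/*
 * Shared-lock ordering: thread 0 holds the lock from entry to validation,
 * so thread 1's owner_new() can only run after thread 0 has returned.
 */
static int set_with_shared_lock(struct owner_state *s, uint32_t y)
{
	int ok = owner_id_is_valid(s, y); /* thread 0, under the lock */

	(void)owner_new(s);               /* thread 1, after the unlock */
	return ok;
}

/*
 * Atomic-counter ordering: nothing stops thread 1 from bumping the counter
 * between thread 0 entering the set API and validating the ID.
 */
static int set_with_atomic_counter(struct owner_state *s, uint32_t y)
{
	(void)owner_new(s);               /* thread 1 slips in first */
	return owner_id_is_valid(s, y);   /* thread 0 now compares against X+1 */
}
```

With next_owner_id = X and Y = X, the first helper rejects Y and the second
accepts it, which is the difference described above.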
 
> > With lock no next_id changes can be done while the thread is in the set
> API.
> >
> > > But in fact your code is not protected for that scenario - doesn't
> > > matter will you'll use lock or atomic ops.
> > > Let's considerer your current code with the following scenario:
> > >
> > > next_owner_id  == 1
> > > 1) Process 0:
> > >      rte_eth_dev_owner_new(&owner_id);
> > >      now owner_id == 1 and next_owner_id == 2
> > > 2) Process 1 (by mistake):
> > >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will complete
> > > successfully, as owner_id ==1 is considered as valid.
> > > 3) Process 0:
> > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will also
> > > complete with success, as owner->id is valid is equal to current port
> owner_id.
> > > So you finished with 2 processes assuming that they do own
> > > exclusively then same port.
> > >
> > > Honestly, in that situation locking around next_owner_id wouldn't
> > > give you any advantages over atomic ops.
> > >
> >
> > This is a different scenario that we can't protect on it with atomic or locks.
> > But for the first scenario I described I think we can.
> > Please read it again, I described it step by step.
> >
> > > >
> > > > > I don't think you can protect yourself against such scenario
> > > > > with or without locking.
> > > > > Unless you'll make it harder for the mis-behaving thread to
> > > > > guess valid owner_id, or add some extra logic here.
> > > > >
> > > > > >
> > > > > >
> > > > > > > > The set(and others) ownership APIs already uses the
> > > > > > > > ownership lock so I
> > > > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > > > >
> > > > > > > > > > > > > In fact, for next_owner_id, you don't need a
> > > > > > > > > > > > > lock - just rte_atomic_t should be enough.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > >
> > > > > > > > > > > IMO it is not that complicated, something like that
> > > > > > > > > > > should work I
> > > > > think.
> > > > > > > > > > >
> > > > > > > > > > > > /* init to 0 at startup */ static rte_atomic32_t owner_id;
> > > > > > > > > > >
> > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > >     int32_t x;
> > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > > >     } else
> > > > > > > > > > >         return x;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > > > > > >
> > > > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > > > one
> > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > > > probably not relevant any more.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > > > should be used for
> > > > > > > both.
> > > > > > > > >
> > > > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > > > So lock will only be used for rte_eth_dev_data[] fields
> anyway.
> > > > > > > > >
> > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > >
> > > > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > > > support your own code to allocate new owner_id, but rely on
> > > > > > > system libs
> > > > > instead.
> > > > > > > But wouldn't insist here.
> > > > > > >
> > > > > > > >
> > > > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > > > for next_owner_id second for actual data[] protection.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Another thing - you'll probably need to grab/release
> > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > > > be protected
> > > > > too.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, I thought about it, but decided not to use lock in
> next:
> > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > maybe more...
> > > > > > > > > > >
> > > > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > > > rte_eth_dev_data[].name (which seems like a good
> thing).
> > > > > > > > > > > So I think any other public function that access
> > > > > > > > > > > rte_eth_dev_data[].name should be protected by the
> same
> > > lock.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > > > lock here(as in port
> > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > > > and you may get another answer) I don't see optional crash.
> > > > > > > > >
> > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > As I understand rte_eth_dev_data[].name unique identifies
> > > > > > > > > device and is used by  port allocation/release/find functions.
> > > > > > > > > As you stated above:
> > > > > > > > > "1. The port allocation and port release synchronization
> > > > > > > > > will be managed by ethdev."
> > > > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > > > Otherwise what would prevent the situation when one
> process
> > > > > > > > > does
> > > > > > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > ...) while second one does
> > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > >
> > > > > > > > The second will get True or False and that is it.
> > > > > > >
> > > > > > > Under race condition - in the worst case it might crash, though
> > > > > > > for that you'll have to be really unlucky.
> > > > > > > Though in most cases as you said it would just not operate
> correctly.
> > > > > > > I think if we start to protect dev->name by lock we need to do
> > > > > > > it for all instances (both read and write).
> > > > > > >
> > > > > > Since under the ownership rules, the user must take ownership of a
> > > > > > port
> > > > > before using it, I still don't see a problem here.
> > > > >
> > > > > I am not talking about owner id or name here.
> > > > > I am talking about dev->name.
> > > > >
> > > > So? The user still should take ownership of a device before using it (by
> > > name or by port id).
> > > > It can just read it without owning it, but no managing it.
> > > >
> > > > > > Please, Can you describe specific crash scenario and explain how
> > > > > > could the
> > > > > locking fix it?
> > > > >
> > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > > > And because of race condition - rte_eth_dev_allocated() will return
> > > > > rte_eth_dev * for the wrong device.
> > > > Which wrong device do you mean? I guess it is the device which
> currently is
> > > being created by thread 0.
> > > > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > > > resources, while It can still be in use by someone else.
> > > > The rte_pmd_ring_remove caller(some DPDK entity) must take
> ownership
> > > > (or validate that he is the owner) of a port before doing it(free,
> release), so
> > > no issue here.
> > >
> > > Forget about ownership for a second.
> > > Suppose we have a process it created ring port for itself (without setting
> any
> > > ownership)  and used it for some time.
> > > Then it decided to remove it, so it calls rte_pmd_ring_remove() for it.
> > > At the same time second process decides to call rte_eth_dev_allocate()
> (let
> > > say for anither ring port).
> > > They could collide trying to read (process 0) and modify (process 1) same
> > > string rte_eth_dev_data[].name.
> > >
> > Do you mean that process 0 will compare successfully the process 1 new
> port name?
> 
> Yes.
> 
> > The state are in local process memory - so process 0 will not compare the
> process 1 port, from its point of view this port is in UNUSED
> > state.
> >
> 
> Ok, and why it can't be in attached state in process 0 too?

Someone in process 0 would have to attach it, using the protected attach_secondary, somewhere in your scenario.


> Konstantin
> 
> > > Konstantin
> > >
> > > >
> > > >
> > > > Also I'm not sure I fully understand your scenario looks like moving
> > > > the device state setting in allocation to be after the name setting will be
> > > good.
> > > > What do you think?
> > > >
> > > > > Konstantin
> > > > >
> > > > > >
> > > > > > > > Maybe if it had been called just a moment after, It might get
> > > > > > > > different
> > > > > > > answer.
> > > > > > > > Because these APIs don't change ethdev structure(just read),
> > > > > > > > it can be
> > > > > OK.
> > > > > > > > But again, I can understand to use ownership lock also here.
> > > > > > > >
> > > > > > >
> > > > > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 12:05                               ` Matan Azrad
@ 2018-01-17 12:54                                 ` Ananyev, Konstantin
  2018-01-17 13:10                                   ` Matan Azrad
  2018-01-17 14:00                                 ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-17 12:54 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> 
> 
> Hi Konstantin
> From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > Hi Matan,
> >
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > Hi Matan,
> > > >
> > > > >
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > Hi Matan,
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > > AM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > >  <snip>
> > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > same lock.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > >
> > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > > not directly
> > > > > > > > > > related.
> > > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > > for non-related data structures.
> > > > > > > > > > > >
> > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > > mistakes as much as we can by
> > > > > > > > > > using the same lock.
> > > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > > ID which currently is
> > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > >
> > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > different lock or atomic variable?
> > > > > > > > > >
> > > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > same atomic
> > > > > > > > mechanism.
> > > > > > > >
> > > > > > > > Sure but all you are doing for checking validity, is  check
> > > > > > > > > that owner_id > 0 && owner_id < next_owner_id, right?
> > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > > > >
> > > > > > > It will not protect it, scenario:
> > > > > > > - current next_id is X.
> > > > > > > - call set ownership of port A with owner id X by thread 0(by
> > > > > > > user
> > > > mistake).
> > > > > > > - context switch
> > > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > > X+1
> > > > > > atomically.
> > > > > > > -  context switch
> > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > ownership.
> > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > entities) -
> > > > crash.
> > > > > >
> > > > > >
> > > > > > Ok, and how using lock will protect you with such scenario?
> > > > >
> > > > > The owner set API validation by thread 0 should fail because the
> > > > > owner
> > > > validation is included in the protected section.
> > > >
> > > > Then your validation function would fail even if you'll use atomic
> > > > ops instead of lock.
> > > No.
> > > With atomic this specific scenario will cause the validation to pass.
> >
> > Can you explain to me how?
> >
> > int rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > 	int32_t cur_owner_id = RTE_MIN(rte_atomic32_read(&next_owner_id),
> > 				       UINT16_MAX);
> >
> > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
> > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > 		return 0;
> > 	}
> > 	return 1;
> > }
> >
> > Let's say your next_owner_id==X, and you invoke
> > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> 
> Explanation:
> The scenario with locks:
> next_owner_id = X.
> Thread 0 call to set API(with invalid owner Y=X) and take lock.

OK, I see what you mean.
But, as I said before, if thread 0 grabs the lock first, you'll experience the same failure.
I understand now that for some reason you treat these two scenarios as different,
but to me they are pretty much the same case.
And to me it means that neither a lock nor an atomic can fully protect you here.
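
A minimal sketch of one ordering in which the synchronization primitive makes
no difference, assuming the allocating thread completes owner_new() before the
mistaken set call reaches validation. The names (next_owner_id,
owner_id_is_valid, owner_new, guessed_id_accepted) are hypothetical and the
interleaving is replayed single-threaded; this is an illustration, not the
patch code.

```c
#include <stdint.h>

/* Hypothetical counter; IDs start at 1, 0 means "no owner". */
static uint32_t next_owner_id = 1;

/* Same validity rule for lock and atomic variants: 0 < id < next_owner_id. */
static int owner_id_is_valid(uint32_t id)
{
	return id != 0 && id < next_owner_id;
}

/* Hands out the next ID; a lock or an atomic add gives the same result here. */
static uint32_t owner_new(void)
{
	return next_owner_id++;
}

/*
 * The allocator runs to completion first, then the mistaken caller
 * validates its guessed ID. Returns 1 if the guess is accepted.
 */
static int guessed_id_accepted(void)
{
	uint32_t guess = next_owner_id; /* the ID X passed by mistake */
	uint32_t real = owner_new();    /* legitimate owner receives X */

	return real == guess && owner_id_is_valid(guess);
}
```

Once the legitimate allocation has happened, the guessed ID is
indistinguishable from a valid one, whichever primitive guards the counter.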

> Context switch.
> Thread 1 call to owner_new and stuck in the lock.
> Context switch.
> Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and return failure to the user.
> Context switch.
> Thread 1 take the lock and update X to X+1, then, unlock the lock.
> Everything is OK!
> 
> The same scenario with atomics:
> next_owner_id = X.
> Thread 0 call to set API(with invalid owner Y=X) and take lock.
> Context switch.
> Thread 1 call to owner_new and change X to X+1(atomically).
> Context switch.
> Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the lock and return success to the  user.
> Problem!
> 
> > > With lock no next_id changes can be done while the thread is in the set
> > API.
> > >
> > > > But in fact your code is not protected for that scenario - doesn't
> > > > matter will you'll use lock or atomic ops.
> > > > Let's considerer your current code with the following scenario:
> > > >
> > > > next_owner_id  == 1
> > > > 1) Process 0:
> > > >      rte_eth_dev_owner_new(&owner_id);
> > > >      now owner_id == 1 and next_owner_id == 2
> > > > 2) Process 1 (by mistake):
> > > >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will complete
> > > > successfully, as owner_id ==1 is considered as valid.
> > > > 3) Process 0:
> > > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will also
> > > > complete with success, as owner->id is valid is equal to current port
> > owner_id.
> > > > So you finished with 2 processes assuming that they do own
> > > > exclusively then same port.
> > > >
> > > > Honestly, in that situation locking around next_owner_id wouldn't
> > > > give you any advantages over atomic ops.
> > > >
> > >
> > > This is a different scenario that we can't protect on it with atomic or locks.
> > > But for the first scenario I described I think we can.
> > > Please read it again, I described it step by step.
> > >
> > > > >
> > > > > > I don't think you can protect yourself against such scenario
> > > > > > with or without locking.
> > > > > > Unless you'll make it harder for the mis-behaving thread to
> > > > > > guess valid owner_id, or add some extra logic here.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > The set(and others) ownership APIs already uses the
> > > > > > > > > ownership lock so I
> > > > > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > > > > >
> > > > > > > > > > > > > > In fact, for next_owner_id, you don't need a
> > > > > > > > > > > > > > lock - just rte_atomic_t should be enough.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > > >
> > > > > > > > > > > > IMO it is not that complicated, something like that
> > > > > > > > > > > > should work I
> > > > > > think.
> > > > > > > > > > > >
> > > > > > > > > > > > > /* init to 0 at startup */ static rte_atomic32_t owner_id;
> > > > > > > > > > > >
> > > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > > >     int32_t x;
> > > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > > > >     } else
> > > > > > > > > > > >         return x;
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Why not just to keep it simple and using the same lock?
> > > > > > > > > > > >
> > > > > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > > > > one
> > > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > > > > probably not relevant any more.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > > > > should be used for
> > > > > > > > both.
> > > > > > > > > >
> > > > > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > > > > So lock will only be used for rte_eth_dev_data[] fields
> > anyway.
> > > > > > > > > >
> > > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > > >
> > > > > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > > > > support your own code to allocate new owner_id, but rely on
> > > > > > > > system libs
> > > > > > instead.
> > > > > > > > But wouldn't insist here.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > > > > for next_owner_id second for actual data[] protection.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Another thing - you'll probably need to grab/release
> > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > > > > be protected
> > > > > > too.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, I thought about it, but decided not to use lock in
> > next:
> > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > maybe more...
> > > > > > > > > > > >
> > > > > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > > > > rte_eth_dev_data[].name (which seems like a good
> > thing).
> > > > > > > > > > > > So I think any other public function that access
> > > > > > > > > > > > rte_eth_dev_data[].name should be protected by the
> > same
> > > > lock.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > > > > lock here(as in port
> > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > > > > and you may get another answer) I don't see optional crash.
> > > > > > > > > >
> > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > As I understand rte_eth_dev_data[].name unique identifies
> > > > > > > > > > device and is used by  port allocation/release/find functions.
> > > > > > > > > > As you stated above:
> > > > > > > > > > "1. The port allocation and port release synchronization
> > > > > > > > > > will be managed by ethdev."
> > > > > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > > > > Otherwise what would prevent the situation when one
> > process
> > > > > > > > > > does
> > > > > > > > > > rte_eth_dev_allocate()->snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > ...) while second one does
> > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > >
> > > > > > > > > The second will get True or False and that is it.
> > > > > > > >
> > > > > > > > Under race condition - in the worst case it might crash, though
> > > > > > > > for that you'll have to be really unlucky.
> > > > > > > > Though in most cases as you said it would just not operate
> > correctly.
> > > > > > > > I think if we start to protect dev->name by lock we need to do
> > > > > > > > it for all instances (both read and write).
> > > > > > > >
> > > > > > > Since under the ownership rules, the user must take ownership of a
> > > > > > > port
> > > > > > before using it, I still don't see a problem here.
> > > > > >
> > > > > > I am not talking about owner id or name here.
> > > > > > I am talking about dev->name.
> > > > > >
> > > > > So? The user still should take ownership of a device before using it (by
> > > > name or by port id).
> > > > > It can just read it without owning it, but no managing it.
> > > > >
> > > > > > > Please, Can you describe specific crash scenario and explain how
> > > > > > > could the
> > > > > > locking fix it?
> > > > > >
> > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > > > > And because of race condition - rte_eth_dev_allocated() will return
> > > > > > rte_eth_dev * for the wrong device.
> > > > > Which wrong device do you mean? I guess it is the device which
> > currently is
> > > > being created by thread 0.
> > > > > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > > > > resources, while It can still be in use by someone else.
> > > > > The rte_pmd_ring_remove caller(some DPDK entity) must take
> > ownership
> > > > > (or validate that he is the owner) of a port before doing it(free,
> > release), so
> > > > no issue here.
> > > >
> > > > Forget about ownership for a second.
> > > > Suppose we have a process it created ring port for itself (without setting
> > any
> > > > ownership)  and used it for some time.
> > > > Then it decided to remove it, so it calls rte_pmd_ring_remove() for it.
> > > > At the same time second process decides to call rte_eth_dev_allocate()
> > (let
> > > > say for anither ring port).
> > > > They could collide trying to read (process 0) and modify (process 1) same
> > > > string rte_eth_dev_data[].name.
> > > >
> > > Do you mean that process 0 will compare successfully the process 1 new
> > port name?
> >
> > Yes.
> >
> > > The state are in local process memory - so process 0 will not compare the
> > process 1 port, from its point of view this port is in UNUSED
> > > state.
> > >
> >
> > Ok, and why it can't be in attached state in process 0 too?
> 
> Someone in process 0 should attach it using protected attach_secondary somewhere in your scenario.

Yes, process 0 can have this port attached too, why not?
Konstantin

> 
> 
> > Konstantin
> >
> > > > Konstantin
> > > >
> > > > >
> > > > >
> > > > > Also I'm not sure I fully understand your scenario looks like moving
> > > > > the device state setting in allocation to be after the name setting will be
> > > > good.
> > > > > What do you think?
> > > > >
> > > > > > Konstantin
> > > > > >
> > > > > > >
> > > > > > > > > Maybe if it had been called just a moment after, It might get
> > > > > > > > > different
> > > > > > > > answer.
> > > > > > > > > Because these APIs don't change ethdev structure(just read),
> > > > > > > > > it can be
> > > > > > OK.
> > > > > > > > > But again, I can understand to use ownership lock also here.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 12:54                                 ` Ananyev, Konstantin
@ 2018-01-17 13:10                                   ` Matan Azrad
  2018-01-17 16:52                                     ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-17 13:10 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Konstantin
From: Ananyev, Konstantin, Wednesday, January 17, 2018 2:55 PM
> >
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > Hi Matan,
> > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > Hi Matan,
> > > > >
> > > > > >
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018
> > > > > > > > > > 2:02 AM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11,
> > > > > > > > > > > > 2018
> > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January
> > > > > > > > > > > > > > 10,
> > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > >  <snip>
> > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > same lock.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[]
> > > > > > > > > > > > > are not directly
> > > > > > > > > > > related.
> > > > > > > > > > > > > You may create new owner_id but it doesn't mean
> > > > > > > > > > > > > you would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > It is not very good coding practice to use same
> > > > > > > > > > > > > lock for non-related data structures.
> > > > > > > > > > > > >
> > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > Since the ownership mechanism synchronization is
> > > > > > > > > > > > in ethdev responsibility, we must protect against
> > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > using the same lock.
> > > > > > > > > > > > So, if user try to set by invalid owner (exactly
> > > > > > > > > > > > the ID which currently is
> > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > >
> > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > >
> > > > > > > > > > The set ownership API is protected by ownership lock
> > > > > > > > > > and checks the owner ID validity By reading the next owner
> ID.
> > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > same atomic
> > > > > > > > > mechanism.
> > > > > > > > >
> > > > > > > > > Sure but all you are doing for checking validity, is
> > > > > > > > > check that owner_id > 0 &&& owner_id < next_ownwe_id,
> right?
> > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you
> > > > > > > > > can safely do same check with just
> atomic_get(&next_owner_id).
> > > > > > > > >
> > > > > > > > It will not protect it, scenario:
> > > > > > > > - current next_id is X.
> > > > > > > > - call set ownership of port A with owner id X by thread
> > > > > > > > 0(by user
> > > > > mistake).
> > > > > > > > - context switch
> > > > > > > > - allocate new id by thread 1 and get X and change next_id
> > > > > > > > to
> > > > > > > > X+1
> > > > > > > atomically.
> > > > > > > > -  context switch
> > > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > > ownership.
> > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > entities) -
> > > > > crash.
> > > > > > >
> > > > > > >
> > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > >
> > > > > > The owner set API validation by thread 0 should fail because
> > > > > > the owner
> > > > > validation is included in the protected section.
> > > > >
> > > > > Then your validation function would fail even if you'll use
> > > > > atomic ops instead of lock.
> > > > No.
> > > > With atomic this specific scenario will cause the validation to pass.
> > >
> > > Can you explain to me how?
> > >
> > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > >               int32_t cur_owner_id =
> > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > UINT16_MAX);
> > >
> > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > cur_owner_id) {
> > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > 		return 0;
> > > 	}
> > > 	return 1;
> > > }
> > >
> > > Let say your next_owne_id==X, and you invoke
> > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> >
> > Explanation:
> > The scenario with locks:
> > next_owner_id = X.
> > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> 
> Ok I see what you mean.
> But, as I said before, if thread 0 will grab the lock first - you'll experience the
> same failure.
> I understand now that by some reason you treat these two scenarios as
> something different, but for me it is pretty much the same case.
> And to me it means that neither lock, neither atomic can fully protect you
> here.
> 

I agree that we are not fully protected even when using locks, but one lock gives more protection than either atomics or 2 different locks.
So, I think keeping it as is (with one lock) makes sense.

> > Context switch.
> > Thread 1 call to owner_new and stuck in the lock.
> > Context switch.
> > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and
> return failure to the user.
> > Context switch.
> > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > Everything is OK!
> >
> > The same scenario with atomics:
> > next_owner_id = X.
> > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > Context switch.
> > Thread 1 call to owner_new and change X to X+1(atomically).
> > Context switch.
> > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the
> lock and return success to the  user.
> > Problem!
> >
> > > > With lock no next_id changes can be done while the thread is in
> > > > the set
> > > API.
> > > >
> > > > > But in fact your code is not protected for that scenario -
> > > > > doesn't matter will you'll use lock or atomic ops.
> > > > > Let's considerer your current code with the following scenario:
> > > > >
> > > > > next_owner_id  == 1
> > > > > 1) Process 0:
> > > > >      rte_eth_dev_owner_new(&owner_id);
> > > > >      now owner_id == 1 and next_owner_id == 2
> > > > > 2) Process 1 (by mistake):
> > > > >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > complete successfully, as owner_id ==1 is considered as valid.
> > > > > 3) Process 0:
> > > > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > also complete with success, as owner->id is valid is equal to
> > > > > current port
> > > owner_id.
> > > > > So you finished with 2 processes assuming that they do own
> > > > > exclusively then same port.
> > > > >
> > > > > Honestly in that situation  locking around nest_owner_id
> > > > > wouldn't give you any advantages over atomic ops.
> > > > >
> > > >
> > > > This is a different scenario that we can't protect on it with atomic or
> locks.
> > > > But for the first scenario I described I think we can.
> > > > Please read it again, I described it step by step.
> > > >
> > > > > >
> > > > > > > I don't think you can protect yourself against such scenario
> > > > > > > with or without locking.
> > > > > > > Unless you'll make it harder for the mis-behaving thread to
> > > > > > > guess valid owner_id, or add some extra logic here.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > > The set(and others) ownership APIs already uses the
> > > > > > > > > > ownership lock so I
> > > > > > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > > > > > >
> > > > > > > > > > > > > > > In fact, for next_owner_id, you don't need a
> > > > > > > > > > > > > > > lock - just rte_atomic_t should be enough.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > IMO it is not that complicated, something like
> > > > > > > > > > > > > that should work I
> > > > > > > think.
> > > > > > > > > > > > >
> > > > > > > > > > > > > /* init to 0 at startup*/ rte_atomic32_t
> > > > > > > > > > > > > *owner_id;
> > > > > > > > > > > > >
> > > > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > > > >     int32_t x;
> > > > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > > > >        return -EOVERWLOW;
> > > > > > > > > > > > >     } else
> > > > > > > > > > > > >         return x; }
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Why not just to keep it simple and using the same
> lock?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > > > > > one
> > > > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > > > > > probably not relevant any more.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > > > > > should be used for
> > > > > > > > > both.
> > > > > > > > > > >
> > > > > > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > > > > > So lock will only be used for rte_eth_dev_data[] fields
> > > anyway.
> > > > > > > > > > >
> > > > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > > > >
> > > > > > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > > > > > support your own code to allocate new owner_id, but rely on
> > > > > > > > > system libs
> > > > > > > instead.
> > > > > > > > > But wouldn't insist here.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > > > > > for next_owner_id second for actual data[]
> protection.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Another thing - you'll probably need to
> grab/release
> > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > > > > > be protected
> > > > > > > too.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, I thought about it, but decided not to use lock in
> > > next:
> > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > >
> > > > > > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > > > > > rte_eth_dev_data[].name (which seems like a good
> > > thing).
> > > > > > > > > > > > > So I think any other public function that access
> > > > > > > > > > > > > rte_eth_dev_data[].name should be protected by the
> > > same
> > > > > lock.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > > > > > lock here(as in port
> > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > > > > > and you may get another answer) I don't see optional
> crash.
> > > > > > > > > > >
> > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> identifies
> > > > > > > > > > > device and is used by  port allocation/release/find
> functions.
> > > > > > > > > > > As you stated above:
> > > > > > > > > > > "1. The port allocation and port release synchronization
> > > > > > > > > > > will be managed by ethdev."
> > > > > > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > > > > > Otherwise what would prevent the situation when one
> > > process
> > > > > > > > > > > does
> > > > > > > > > > > rte_eth_dev_allocate()-
> >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > ...) while second one does
> > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > >
> > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > >
> > > > > > > > > Under race condition - in the worst case it might crash, though
> > > > > > > > > for that you'll have to be really unlucky.
> > > > > > > > > Though in most cases as you said it would just not operate
> > > correctly.
> > > > > > > > > I think if we start to protect dev->name by lock we need to do
> > > > > > > > > it for all instances (both read and write).
> > > > > > > > >
> > > > > > > > Since under the ownership rules, the user must take ownership
> of a
> > > > > > > > port
> > > > > > > before using it, I still don't see a problem here.
> > > > > > >
> > > > > > > I am not talking about owner id or name here.
> > > > > > > I am talking about dev->name.
> > > > > > >
> > > > > > So? The user still should take ownership of a device before using it
> (by
> > > > > name or by port id).
> > > > > > It can just read it without owning it, but no managing it.
> > > > > >
> > > > > > > > Please, Can you describe specific crash scenario and explain how
> > > > > > > > could the
> > > > > > > locking fix it?
> > > > > > >
> > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > > > > > And because of race condition - rte_eth_dev_allocated() will
> return
> > > > > > > rte_eth_dev * for the wrong device.
> > > > > > Which wrong device do you mean? I guess it is the device which
> > > currently is
> > > > > being created by thread 0.
> > > > > > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > > > > > resources, while It can still be in use by someone else.
> > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must take
> > > ownership
> > > > > > (or validate that he is the owner) of a port before doing it(free,
> > > release), so
> > > > > no issue here.
> > > > >
> > > > > Forget about ownership for a second.
> > > > > Suppose we have a process it created ring port for itself (without
> setting
> > > any
> > > > > ownership)  and used it for some time.
> > > > > Then it decided to remove it, so it calls rte_pmd_ring_remove() for it.
> > > > > At the same time second process decides to call
> rte_eth_dev_allocate()
> > > (let
> > > > > say for anither ring port).
> > > > > They could collide trying to read (process 0) and modify (process 1)
> same
> > > > > string rte_eth_dev_data[].name.
> > > > >
> > > > Do you mean that process 0 will compare successfully the process 1
> new
> > > port name?
> > >
> > > Yes.
> > >
> > > > The state are in local process memory - so process 0 will not compare
> the
> > > process 1 port, from its point of view this port is in UNUSED
> > > > state.
> > > >
> > >
> > > Ok, and why it can't be in attached state in process 0 too?
> >
> > Someone in process 0 should attach it using protected attach_secondary
> somewhere in your scenario.
> 
> Yes, process 0 can have this port attached too, why not?
See the function with inline comments:

struct rte_eth_dev *
rte_eth_dev_allocated(const char *name)
{
	unsigned i;

	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {

		/*
		 * The state checked below is in local process memory.
		 * So, if process 1 allocates a new port here (the current i),
		 * updates its local state to ATTACHED and writes the name,
		 * that state is not visible to process 0 until someone in
		 * process 0 attaches the port with rte_eth_dev_attach_secondary.
		 * To use rte_eth_dev_attach_secondary, process 0 must take the
		 * lock, and it can't, because the lock is currently held by
		 * process 1.
		 */
		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED) &&
		    strcmp(rte_eth_devices[i].data->name, name) == 0)
			return &rte_eth_devices[i];
	}
	return NULL;
}


> Konstantin
> 
> >
> >
> > > Konstantin
> > >
> > > > > Konstantin
> > > > >
> > > > > >
> > > > > >
> > > > > > Also I'm not sure I fully understand your scenario looks like moving
> > > > > > the device state setting in allocation to be after the name setting
> will be
> > > > > good.
> > > > > > What do you think?
> > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > > > >
> > > > > > > > > > Maybe if it had been called just a moment after, It might get
> > > > > > > > > > different
> > > > > > > > > answer.
> > > > > > > > > > Because these APIs don't change ethdev structure(just
> read),
> > > > > > > > > > it can be
> > > > > > > OK.
> > > > > > > > > > But again, I can understand to use ownership lock also here.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 12:05                               ` Matan Azrad
  2018-01-17 12:54                                 ` Ananyev, Konstantin
@ 2018-01-17 14:00                                 ` Neil Horman
  2018-01-17 17:01                                   ` Ananyev, Konstantin
  2018-01-17 17:58                                   ` Matan Azrad
  1 sibling, 2 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-17 14:00 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> 
> Hi Konstantin
> From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > Hi Matan,
> > 
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > Hi Matan,
> > > >
> > > > >
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > Hi Matan,
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > > AM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > >  <snip>
> > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > same lock.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > >
> > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > > not directly
> > > > > > > > > > related.
> > > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > > for non-related data structures.
> > > > > > > > > > > >
> > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > > mistakes as much as we can by
> > > > > > > > > > using the same lock.
> > > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > > ID which currently is
> > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > >
> > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > different lock or atomic variable?
> > > > > > > > > >
> > > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > same atomic
> > > > > > > > mechanism.
> > > > > > > >
> > > > > > > > Sure but all you are doing for checking validity, is  check
> > > > > > > > that owner_id > 0 &&& owner_id < next_ownwe_id, right?
> > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > > > >
> > > > > > > It will not protect it, scenario:
> > > > > > > - current next_id is X.
> > > > > > > - call set ownership of port A with owner id X by thread 0(by
> > > > > > > user
> > > > mistake).
> > > > > > > - context switch
> > > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > > X+1
> > > > > > atomically.
> > > > > > > -  context switch
> > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > ownership.
> > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > entities) -
> > > > crash.
> > > > > >
> > > > > >
> > > > > > Ok, and how using lock will protect you with such scenario?
> > > > >
> > > > > The owner set API validation by thread 0 should fail because the
> > > > > owner
> > > > validation is included in the protected section.
> > > >
> > > > Then your validation function would fail even if you'll use atomic
> > > > ops instead of lock.
> > > No.
> > > With atomic this specific scenario will cause the validation to pass.
> > 
> > Can you explain to me how?
> > 
> > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> >               int32_t cur_owner_id = RTE_MIN(rte_atomic32_get(next_owner_id),
> > UINT16_MAX);
> > 
> > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > cur_owner_id) {
> > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > 		return 0;
> > 	}
> > 	return 1;
> > }
> > 
> > Let say your next_owne_id==X, and you invoke
> > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> 
> Explanation:
> The scenario with locks:
> next_owner_id = X.
> Thread 0 call to set API(with invalid owner Y=X) and take lock.
> Context switch.
> Thread 1 call to owner_new and stuck in the lock.
> Context switch.
> Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and return failure to the user.
> Context switch.
> Thread 1 take the lock and update X to X+1, then, unlock the lock.
> Everything is OK!
> 
> The same scenario with atomics:
> next_owner_id = X.
> Thread 0 call to set API(with invalid owner Y=X) and take lock.
> Context switch.
> Thread 1 call to owner_new and change X to X+1(atomically).
> Context switch.
> Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the lock and return success to the  user.
> Problem!
> 


Matan is correct here, there is no way to perform parallel set operations using
just an atomic variable here, because multiple reads of next_owner_id need to
be performed while it is stable.  That is to say rte_eth_next_owner_id must be
compared to RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.  If
you were to only use an atomic_read on such a variable, it could be incremented
by the owner_new function between the checks and an invalid owner value could
become valid because a third thread incremented the next value.  The state of
next_owner_id must be kept stable during any validity checks.
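
To illustrate the point above, here is a minimal sketch of validation and
assignment sharing one critical section with ID allocation.  It is not the
code from the patch: it uses a pthread mutex so it stands alone, and
ownership_lock, port_owner[], owner_new(), owner_set(), NO_OWNER and
MAX_PORTS are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NO_OWNER  0 /* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */
#define MAX_PORTS 4 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* One lock guards both the ID counter and the per-port owner field. */
static pthread_mutex_t ownership_lock = PTHREAD_MUTEX_INITIALIZER;
static uint16_t next_owner_id = 1;
static uint16_t port_owner[MAX_PORTS]; /* zero-initialized: NO_OWNER */

/* Allocate a new owner ID; returns 0 on success, -1 when IDs run out. */
static int owner_new(uint16_t *owner_id)
{
	int ret = 0;

	assert(owner_id != NULL);
	pthread_mutex_lock(&ownership_lock);
	if (next_owner_id == UINT16_MAX)
		ret = -1;
	else
		*owner_id = next_owner_id++;
	pthread_mutex_unlock(&ownership_lock);
	return ret;
}

/*
 * Validate owner_id and assign the port in one critical section, so
 * next_owner_id cannot change between the check and the assignment.
 */
static int owner_set(unsigned int port, uint16_t owner_id)
{
	int ret = -1;

	if (port >= MAX_PORTS)
		return -1;
	pthread_mutex_lock(&ownership_lock);
	if (owner_id != NO_OWNER && owner_id < next_owner_id &&
	    (port_owner[port] == NO_OWNER || port_owner[port] == owner_id)) {
		port_owner[port] = owner_id;
		ret = 0;
	}
	pthread_mutex_unlock(&ownership_lock);
	return ret;
}
```

With this layout a caller passing the not-yet-allocated ID X is rejected if
it enters owner_set() before owner_new() hands out X.  As discussed earlier
in the thread, it still cannot detect a caller that guesses an ID which has
already been allocated to someone else.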

That said, I really have to wonder why ownership ids are really needed here at
all.  It seems this design could be much simpler with the addition of a per-port
lock (and optional ownership record).  The API could consist of four
operations:

ownership_set
ownership_tryset
ownership_release
ownership_get


The first call simply takes the per-port lock (blocking if it's already
locked)

The second call is a non-blocking version of the first

The third unlocks the port, allowing others to take ownership

The fourth returns whatever ownership record you want to encode with the lock.

The addition of all this id checking seems a bit overcomplicated
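
A rough sketch of what those four operations might look like follows.  It is
only an illustration under stated assumptions: a single process, pthread
mutexes instead of DPDK locks, release done by the same thread that took
ownership, and the ownership record reduced to a string pointer.  The names
struct port_ownership, ports[], ownership_init() and MAX_PORTS are
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_PORTS 4 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

struct port_ownership {
	pthread_mutex_t lock;        /* held for as long as the port is owned */
	_Atomic(const char *) owner; /* optional record; NULL when unowned */
};

static struct port_ownership ports[MAX_PORTS];

/* Must run once before any other call in this sketch. */
static void ownership_init(void)
{
	for (unsigned int i = 0; i < MAX_PORTS; i++) {
		pthread_mutex_init(&ports[i].lock, NULL);
		atomic_store(&ports[i].owner, NULL);
	}
}

/* Blocks until the port is free, then records the owner. */
static int ownership_set(unsigned int port, const char *owner)
{
	assert(owner != NULL);
	if (port >= MAX_PORTS)
		return -1;
	pthread_mutex_lock(&ports[port].lock);
	atomic_store(&ports[port].owner, owner);
	return 0;
}

/* Non-blocking variant: returns -1 if the port is already owned. */
static int ownership_tryset(unsigned int port, const char *owner)
{
	assert(owner != NULL);
	if (port >= MAX_PORTS)
		return -1;
	if (pthread_mutex_trylock(&ports[port].lock) != 0)
		return -1;
	atomic_store(&ports[port].owner, owner);
	return 0;
}

/* Clears the record and unlocks the port; caller must be the owner. */
static int ownership_release(unsigned int port)
{
	if (port >= MAX_PORTS)
		return -1;
	atomic_store(&ports[port].owner, NULL);
	pthread_mutex_unlock(&ports[port].lock);
	return 0;
}

/* Returns the current ownership record, or NULL if the port is free. */
static const char *ownership_get(unsigned int port)
{
	if (port >= MAX_PORTS)
		return NULL;
	return atomic_load(&ports[port].owner);
}
```

In this form no owner ID has to be allocated or validated; the lock itself
is the proof of ownership.  Sharing it between primary and secondary
processes would need a lock placed in shared memory, which this sketch does
not attempt.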

Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 13:10                                   ` Matan Azrad
@ 2018-01-17 16:52                                     ` Ananyev, Konstantin
  2018-01-17 18:02                                       ` Matan Azrad
  2018-01-17 20:34                                       ` Matan Azrad
  0 siblings, 2 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-17 16:52 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

> 
> Hi Konstantin
> From: Ananyev, Konstantin, Wednesday, January 17, 2018 2:55 PM
> > >
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > > Hi Matan,
> > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018
> > > > > > > > > > > 2:02 AM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January
> > > > > > > > > > > > > > > 10,
> > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > >  <snip>
> > > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > same lock.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[]
> > > > > > > > > > > > > > are not directly
> > > > > > > > > > > > related.
> > > > > > > > > > > > > > You may create new owner_id but it doesn't mean
> > > > > > > > > > > > > > you would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > It is not very good coding practice to use same
> > > > > > > > > > > > > > lock for non-related data structures.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > Since the ownership mechanism synchronization is
> > > > > > > > > > > > > in ethdev responsibility, we must protect against
> > > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > So, if user try to set by invalid owner (exactly
> > > > > > > > > > > > > the ID which currently is
> > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > > >
> > > > > > > > > > > The set ownership API is protected by ownership lock
> > > > > > > > > > > and checks the owner ID validity By reading the next owner
> > ID.
> > > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > > same atomic
> > > > > > > > > > mechanism.
> > > > > > > > > >
> > > > > > > > > > Sure, but all you are doing to check validity is
> > > > > > > > > > checking that owner_id > 0 && owner_id < next_owner_id,
> > > > > > > > > > right?
> > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you
> > > > > > > > > > can safely do the same check with just
> > > > > > > > > > atomic_get(&next_owner_id).
> > > > > > > > > >
> > > > > > > > > It will not protect it, scenario:
> > > > > > > > > - current next_id is X.
> > > > > > > > > - call set ownership of port A with owner id X by thread
> > > > > > > > > 0(by user
> > > > > > mistake).
> > > > > > > > > - context switch
> > > > > > > > > - allocate new id by thread 1 and get X and change next_id
> > > > > > > > > to
> > > > > > > > > X+1
> > > > > > > > atomically.
> > > > > > > > > -  context switch
> > > > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > > > ownership.
> > > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > > entities) -
> > > > > > crash.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > >
> > > > > > > The owner set API validation by thread 0 should fail because
> > > > > > > the owner
> > > > > > validation is included in the protected section.
> > > > > >
> > > > > > Then your validation function would fail even if you'll use
> > > > > > atomic ops instead of lock.
> > > > > No.
> > > > > With atomic this specific scenario will cause the validation to pass.
> > > >
> > > > Can you explain to me how?
> > > >
> > > > int rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > 	int32_t cur_owner_id = RTE_MIN(rte_atomic32_get(&next_owner_id),
> > > > 				       UINT16_MAX);
> > > >
> > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
> > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > 		return 0;
> > > > 	}
> > > > 	return 1;
> > > > }
> > > >
> > > > Let's say your next_owner_id==X, and you invoke
> > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > >
> > > Explanation:
> > > The scenario with locks:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> >
> > Ok I see what you mean.
> > But, as I said before, if thread 0 grabs the lock first - you'll experience the
> > same failure.
> > I understand now that for some reason you treat these two scenarios as
> > something different, but for me it is pretty much the same case.
> > And to me it means that neither a lock nor an atomic can fully protect you
> > here.
> >
> 
> I agree that we are not fully protected even when using locks, but one lock is more protective than either atomics or 2 different locks.
> So, I think keeping it as is (with one lock) makes sense.

OK, if that is your preference - let's keep your current approach here.

> 
> > > Context switch.
> > > Thread 1 call to owner_new and stuck in the lock.
> > > Context switch.
> > > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and
> > return failure to the user.
> > > Context switch.
> > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > Everything is OK!
> > >
> > > The same scenario with atomics:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > Context switch.
> > > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the
> > lock and return success to the  user.
> > > Problem!
> > >
> > > > > With lock no next_id changes can be done while the thread is in
> > > > > the set
> > > > API.
> > > > >
> > > > > > But in fact your code is not protected for that scenario -
> > > > > > doesn't matter will you'll use lock or atomic ops.
> > > > > > Let's considerer your current code with the following scenario:
> > > > > >
> > > > > > next_owner_id  == 1
> > > > > > 1) Process 0:
> > > > > >      rte_eth_dev_owner_new(&owner_id);
> > > > > >      now owner_id == 1 and next_owner_id == 2
> > > > > > 2) Process 1 (by mistake):
> > > > > >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > > complete successfully, as owner_id ==1 is considered as valid.
> > > > > > 3) Process 0:
> > > > > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > > also complete with success, as owner->id is valid is equal to
> > > > > > current port
> > > > owner_id.
> > > > > > So you finished with 2 processes assuming that they do own
> > > > > > exclusively then same port.
> > > > > >
> > > > > > Honestly in that situation  locking around nest_owner_id
> > > > > > wouldn't give you any advantages over atomic ops.
> > > > > >
> > > > >
> > > > > This is a different scenario that we can't protect on it with atomic or
> > locks.
> > > > > But for the first scenario I described I think we can.
> > > > > Please read it again, I described it step by step.
> > > > >
> > > > > > >
> > > > > > > > I don't think you can protect yourself against such scenario
> > > > > > > > with or without locking.
> > > > > > > > Unless you'll make it harder for the mis-behaving thread to
> > > > > > > > guess valid owner_id, or add some extra logic here.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > > The set(and others) ownership APIs already uses the
> > > > > > > > > > > ownership lock so I
> > > > > > > > > > think it makes sense to use the same lock also in ID allocation.
> > > > > > > > > > >
> > > > > > > > > > > > > > > > In fact, for next_owner_id, you don't need a
> > > > > > > > > > > > > > > > lock - just rte_atomic_t should be enough.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > IMO it is not that complicated, something like
> > > > > > > > > > > > > > that should work I
> > > > > > > > think.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > /* init to 0 at startup */
> > > > > > > > > > > > > > rte_atomic32_t owner_id;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > > > > >     int32_t x;
> > > > > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > > > > > >     } else
> > > > > > > > > > > > > >         return x;
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Why not just to keep it simple and using the same
> > lock?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Lock is also fine, I just think it better be a separate
> > > > > > > > > > > > > > one
> > > > > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > > > > Though if you are going to use uuid here - all that
> > > > > > > > > > > > > > probably not relevant any more.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I agree about the uuid but still think the same lock
> > > > > > > > > > > > > should be used for
> > > > > > > > > > both.
> > > > > > > > > > > >
> > > > > > > > > > > > But with uuid you don't need next_owner_id at all, right?
> > > > > > > > > > > > So lock will only be used for rte_eth_dev_data[] fields
> > > > anyway.
> > > > > > > > > > > >
> > > > > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > > > > >
> > > > > > > > > > Ah ok, my thought uuid_t is better as with it you don't need to
> > > > > > > > > > support your own code to allocate new owner_id, but rely on
> > > > > > > > > > system libs
> > > > > > > > instead.
> > > > > > > > > > But wouldn't insist here.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > > > > Another alternative would be to use 2 locks - one
> > > > > > > > > > > > > > > > for next_owner_id second for actual data[]
> > protection.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Another thing - you'll probably need to
> > grab/release
> > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > It is a public function used by drivers, so need to
> > > > > > > > > > > > > > > > be protected
> > > > > > > > too.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Yes, I thought about it, but decided not to use lock in
> > > > next:
> > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As I can see in patch #3 you protect by lock access to
> > > > > > > > > > > > > > rte_eth_dev_data[].name (which seems like a good
> > > > thing).
> > > > > > > > > > > > > > So I think any other public function that access
> > > > > > > > > > > > > > rte_eth_dev_data[].name should be protected by the
> > > > same
> > > > > > lock.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I don't think so, I can understand to use the ownership
> > > > > > > > > > > > > lock here(as in port
> > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > Don't you think it is just timing?(ask in the next moment
> > > > > > > > > > > > > and you may get another answer) I don't see optional
> > crash.
> > > > > > > > > > > >
> > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> > identifies
> > > > > > > > > > > > device and is used by  port allocation/release/find
> > functions.
> > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > "1. The port allocation and port release synchronization
> > > > > > > > > > > > will be managed by ethdev."
> > > > > > > > > > > > To me it means that ethdev layer has to make sure that all
> > > > > > > > > > > > accesses to rte_eth_dev_data[].name are atomic.
> > > > > > > > > > > > Otherwise what would prevent the situation when one
> > > > process
> > > > > > > > > > > > does
> > > > > > > > > > > > rte_eth_dev_allocate()-
> > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > ...) while second one does
> > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > > >
> > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > >
> > > > > > > > > > Under race condition - in the worst case it might crash, though
> > > > > > > > > > for that you'll have to be really unlucky.
> > > > > > > > > > Though in most cases as you said it would just not operate
> > > > correctly.
> > > > > > > > > > I think if we start to protect dev->name by lock we need to do
> > > > > > > > > > it for all instances (both read and write).
> > > > > > > > > >
> > > > > > > > > Since under the ownership rules, the user must take ownership
> > of a
> > > > > > > > > port
> > > > > > > > before using it, I still don't see a problem here.
> > > > > > > >
> > > > > > > > I am not talking about owner id or name here.
> > > > > > > > I am talking about dev->name.
> > > > > > > >
> > > > > > > So? The user still should take ownership of a device before using it
> > (by
> > > > > > name or by port id).
> > > > > > > It can just read it without owning it, but no managing it.
> > > > > > >
> > > > > > > > > Please, Can you describe specific crash scenario and explain how
> > > > > > > > > could the
> > > > > > > > locking fix it?
> > > > > > > >
> > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()->strcmp().
> > > > > > > > And because of race condition - rte_eth_dev_allocated() will
> > return
> > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > Which wrong device do you mean? I guess it is the device which
> > > > currently is
> > > > > > being created by thread 0.
> > > > > > > > Then rte_pmd_ring_remove() will call rte_free() for related
> > > > > > > > resources, while It can still be in use by someone else.
> > > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must take
> > > > ownership
> > > > > > > (or validate that he is the owner) of a port before doing it(free,
> > > > release), so
> > > > > > no issue here.
> > > > > >
> > > > > > Forget about ownership for a second.
> > > > > > Suppose we have a process it created ring port for itself (without
> > setting
> > > > any
> > > > > > ownership)  and used it for some time.
> > > > > > Then it decided to remove it, so it calls rte_pmd_ring_remove() for it.
> > > > > > At the same time second process decides to call
> > rte_eth_dev_allocate()
> > > > (let
> > > > > > say for anither ring port).
> > > > > > They could collide trying to read (process 0) and modify (process 1)
> > same
> > > > > > string rte_eth_dev_data[].name.
> > > > > >
> > > > > Do you mean that process 0 will compare successfully the process 1
> > new
> > > > port name?
> > > >
> > > > Yes.
> > > >
> > > > > The state are in local process memory - so process 0 will not compare
> > the
> > > > process 1 port, from its point of view this port is in UNUSED
> > > > > state.
> > > > >
> > > >
> > > > Ok, and why it can't be in attached state in process 0 too?
> > >
> > > Someone in process 0 should attach it using protected attach_secondary
> > somewhere in your scenario.
> >
> > Yes, process 0 can have this port attached too, why not?
> See the function with inline comments:
> 
> struct rte_eth_dev *
> rte_eth_dev_allocated(const char *name)
> {
> 	unsigned i;
> 
> 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> 
> 		The state below is in local process memory.
> 		So, if process 1 allocates a new port here (the current i), updates its local state to ATTACHED and writes the name,
> 		the state is not visible to process 0 until someone in process 0 attaches it with rte_eth_dev_attach_secondary.
> 		So, to use rte_eth_dev_attach_secondary, process 0 must take the lock, and it can't because it is currently locked by
> 		process 1.

Ok I see.
Thanks for your patience.
BTW, does that mean that if, let's say, process 0 calls rte_eth_dev_allocate("xxx")
and process 1 calls rte_eth_dev_allocate("yyy"), we can end up with the
same port_id being used for different devices, and the 2 processes will overwrite the
same rte_eth_dev_data[port_id]?
Konstantin

> 
> 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED) &&
> 		strcmp(rte_eth_devices[i].data->name, name) == 0)
> 			return &rte_eth_devices[i];
> 	}
> 	return NULL;
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 14:00                                 ` Neil Horman
@ 2018-01-17 17:01                                   ` Ananyev, Konstantin
  2018-01-18 13:10                                     ` Neil Horman
  2018-01-18 16:27                                     ` Neil Horman
  2018-01-17 17:58                                   ` Matan Azrad
  1 sibling, 2 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-17 17:01 UTC (permalink / raw)
  To: Neil Horman, Matan Azrad
  Cc: Thomas Monjalon, Gaetan Rivet, Wu, Jingjing, dev, Richardson, Bruce



> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Wednesday, January 17, 2018 2:00 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> 
> On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > Hi Matan,
> > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > Hi Matan,
> > > > >
> > > > > >
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > > > AM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > >  <snip>
> > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > same lock.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > > > not directly
> > > > > > > > > > > related.
> > > > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > > > for non-related data structures.
> > > > > > > > > > > > >
> > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > > > mistakes as much as we can by
> > > > > > > > > > > using the same lock.
> > > > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > > > ID which currently is
> > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > >
> > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > >
> > > > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > same atomic
> > > > > > > > > mechanism.
> > > > > > > > >
> > > > > > > > > Sure, but all you are doing to check validity is checking
> > > > > > > > > that owner_id > 0 && owner_id < next_owner_id, right?
> > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > > > safely do the same check with just atomic_get(&next_owner_id).
> > > > > > > > >
> > > > > > > > It will not protect it, scenario:
> > > > > > > > - current next_id is X.
> > > > > > > > - call set ownership of port A with owner id X by thread 0(by
> > > > > > > > user
> > > > > mistake).
> > > > > > > > - context switch
> > > > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > > > X+1
> > > > > > > atomically.
> > > > > > > > -  context switch
> > > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > > ownership.
> > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > entities) -
> > > > > crash.
> > > > > > >
> > > > > > >
> > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > >
> > > > > > The owner set API validation by thread 0 should fail because the
> > > > > > owner
> > > > > validation is included in the protected section.
> > > > >
> > > > > Then your validation function would fail even if you'll use atomic
> > > > > ops instead of lock.
> > > > No.
> > > > With atomic this specific scenario will cause the validation to pass.
> > >
> > > Can you explain to me how?
> > >
> > > int rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > 	int32_t cur_owner_id = RTE_MIN(rte_atomic32_get(&next_owner_id),
> > > 				       UINT16_MAX);
> > >
> > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
> > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > 		return 0;
> > > 	}
> > > 	return 1;
> > > }
> > >
> > > Let's say your next_owner_id==X, and you invoke
> > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> >
> > Explanation:
> > The scenario with locks:
> > next_owner_id = X.
> > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > Context switch.
> > Thread 1 call to owner_new and stuck in the lock.
> > Context switch.
> > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and return failure to the user.
> > Context switch.
> > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > Everything is OK!
> >
> > The same scenario with atomics:
> > next_owner_id = X.
> > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > Context switch.
> > Thread 1 call to owner_new and change X to X+1(atomically).
> > Context switch.
> > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the lock and return success to the  user.
> > Problem!
> >
> 
> 
> Matan is correct here, there is no way to perform parallel set operations using
> just an atomic variable here, because multiple reads of next_owner_id need to
> be performed while it is stable.  That is to say rte_eth_next_owner_id must be
> compared to RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.  If
> you were to only use an atomic_read on such a variable, it could be incremented
> by the owner_new function between the checks and an invalid owner value could
> become valid because a third thread incremented the next value.  The state of
> next_owner_id must be kept stable during any validity checks.

It could still be incremented between the checks - if, let's say, a different thread
invokes new_owner_id, grabs the lock, updates the counter, and releases the lock - all that
before the check.
But OK, there is probably no point in arguing about that one any longer -
let's keep the lock here; nothing will be broken with it for sure.
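
To illustrate the point, here is a rough sketch (not the patch code) in which
owner_new and owner_set share one lock.  The names, the single port_owner
variable and the atomic_flag spinlock are assumptions made for the example.
The lock keeps next_owner_id stable inside each call, but if owner_new
completes before owner_set takes the lock, a guessed id still passes the check:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0 /* stands in for RTE_ETH_DEV_NO_OWNER */

/* One lock for both the id counter and the port's owner field. */
static atomic_flag ownership_lock = ATOMIC_FLAG_INIT;
static uint16_t next_owner_id = 1;
static uint16_t port_owner = NO_OWNER; /* a single port, for brevity */

static void
ownership_lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&ownership_lock,
						 memory_order_acquire))
		; /* spin */
}

static void
ownership_lock_drop(void)
{
	atomic_flag_clear_explicit(&ownership_lock, memory_order_release);
}

/* Allocate a fresh owner id. */
static uint16_t
owner_new(void)
{
	uint16_t id;

	ownership_lock_take();
	assert(next_owner_id < UINT16_MAX); /* no wraparound handling here */
	id = next_owner_id++;
	ownership_lock_drop();
	return id;
}

/* Validate the id and take the port, all under the same lock. */
static int
owner_set(uint16_t owner_id)
{
	int ret = -1;

	ownership_lock_take();
	if (owner_id != NO_OWNER && owner_id < next_owner_id &&
	    (port_owner == NO_OWNER || port_owner == owner_id)) {
		port_owner = owner_id;
		ret = 0;
	}
	ownership_lock_drop();
	return ret;
}
```

With this sketch, owner_set(1) is rejected before any owner_new() call, but the
same call succeeds once some other caller's owner_new() has returned 1, even
though the first caller never allocated that id.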

> 
> That said, I have to wonder why ownership ids are needed here at
> all.  It seems this design could be much simpler with the addition of a per-port
> lock (and optional ownership record).  The API could consist of four
> operations:
> 
> ownership_set
> ownership_tryset
> ownership_release
> ownership_get
> 

OK, but how do we distinguish who is the current owner of the port,
to make sure that only the owner is allowed to perform control ops?
Konstantin

> 
> The first call simply tries to take the per-port lock (blocking if it's already
> locked).
> 
> The second call is a non-blocking version of the first
> 
> The third unlocks the port, allowing others to take ownership
> 
> The fourth returns whatever ownership record you want to encode with the lock.
> 
> The addition of all this id checking seems a bit overcomplicated
> 
> Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 14:00                                 ` Neil Horman
  2018-01-17 17:01                                   ` Ananyev, Konstantin
@ 2018-01-17 17:58                                   ` Matan Azrad
  2018-01-18 13:20                                     ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-17 17:58 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce


Hi Neil

 From: Neil Horman, Wednesday, January 17, 2018 4:00 PM
> On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > Hi Matan,
> > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > Hi Matan,
> > > > >
> > > > > >
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > Hi Matan,
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018
> > > > > > > > > > 2:02 AM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11,
> > > > > > > > > > > > 2018
> > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January
> > > > > > > > > > > > > > 10,
> > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > >  <snip>
> > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > same lock.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[]
> > > > > > > > > > > > > are not directly
> > > > > > > > > > > related.
> > > > > > > > > > > > > You may create new owner_id but it doesn't mean
> > > > > > > > > > > > > you would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > It is not very good coding practice to use same
> > > > > > > > > > > > > lock for non-related data structures.
> > > > > > > > > > > > >
> > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > Since the ownership mechanism synchronization is
> > > > > > > > > > > > in ethdev responsibility, we must protect against
> > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > using the same lock.
> > > > > > > > > > > > So, if user try to set by invalid owner (exactly
> > > > > > > > > > > > the ID which currently is
> > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > >
> > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > >
> > > > > > > > > > The set ownership API is protected by ownership lock
> > > > > > > > > > and checks the owner ID validity By reading the next owner
> ID.
> > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > same atomic
> > > > > > > > > mechanism.
> > > > > > > > >
> > > > > > > > > Sure, but all you are doing to check validity is
> > > > > > > > > checking that owner_id > 0 && owner_id < next_owner_id,
> > > > > > > > > right?
> > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you
> > > > > > > > > can safely do the same check with just
> > > > > > > > > atomic_get(&next_owner_id).
> > > > > > > > >
> > > > > > > > It will not protect it, scenario:
> > > > > > > > - current next_id is X.
> > > > > > > > - call set ownership of port A with owner id X by thread 0
> > > > > > > >   (by user mistake).
> > > > > > > > - context switch
> > > > > > > > - allocate new id by thread 1 and get X and change next_id
> > > > > > > >   to X+1 atomically.
> > > > > > > > - context switch
> > > > > > > > - Thread 0 validates X by atomic_read and succeeds in taking
> > > > > > > >   ownership.
> > > > > > > > - The system lost the port (or it will be managed by two
> > > > > > > >   entities) - crash.
> > > > > > >
> > > > > > >
> > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > >
> > > > > > The owner set API validation by thread 0 should fail because
> > > > > > the owner
> > > > > validation is included in the protected section.
> > > > >
> > > > > Then your validation function would fail even if you'll use
> > > > > atomic ops instead of lock.
> > > > No.
> > > > With atomic this specific scenario will cause the validation to pass.
> > >
> > > Can you explain to me how?
> > >
> > > int
> > > rte_eth_is_valid_owner_id(uint16_t owner_id)
> > > {
> > > 	int32_t cur_owner_id =
> > > 		RTE_MIN(rte_atomic32_read(&next_owner_id), UINT16_MAX);
> > >
> > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner_id > cur_owner_id) {
> > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > 		return 0;
> > > 	}
> > > 	return 1;
> > > }
> > >
> > > Let's say your next_owner_id==X, and you invoke
> > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> >
> > Explanation:
> > The scenario with locks:
> > next_owner_id = X.
> > Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
> > Context switch.
> > Thread 1 calls owner_new and blocks on the lock.
> > Context switch.
> > Thread 0 does the owner ID validation and fails (Y>=X), unlocks the lock
> > and returns failure to the user.
> > Context switch.
> > Thread 1 takes the lock and updates X to X+1, then unlocks the lock.
> > Everything is OK!
> >
> > The same scenario with atomics:
> > next_owner_id = X.
> > Thread 0 calls the set API (with invalid owner Y=X) and takes the lock.
> > Context switch.
> > Thread 1 calls owner_new and changes X to X+1 (atomically).
> > Context switch.
> > Thread 0 does the owner ID validation and succeeds (Y<(atomic)X+1),
> > unlocks the lock and returns success to the user.
> > Problem!
> >
> 
> 
> Matan is correct here, there is no way to perform parallel set operations
> using just an atomic variable here, because multiple reads of
> next_owner_id need to be performed while it is stable.  That is to say
> rte_eth_next_owner_id must be compared to RTE_ETH_DEV_NO_OWNER
> and owner_id in rte_eth_is_valid_owner_id.  If you were to only use an
> atomic_read on such a variable, it could be incremented by the owner_new
> function between the checks and an invalid owner value could become valid
> because a third thread incremented the next value.  The state of
> next_owner_id must be kept stable during any validity checks.
> 
> That said, I really have to wonder why ownership ids are really needed here
> at all.  It seems this design could be much simpler with the addition of a
> per-port lock (and optional ownership record).  The API could consist of four
> operations:
> 
> ownership_set
> ownership_tryset
> ownership_release
> ownership_get
> 
> 
> The first call simply tries to take the per-port lock (blocking if it's
> already locked).
> 

A per-port lock is not good because the ownership mechanism must be synchronized with port creation/release.
So port creation and port ownership should use the same lock.

I didn't find a precedent for a blocking function in ethdev.
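
To illustrate the point, here is a minimal sketch (not the patch code) in which port allocation, port release and owner setting all share one lock, and the set call returns an error instead of waiting when the port already has another owner. All names (sketch_*, SKETCH_MAX_PORTS) are hypothetical, and a C11 atomic_flag stands in for the ethdev spinlock:

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_OWNER  0

/* Hypothetical port record: name plus current owner. */
struct sketch_port {
	char name[32];      /* empty string means the slot is free */
	uint16_t owner_id;  /* SKETCH_NO_OWNER when unowned */
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];
/* One lock for both the port table and the owner fields. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

static void sketch_lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_lock,
						 memory_order_acquire))
		; /* spin */
}

static void sketch_lock_release(void)
{
	atomic_flag_clear_explicit(&sketch_lock, memory_order_release);
}

/* Allocate the first free slot; returns the port id or -ENOSPC. */
static int sketch_port_allocate(const char *name)
{
	int i, ret = -ENOSPC;

	sketch_lock_take();
	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (sketch_ports[i].name[0] == '\0') {
			snprintf(sketch_ports[i].name,
				 sizeof(sketch_ports[i].name), "%s", name);
			sketch_ports[i].owner_id = SKETCH_NO_OWNER;
			ret = i;
			break;
		}
	}
	sketch_lock_release();
	return ret;
}

/* Free a slot; the owner is cleared in the same critical section. */
static void sketch_port_release(int port_id)
{
	sketch_lock_take();
	sketch_ports[port_id].name[0] = '\0';
	sketch_ports[port_id].owner_id = SKETCH_NO_OWNER;
	sketch_lock_release();
}

/* Take ownership; fails at once if the port is gone or owned by another. */
static int sketch_owner_set(int port_id, uint16_t owner_id)
{
	int ret = 0;

	if (port_id < 0 || port_id >= SKETCH_MAX_PORTS ||
	    owner_id == SKETCH_NO_OWNER)
		return -EINVAL;

	sketch_lock_take();
	if (sketch_ports[port_id].name[0] == '\0')
		ret = -ENODEV;
	else if (sketch_ports[port_id].owner_id != SKETCH_NO_OWNER &&
		 sketch_ports[port_id].owner_id != owner_id)
		ret = -EPERM;
	else
		sketch_ports[port_id].owner_id = owner_id;
	sketch_lock_release();
	return ret;
}
```

Because release and set run under the same lock, a set can never succeed on a port that is being released at that moment; with a separate per-port lock this would need extra coordination.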

> The second call is a non-blocking version of the first
> 
> The third unlocks the port, allowing others to take ownership
> 
> The fourth returns whatever ownership record you want to encode with the
> lock.
> 
> The addition of all this id checking seems a bit overcomplicated

You are missing the identification of the owner: we want to expose owner info for printing and easy debugging.
And it makes sense to manage owner uniqueness with a unique ID.
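
As a rough sketch of that idea (hypothetical demo_* names, a single port for brevity, a C11 atomic_flag in place of the ethdev lock), the owner record carries both an ID and a printable name, and the ID is allocated and validated under the same lock:

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_NO_OWNER       0
#define DEMO_OWNER_NAME_LEN 64

/* Hypothetical owner record: unique ID plus a name for logs and debug. */
struct demo_owner {
	uint16_t id;
	char name[DEMO_OWNER_NAME_LEN];
};

static uint16_t demo_next_owner_id = DEMO_NO_OWNER + 1;
static struct demo_owner demo_port_owner;  /* owner of one example port */
static atomic_flag demo_lock_flag = ATOMIC_FLAG_INIT;

static void demo_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&demo_lock_flag,
						 memory_order_acquire))
		; /* spin */
}

static void demo_unlock(void)
{
	atomic_flag_clear_explicit(&demo_lock_flag, memory_order_release);
}

/* Allocate a fresh owner ID; no wraparound, fails when IDs run out. */
static int demo_owner_new(uint16_t *owner_id)
{
	int ret = 0;

	demo_lock();
	if (demo_next_owner_id == UINT16_MAX)
		ret = -ENOSPC;
	else
		*owner_id = demo_next_owner_id++;
	demo_unlock();
	return ret;
}

/* Caller must hold the lock, so next_owner_id cannot move underneath. */
static int demo_is_valid_owner_id(uint16_t id)
{
	return id != DEMO_NO_OWNER && id < demo_next_owner_id;
}

/* Validation and the ownership update share one critical section. */
static int demo_owner_set(const struct demo_owner *owner)
{
	int ret = 0;

	demo_lock();
	if (!demo_is_valid_owner_id(owner->id))
		ret = -EINVAL;
	else if (demo_port_owner.id != DEMO_NO_OWNER &&
		 demo_port_owner.id != owner->id)
		ret = -EPERM;
	else
		demo_port_owner = *owner;
	demo_unlock();
	return ret;
}
```

A set call with an ID that has not been handed out yet is rejected, and a concurrent demo_owner_new() cannot change that verdict in the middle of the check because it has to wait for the same lock.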

The API was already discussed at length in the previous version. Do you really want to reopen it now?

> Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 16:52                                     ` Ananyev, Konstantin
@ 2018-01-17 18:02                                       ` Matan Azrad
  2018-01-17 20:34                                       ` Matan Azrad
  1 sibling, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-17 18:02 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Wednesday, January 17, 2018 6:53 PM
> Hi Matan,
> 
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Wednesday, January 17, 2018 2:55 PM
> > > >
> > > >
> > > > Hi Konstantin
> > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24
> > > > PM
> > > > > Hi Matan,
> > > > >
> > > > > > Hi Konstantin
> > > > > >
> > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > > Hi Matan,
> > > > > > >
> > > > > > > >
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > 1:45 PM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > 2018
> > > > > > > > > > > > 2:02 AM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January
> > > > > > > > > > > > > > 11,
> > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[] is
> > > > > > > > > > > > > > > > > lock protected, but it might be not very
> > > > > > > > > > > > > > > > > plausible to protect both data[] and
> > > > > > > > > > > > > > > > > next_owner_id using the
> > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > You may create new owner_id but it doesn't
> > > > > > > > > > > > > > > mean you would update rte_eth_dev_data[]
> immediately.
> > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > update rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > > It is not very good coding practice to use
> > > > > > > > > > > > > > > same lock for non-related data structures.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > Since the ownership mechanism synchronization
> > > > > > > > > > > > > > is in ethdev responsibility, we must protect
> > > > > > > > > > > > > > against user mistakes as much as we can by
> > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > >
> > > > > > > > > > > > The set ownership API is protected by ownership
> > > > > > > > > > > > lock and checks the owner ID validity By reading
> > > > > > > > > > > > the next owner
> > > ID.
> > > > > > > > > > > > So, the owner ID allocation and set API should use
> > > > > > > > > > > > the same atomic
> > > > > > > > > > > mechanism.
> > > > > > > > > > >
> > > > > > > > > > > Sure but all you are doing for checking validity, is
> > > > > > > > > > > check that owner_id > 0 && owner_id <
> > > > > > > > > > > next_owner_id, right?
> > > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits)
> > > > > > > > > > > you can safely do same check with just
> > > > > > > > > > > atomic_get(&next_owner_id).
> > > > > > > > > > >
> > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > - current next_id is X.
> > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > >   thread 0 (by user mistake).
> > > > > > > > > > - context switch
> > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > >   next_id to X+1 atomically.
> > > > > > > > > > - context switch
> > > > > > > > > > - Thread 0 validates X by atomic_read and succeeds in
> > > > > > > > > >   taking ownership.
> > > > > > > > > > - The system lost the port (or it will be managed by
> > > > > > > > > >   two entities) - crash.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > >
> > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > because the owner
> > > > > > > validation is included in the protected section.
> > > > > > >
> > > > > > > Then your validation function would fail even if you'll use
> > > > > > > atomic ops instead of lock.
> > > > > > No.
> > > > > > With atomic this specific scenario will cause the validation to pass.
> > > > >
> > > > > Can you explain to me how?
> > > > >
> > > > > int
> > > > > rte_eth_is_valid_owner_id(uint16_t owner_id)
> > > > > {
> > > > > 	int32_t cur_owner_id =
> > > > > 		RTE_MIN(rte_atomic32_read(&next_owner_id), UINT16_MAX);
> > > > >
> > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > 	    owner_id > cur_owner_id) {
> > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > 		return 0;
> > > > > 	}
> > > > > 	return 1;
> > > > > }
> > > > >
> > > > > Let's say your next_owner_id==X, and you invoke
> > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > >
> > > > Explanation:
> > > > The scenario with locks:
> > > > next_owner_id = X.
> > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > >
> > > Ok I see what you mean.
> > > But, as I said before, if thread 0 will grab the lock first - you'll
> > > experience the same failure.
> > > I understand now that by some reason you treat these two scenarios
> > > as something different, but for me it is pretty much the same case.
> > > And to me it means that neither lock, neither atomic can fully
> > > protect you here.
> > >
> >
> > I agree that we are not fully protected even when using locks, but one
> > lock gives more protection than either atomics or 2 different locks.
> > So, I think keeping it as is (with one lock) makes sense.
> 
> Ok, if that is your preference, let's keep your current approach here.
> 
> >
> > > > Context switch.
> > > > Thread 1 call to owner_new and stuck in the lock.
> > > > Context switch.
> > > > Thread 0 does owner id validation and failed(Y>=X) - unlock the
> > > > lock and
> > > return failure to the user.
> > > > Context switch.
> > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > Everything is OK!
> > > >
> > > > The same scenario with atomics:
> > > > next_owner_id = X.
> > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > Context switch.
> > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > Context switch.
> > > > Thread 0 does owner id validation and success(Y<(atomic)X+1) -
> > > > unlock the
> > > lock and return success to the  user.
> > > > Problem!
> > > >
> > > > > > With lock no next_id changes can be done while the thread is
> > > > > > in the set
> > > > > API.
> > > > > >
> > > > > > > But in fact your code is not protected against that scenario,
> > > > > > > no matter whether you use a lock or atomic ops.
> > > > > > > Let's consider your current code with the following scenario:
> > > > > > >
> > > > > > > next_owner_id  == 1
> > > > > > > 1) Process 0:
> > > > > > >      rte_eth_dev_owner_new(&owner_id);
> > > > > > >      now owner_id == 1 and next_owner_id == 2
> > > > > > > 2) Process 1 (by mistake):
> > > > > > >     rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > > > complete successfully, as owner_id ==1 is considered as valid.
> > > > > > > 3) Process 0:
> > > > > > >       rte_eth_dev_owner_set(port_id=1, owner->id=1); It will
> > > > > > > also complete with success, as owner->id is valid is equal
> > > > > > > to current port
> > > > > owner_id.
> > > > > > > So you finished with 2 processes assuming that they do own
> > > > > > > exclusively then same port.
> > > > > > >
> > > > > > > Honestly in that situation locking around next_owner_id
> > > > > > > wouldn't give you any advantages over atomic ops.
> > > > > > >
> > > > > >
> > > > > > This is a different scenario that we can't protect on it with
> > > > > > atomic or
> > > locks.
> > > > > > But for the first scenario I described I think we can.
> > > > > > Please read it again, I described it step by step.
> > > > > >
> > > > > > > >
> > > > > > > > > I don't think you can protect yourself against such
> > > > > > > > > scenario with or without locking.
> > > > > > > > > Unless you'll make it harder for the mis-behaving thread
> > > > > > > > > to guess valid owner_id, or add some extra logic here.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > > The set(and others) ownership APIs already uses
> > > > > > > > > > > > the ownership lock so I
> > > > > > > > > > > think it makes sense to use the same lock also in ID
> allocation.
> > > > > > > > > > > >
> > > > > > > > > > > > > > > > > In fact, for next_owner_id, you don't
> > > > > > > > > > > > > > > > > need a lock - just rte_atomic_t should be
> enough.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > IMO it is not that complicated, something
> > > > > > > > > > > > > > > like that should work I
> > > > > > > > > think.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > /* init to 0 at startup */
> > > > > > > > > > > > > > > rte_atomic32_t owner_id;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > > > > > >     int32_t x;
> > > > > > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > > > > > >         rte_atomic32_dec(&owner_id);
> > > > > > > > > > > > > > >         return -EOVERFLOW;
> > > > > > > > > > > > > > >     } else
> > > > > > > > > > > > > > >         return x;
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Why not just to keep it simple and using
> > > > > > > > > > > > > > > > the same
> > > lock?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Lock is also fine, I just think it better be
> > > > > > > > > > > > > > > a separate one
> > > > > > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > > > > > Though if you are going to use uuid here -
> > > > > > > > > > > > > > > all that probably not relevant any more.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I agree about the uuid but still think the
> > > > > > > > > > > > > > same lock should be used for
> > > > > > > > > > > both.
> > > > > > > > > > > > >
> > > > > > > > > > > > > But with uuid you don't need next_owner_id at all,
> right?
> > > > > > > > > > > > > So lock will only be used for rte_eth_dev_data[]
> > > > > > > > > > > > > fields
> > > > > anyway.
> > > > > > > > > > > > >
> > > > > > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > > > > > >
> > > > > > > > > > > Ah ok, my thought uuid_t is better as with it you
> > > > > > > > > > > don't need to support your own code to allocate new
> > > > > > > > > > > owner_id, but rely on system libs
> > > > > > > > > instead.
> > > > > > > > > > > But wouldn't insist here.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Another alternative would be to use 2
> > > > > > > > > > > > > > > > > locks - one for next_owner_id second for
> > > > > > > > > > > > > > > > > actual data[]
> > > protection.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Another thing - you'll probably need to
> > > grab/release
> > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > It is a public function used by drivers,
> > > > > > > > > > > > > > > > > so need to be protected
> > > > > > > > > too.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Yes, I thought about it, but decided not
> > > > > > > > > > > > > > > > to use lock in
> > > > > next:
> > > > > > > > > > > > > > > > rte_eth_dev_allocated rte_eth_dev_count
> > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As I can see in patch #3 you protect by lock
> > > > > > > > > > > > > > > access to rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > seems like a good
> > > > > thing).
> > > > > > > > > > > > > > > So I think any other public function that
> > > > > > > > > > > > > > > access rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > protected by the
> > > > > same
> > > > > > > lock.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I don't think so, I can understand to use the
> > > > > > > > > > > > > > ownership lock here(as in port
> > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > Don't you think it is just timing?(ask in the
> > > > > > > > > > > > > > next moment and you may get another answer) I
> > > > > > > > > > > > > > don't see optional
> > > crash.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> > > identifies
> > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > allocation/release/find
> > > functions.
> > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > To me it means that ethdev layer has to make
> > > > > > > > > > > > > sure that all accesses to rte_eth_dev_data[].name are
> atomic.
> > > > > > > > > > > > > Otherwise what would prevent the situation when
> > > > > > > > > > > > > one
> > > > > process
> > > > > > > > > > > > > does
> > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > > > >
> > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > >
> > > > > > > > > > > Under race condition - in the worst case it might
> > > > > > > > > > > crash, though for that you'll have to be really unlucky.
> > > > > > > > > > > Though in most cases as you said it would just not
> > > > > > > > > > > operate
> > > > > correctly.
> > > > > > > > > > > I think if we start to protect dev->name by lock we
> > > > > > > > > > > need to do it for all instances (both read and write).
> > > > > > > > > > >
> > > > > > > > > > Since under the ownership rules, the user must take
> > > > > > > > > > ownership
> > > of a
> > > > > > > > > > port
> > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > >
> > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > I am talking about dev->name.
> > > > > > > > >
> > > > > > > > So? The user still should take ownership of a device
> > > > > > > > before using it
> > > (by
> > > > > > > name or by port id).
> > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > >
> > > > > > > > > > Please, Can you describe specific crash scenario and
> > > > > > > > > > explain how could the
> > > > > > > > > locking fix it?
> > > > > > > > >
> > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1 doing
> > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> >strcmp().
> > > > > > > > > And because of race condition - rte_eth_dev_allocated()
> > > > > > > > > will
> > > return
> > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > Which wrong device do you mean? I guess it is the device
> > > > > > > > which
> > > > > currently is
> > > > > > > being created by thread 0.
> > > > > > > > > Then rte_pmd_ring_remove() will call rte_free() for
> > > > > > > > > related resources, while It can still be in use by someone else.
> > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must take
> > > > > ownership
> > > > > > > > (or validate that he is the owner) of a port before doing
> > > > > > > > it(free,
> > > > > release), so
> > > > > > > no issue here.
> > > > > > >
> > > > > > > Forget about ownership for a second.
> > > > > > > Suppose we have a process it created ring port for itself
> > > > > > > (without
> > > setting
> > > > > any
> > > > > > > ownership)  and used it for some time.
> > > > > > > Then it decided to remove it, so it calls rte_pmd_ring_remove()
> for it.
> > > > > > > At the same time second process decides to call
> > > rte_eth_dev_allocate()
> > > > > (let
> > > > > > > say for anither ring port).
> > > > > > > They could collide trying to read (process 0) and modify
> > > > > > > (process 1)
> > > same
> > > > > > > string rte_eth_dev_data[].name.
> > > > > > >
> > > > > > Do you mean that process 0 will compare successfully the
> > > > > > process 1
> > > new
> > > > > port name?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > The state are in local process memory - so process 0 will not
> > > > > > compare
> > > the
> > > > > process 1 port, from its point of view this port is in UNUSED
> > > > > > state.
> > > > > >
> > > > >
> > > > > Ok, and why it can't be in attached state in process 0 too?
> > > >
> > > > Someone in process 0 should attach it using protected
> > > > attach_secondary
> > > somewhere in your scenario.
> > >
> > > Yes, process 0 can have this port attached too, why not?
> > See the function with inline comments:
> >
> > struct rte_eth_dev *
> > rte_eth_dev_allocated(const char *name) {
> > 	unsigned i;
> >
> > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> >
> > 		/*
> > 		 * The state below is in local process memory.
> > 		 * So, if process 1 allocates a new port here (the current i),
> > 		 * updates its local state to ATTACHED and writes the name,
> > 		 * the state is not visible to process 0 until someone in
> > 		 * process 0 attaches it by rte_eth_dev_attach_secondary.
> > 		 * To use rte_eth_dev_attach_secondary, process 0 must take
> > 		 * the lock, and it can't, because it is currently held by
> > 		 * process 1.
> > 		 */
> 
> Ok I see.
> Thanks for your patience.
> BTW, that means that if, let's say, process 0 calls
> rte_eth_dev_allocate("xxx") and process 1 calls rte_eth_dev_allocate("yyy"),
> we can end up with the same port_id being used for different devices and
> 2 processes overwriting the same rte_eth_dev_data[port_id]?

No, contrary to the state, the lock itself is in shared memory, so 2 processes cannot allocate a port at the same time (you can see it in the next patch of this series).
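
A simplified sketch of that split, under assumed names (struct shm_region, struct local_view and the shm_* helpers are hypothetical, and a C11 atomic_flag stands in for the shared spinlock): the lock and the port names live in the region that all processes map, while the attach state is kept per process.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SHM_MAX_PORTS 4

/* Would live in memory mapped by every process (hypothetical layout). */
struct shm_region {
	atomic_flag lock;
	char names[SHM_MAX_PORTS][32];  /* empty string means a free slot */
};

/* Per-process view: every process has its own private copy. */
enum local_state { LOCAL_UNUSED = 0, LOCAL_ATTACHED };

struct local_view {
	enum local_state state[SHM_MAX_PORTS];
};

/* Done once by whichever process creates the shared region. */
static void shm_region_init(struct shm_region *shm)
{
	memset(shm->names, 0, sizeof(shm->names));
	atomic_flag_clear(&shm->lock);
}

/*
 * Allocate a port: the slot search and the name write happen under the
 * shared lock, so two processes cannot pick the same port id. Only the
 * calling process marks the port as attached in its local view.
 */
static int shm_port_allocate(struct shm_region *shm,
			     struct local_view *self, const char *name)
{
	int i, ret = -ENOSPC;

	while (atomic_flag_test_and_set_explicit(&shm->lock,
						 memory_order_acquire))
		; /* spin */
	for (i = 0; i < SHM_MAX_PORTS; i++) {
		if (shm->names[i][0] == '\0') {
			snprintf(shm->names[i], sizeof(shm->names[i]),
				 "%s", name);
			self->state[i] = LOCAL_ATTACHED;
			ret = i;
			break;
		}
	}
	atomic_flag_clear_explicit(&shm->lock, memory_order_release);
	return ret;
}
```

With this layout, allocating "xxx" from one process and "yyy" from another yields two different port ids, and each process sees only its own port as attached until it explicitly attaches the other one.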

> Konstantin
> 
> >
> > 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED) &&
> > 		    strcmp(rte_eth_devices[i].data->name, name) == 0)
> > 			return &rte_eth_devices[i];
> > 	}
> > 	return NULL;
> >
> >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 16:52                                     ` Ananyev, Konstantin
  2018-01-17 18:02                                       ` Matan Azrad
@ 2018-01-17 20:34                                       ` Matan Azrad
  2018-01-18 14:17                                         ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-17 20:34 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



From: Matan Azrad, Wednesday, January 17, 2018 8:02 PM
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Wednesday, January 17, 2018 6:53 PM
> > Hi Matan,
> >
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Wednesday, January 17, 2018 2:55 PM
> > > > >
> > > > >
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018
> > > > > 1:24 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > > Hi Konstantin
> > > > > > >
> > > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > > > Hi Matan,
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > > 1:45 PM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 2:02 AM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January
> > > > > > > > > > > > > > > 11,
> > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[]
> > > > > > > > > > > > > > > > > > is lock protected, but it might be not
> > > > > > > > > > > > > > > > > > very plausible to protect both data[]
> > > > > > > > > > > > > > > > > > and next_owner_id using the
> > > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I guess you mean to the owner structure
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > > You may create new owner_id but it doesn't
> > > > > > > > > > > > > > > > mean you would update rte_eth_dev_data[]
> > immediately.
> > > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > > update rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > > > It is not very good coding practice to use
> > > > > > > > > > > > > > > > same lock for non-related data structures.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > > Since the ownership mechanism
> > > > > > > > > > > > > > > synchronization is in ethdev responsibility,
> > > > > > > > > > > > > > > we must protect against user mistakes as
> > > > > > > > > > > > > > > much as we can by
> > > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > The set ownership API is protected by ownership
> > > > > > > > > > > > > lock and checks the owner ID validity By reading
> > > > > > > > > > > > > the next owner
> > > > ID.
> > > > > > > > > > > > > So, the owner ID allocation and set API should
> > > > > > > > > > > > > use the same atomic
> > > > > > > > > > > > mechanism.
> > > > > > > > > > > >
> > > > > > > > > > > > Sure but all you are doing for checking validity,
> > > > > > > > > > > > is check that owner_id > 0 && owner_id <
> > > > > > > > > > > > next_owner_id, right?
> > > > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits)
> > > > > > > > > > > > you can safely do same check with just
> > > > > > > > > > > > atomic_get(&next_owner_id).
> > > > > > > > > > > >
> > > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > > - current next_id is X.
> > > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > > >   thread 0 (by user mistake).
> > > > > > > > > > > - context switch
> > > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > > >   next_id to X+1 atomically.
> > > > > > > > > > > - context switch
> > > > > > > > > > > - Thread 0 validates X by atomic_read and succeeds in
> > > > > > > > > > >   taking ownership.
> > > > > > > > > > > - The system lost the port (or it will be managed by
> > > > > > > > > > >   two entities) - crash.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > > >
> > > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > > because the owner
> > > > > > > > validation is included in the protected section.
> > > > > > > >
> > > > > > > > Then your validation function would fail even if you'll
> > > > > > > > use atomic ops instead of lock.
> > > > > > > No.
> > > > > > > With atomic this specific scenario will cause the validation to pass.
> > > > > >
> > > > > > Can you explain to me how?
> > > > > >
> > > > > > int
> > > > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > 	int32_t cur_owner_id =
> > > > > > 		RTE_MIN(rte_atomic32_read(&next_owner_id), UINT16_MAX);
> > > > > >
> > > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > > > 	    owner_id > cur_owner_id) {
> > > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > > 		return 0;
> > > > > > 	}
> > > > > > 	return 1;
> > > > > > }
> > > > > >
> > > > > > Let's say your next_owner_id==X, and you invoke
> > > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > > >
> > > > > Explanation:
> > > > > The scenario with locks:
> > > > > next_owner_id = X.
> > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > >
> > > > Ok I see what you mean.
> > > > But, as I said before, if thread 0 will grab the lock first -
> > > > you'll experience the same failure.
> > > > I understand now that for some reason you treat these two scenarios
> > > > as something different, but for me it is pretty much the same case.
> > > > And to me it means that neither a lock nor an atomic can fully
> > > > protect you here.
> > > >
> > >
> > > I agree that we are not fully protected even when using locks, but
> > > one lock is more protective than either atomics or 2 different locks.
> > > So, I think keeping it as is (with one lock) makes sense.
> >
> > Ok, if that is your preference - let's keep your current approach here.
> >
> > >
> > > > > Context switch.
> > > > > Thread 1 call to owner_new and stuck in the lock.
> > > > > Context switch.
> > > > > Thread 0 does owner id validation and fails (Y>=X) - unlocks the
> > > > > lock and returns failure to the user.
> > > > > Context switch.
> > > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > > Everything is OK!
> > > > >
> > > > > The same scenario with atomics:
> > > > > next_owner_id = X.
> > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > Context switch.
> > > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > > Context switch.
> > > > > Thread 0 does owner id validation and succeeds (Y<(atomic)X+1) -
> > > > > unlocks the lock and returns success to the user.
> > > > > Problem!
> > > > >
> > > > > > > With lock no next_id changes can be done while the thread is
> > > > > > > in the set API.
> > > > > > >
> > > > > > > > But in fact your code is not protected for that scenario -
> > > > > > > > it doesn't matter whether you use lock or atomic ops.
> > > > > > > > Let's consider your current code with the following scenario:
> > > > > > > >
> > > > > > > > next_owner_id  == 1
> > > > > > > > 1) Process 0:
> > > > > > > >      rte_eth_dev_owner_new(&owner_id);
> > > > > > > >      now owner_id == 1 and next_owner_id == 2
> > > > > > > > 2) Process 1 (by mistake):
> > > > > > > >      rte_eth_dev_owner_set(port_id=1, owner->id=1);
> > > > > > > >      It will complete successfully, as owner_id == 1 is
> > > > > > > >      considered as valid.
> > > > > > > > 3) Process 0:
> > > > > > > >      rte_eth_dev_owner_set(port_id=1, owner->id=1);
> > > > > > > >      It will also complete with success, as owner->id is valid
> > > > > > > >      and is equal to current port owner_id.
> > > > > > > > So you finished with 2 processes assuming that they do own
> > > > > > > > exclusively the same port.
> > > > > > > >
> > > > > > > > Honestly in that situation locking around next_owner_id
> > > > > > > > wouldn't give you any advantages over atomic ops.
> > > > > > > >
> > > > > > >
> > > > > > > This is a different scenario that we can't protect against
> > > > > > > with atomics or locks.
> > > > > > > But for the first scenario I described I think we can.
> > > > > > > Please read it again, I described it step by step.
> > > > > > >
> > > > > > > > >
> > > > > > > > > > I don't think you can protect yourself against such
> > > > > > > > > > scenario with or without locking.
> > > > > > > > > > Unless you'll make it harder for the mis-behaving
> > > > > > > > > > thread to guess valid owner_id, or add some extra logic
> here.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > > The set(and others) ownership APIs already uses
> > > > > > > > > > > > > the ownership lock so I
> > > > > > > > > > > > think it makes sense to use the same lock also in
> > > > > > > > > > > > ID
> > allocation.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > In fact, for next_owner_id, you don't
> > > > > > > > > > > > > > > > > > need a lock - just rte_atomic_t should
> > > > > > > > > > > > > > > > > > be
> > enough.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I don't think so, it is problematic in
> > > > > > > > > > > > > > > > > next_owner_id wraparound and may
> > > > > > > > > > > > > > > > complicate the code in other places which read it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > IMO it is not that complicated, something
> > > > > > > > > > > > > > > > like that should work I
> > > > > > > > > > think.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > /* init to 0 at startup */
> > > > > > > > > > > > > > > > rte_atomic32_t owner_id;
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > int new_owner_id(void) {
> > > > > > > > > > > > > > > >     int32_t x;
> > > > > > > > > > > > > > > >     x = rte_atomic32_add_return(&owner_id, 1);
> > > > > > > > > > > > > > > >     if (x > UINT16_MAX) {
> > > > > > > > > > > > > > > >        rte_atomic32_dec(&owner_id);
> > > > > > > > > > > > > > > >        return -EOVERFLOW;
> > > > > > > > > > > > > > > >     } else
> > > > > > > > > > > > > > > >         return x;
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Why not just to keep it simple and using
> > > > > > > > > > > > > > > > > the same
> > > > lock?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Lock is also fine, I just think it better
> > > > > > > > > > > > > > > > be a separate one
> > > > > > > > > > > > > > > > - that would protext just next_owner_id.
> > > > > > > > > > > > > > > > Though if you are going to use uuid here -
> > > > > > > > > > > > > > > > all that probably not relevant any more.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I agree about the uuid but still think the
> > > > > > > > > > > > > > > same lock should be used for
> > > > > > > > > > > > both.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But with uuid you don't need next_owner_id at
> > > > > > > > > > > > > > all,
> > right?
> > > > > > > > > > > > > > So lock will only be used for
> > > > > > > > > > > > > > rte_eth_dev_data[] fields
> > > > > > anyway.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Sorry, I meant uint64_t, not uuid.
> > > > > > > > > > > >
> > > > > > > > > > > > Ah ok, my thought uuid_t is better as with it you
> > > > > > > > > > > > don't need to support your own code to allocate
> > > > > > > > > > > > new owner_id, but rely on system libs
> > > > > > > > > > instead.
> > > > > > > > > > > > But wouldn't insist here.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another alternative would be to use 2
> > > > > > > > > > > > > > > > > > locks - one for next_owner_id second
> > > > > > > > > > > > > > > > > > for actual data[]
> > > > protection.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another thing - you'll probably need
> > > > > > > > > > > > > > > > > > to
> > > > grab/release
> > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > It is a public function used by
> > > > > > > > > > > > > > > > > > drivers, so need to be protected
> > > > > > > > > > too.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Yes, I thought about it, but decided not
> > > > > > > > > > > > > > > > > to use lock in
> > > > > > next:
> > > > > > > > > > > > > > > > > rte_eth_dev_allocated rte_eth_dev_count
> > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As I can see in patch #3 you protect by
> > > > > > > > > > > > > > > > lock access to rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > (which seems like a good
> > > > > > thing).
> > > > > > > > > > > > > > > > So I think any other public function that
> > > > > > > > > > > > > > > > access rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > protected by the
> > > > > > same
> > > > > > > > lock.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I don't think so, I can understand to use
> > > > > > > > > > > > > > > the ownership lock here(as in port
> > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > Don't you think it is just timing?(ask in
> > > > > > > > > > > > > > > the next moment and you may get another
> > > > > > > > > > > > > > > answer) I don't see optional
> > > > crash.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> > > > identifies
> > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > allocation/release/find
> > > > functions.
> > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > > To me it means that ethdev layer has to make
> > > > > > > > > > > > > > sure that all accesses to
> > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > atomic.
> > > > > > > > > > > > > > Otherwise what would prevent the situation
> > > > > > > > > > > > > > when one
> > > > > > process
> > > > > > > > > > > > > > does
> > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > >
> > > > > > > > > > > > Under race condition - in the worst case it might
> > > > > > > > > > > > crash, though for that you'll have to be really unlucky.
> > > > > > > > > > > > Though in most cases as you said it would just not
> > > > > > > > > > > > operate
> > > > > > correctly.
> > > > > > > > > > > > I think if we start to protect dev->name by lock
> > > > > > > > > > > > we need to do it for all instances (both read and write).
> > > > > > > > > > > >
> > > > > > > > > > > Since under the ownership rules, the user must take
> > > > > > > > > > > ownership
> > > > of a
> > > > > > > > > > > port
> > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > >
> > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > I am talking about dev->name.
> > > > > > > > > >
> > > > > > > > > So? The user still should take ownership of a device
> > > > > > > > > before using it
> > > > (by
> > > > > > > > name or by port id).
> > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > >
> > > > > > > > > > > Please, Can you describe specific crash scenario and
> > > > > > > > > > > explain how could the
> > > > > > > > > > locking fix it?
> > > > > > > > > >
> > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1
> > > > > > > > > > >doing
> > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> > >strcmp().
> > > > > > > > > > And because of race condition -
> > > > > > > > > > rte_eth_dev_allocated() will
> > > > return
> > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > Which wrong device do you mean? I guess it is the device
> > > > > > > > > which
> > > > > > currently is
> > > > > > > > being created by thread 0.
> > > > > > > > > > Then rte_pmd_ring_remove() will call rte_free() for
> > > > > > > > > > related resources, while It can still be in use by someone
> else.
> > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must
> > > > > > > > > take
> > > > > > ownership
> > > > > > > > > (or validate that he is the owner) of a port before
> > > > > > > > > doing it(free,
> > > > > > release), so
> > > > > > > > no issue here.
> > > > > > > >
> > > > > > > > Forget about ownership for a second.
> > > > > > > > Suppose we have a process that created a ring port for itself
> > > > > > > > (without setting any ownership) and used it for some time.
> > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > rte_pmd_ring_remove() for it.
> > > > > > > > At the same time a second process decides to call
> > > > > > > > rte_eth_dev_allocate() (let's say for another ring port).
> > > > > > > > They could collide trying to read (process 0) and modify
> > > > > > > > (process 1) the same string rte_eth_dev_data[].name.
> > > > > > > >
> > > > > > > Do you mean that process 0 will compare successfully the
> > > > > > > process 1 new port name?
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > The states are in local process memory - so process 0 will
> > > > > > > not compare the process 1 port; from its point of view this
> > > > > > > port is in UNUSED state.
> > > > > > >
> > > > > >
> > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > >
> > > > > Someone in process 0 should attach it using the protected
> > > > > attach_secondary somewhere in your scenario.
> > > >
> > > > Yes, process 0 can have this port attached too, why not?
> > > See the function with inline comments:
> > >
> > > struct rte_eth_dev *
> > > rte_eth_dev_allocated(const char *name) {
> > > 	unsigned i;
> > >
> > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > >
> > > 		/*
> > > 		 * The state below is in local process memory.
> > > 		 * So, if process 1 allocates a new port here (the current i),
> > > 		 * updates its local state to ATTACHED and writes the name,
> > > 		 * the state is not visible to process 0 until someone in
> > > 		 * process 0 attaches it by rte_eth_dev_attach_secondary.
> > > 		 * To use rte_eth_dev_attach_secondary, process 0 must take
> > > 		 * the lock, and it can't, because it is currently locked
> > > 		 * by process 1.
> > > 		 */
> >
> > Ok I see.
> > Thanks for your patience.
> > BTW, that means that if, let's say, process 0 calls
> > rte_eth_dev_allocate("xxx") and process 1 calls
> > rte_eth_dev_allocate("yyy") we can end up with the same port_id being
> > used for different devices and 2 processes will overwrite the same
> > rte_eth_dev_data[port_id]?
> 
> No, contrary to the state, the lock itself is in shared memory, so 2 processes
> cannot allocate a port at the same time (you can see it in the next patch of
> this series).
>

Actually, I think only one process (the primary) should allocate ports; the others should attach them.
The race on port allocation is only between the threads of the primary process.

 
> > Konstantin
> >
> > >
> > > 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED)
> > &&
> > > 		strcmp(rte_eth_devices[i].data->name, name) == 0)
> > > 			return &rte_eth_devices[i];
> > > 	}
> > > 	return NULL;
> > >
> > >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership
  2018-01-17  8:51           ` Matan Azrad
@ 2018-01-18  0:53             ` Lu, Wenzhuo
  0 siblings, 0 replies; 214+ messages in thread
From: Lu, Wenzhuo @ 2018-01-18  0:53 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Wednesday, January 17, 2018 4:51 PM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port
> ownership
> 
> Hi Lu
> 
> From: Lu, Wenzhuo, Wednesday, January 17, 2018 2:47 AM
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Tuesday, January 16, 2018 4:16 PM
> > > To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> > > Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev
> > > port ownership
> > >
> > > Hi Lu
> > > From: Lu, Wenzhuo, Tuesday, January 16, 2018 7:54 AM
> > > > Hi Matan,
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > > > Sent: Sunday, January 7, 2018 5:46 PM
> > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > Richardson,
> > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > <konstantin.ananyev@intel.com>
> > > > > Subject: [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev
> > > > > port ownership
> > > > >
> > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > DPDK entities.
> > > > >
> > > > > Set Testpmd ownership to each port which is not used by other
> > > > > entity and prevent any usage of ethdev ports which are not owned
> > > > > by
> > Testpmd.
> > > > Sorry I don't follow all the discussion as there's too much. So it
> > > > may be a silly question.
> > >
> > > No problem, I'm here for any question :)
> > >
> > > > Testpmd already has the parameter " --pci-whitelist" to only use
> > > > the assigned devices.
> > >
> > > It is an EAL parameter. No? just say to EAL which devices to create..
> > >
> > > > When using this parameter, all the devices are owned by the
> > > > current APP.
> > >
> > > No, what's about vdev? vdevs may manage devices(even whitlist PCI
> > > devices) by themselves and want to prevent any app to use these
> > > devices(see fail- safe PMD).
> > I'm not an expert of EAL and vdev. Suppose this would be discussed in
> > other patches.
> > I don't want to bother you again here as testpmd is only used to show
> > the result.
> > So I think if this patch is needed just depends on if other patches
> > are accepted :)
> >
> Sure!
> 
> > >
> > >  > So I don't know why need to set/check the ownership.
> > > > BTW, in this patch, seem all the change is for ownership checking.
> > > > I don't find the setting code. Do I miss something?
> > >
> > > Yes, see in main function (the first FOREACH).
> > I think you mean this change,
> >
> > @@ -2394,7 +2406,12 @@  uint8_t port_is_bonding_slave(portid_t
> > slave_pid)
> >  	rte_pdump_init(NULL);
> >  #endif
> >
> > -	nb_ports = (portid_t) rte_eth_dev_count();
> > +	if (rte_eth_dev_owner_new(&my_owner.id))
> > +		rte_panic("Failed to get unique owner identifier\n");
> > +	snprintf(my_owner.name, sizeof(my_owner.name),
> > TESTPMD_OWNER_NAME);
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id,
> > RTE_ETH_DEV_NO_OWNER)
> > +		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
> > +			nb_ports++;
> >  	if (nb_ports == 0)
> >  		RTE_LOG(WARNING, EAL, "No probed ethernet devices\n");
> 
> Yes.
> 
> > But I thought about some code to assign a specific device to a
> > specific APP explicitly.
> > This code looks like just occupying the devices with no owner. So, it
> > means the first APP will occupy all the devices? It makes me confused
> > as I don't see the benefit or the difference than before.
> 
> Remember that in EAL init some drivers may create ports and take
> ownership of them, so this code owns all the ownerless ports.
> So, ports which are already owned by another DPDK entity will not be
> taken here.
> Since Testpmd wanted to manage all the ports before this feature (by
> mistake), I guess this is the right behavior now: just use the ports
> which are not used.
Thanks for the explanation. Sounds good to me :)
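
To check my understanding, the "take every ownerless port" behavior can be modelled roughly as below. This is only a single-threaded sketch under my own assumptions, not the testpmd code: `claim_unowned`, `port_owner` and `NO_OWNER` are hypothetical names standing in for the RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, RTE_ETH_DEV_NO_OWNER) loop and rte_eth_dev_owner_set(), and there is no locking here.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_OWNER 0	/* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */

/*
 * Claim every port whose owner slot is still NO_OWNER and return how
 * many ports were taken. Ports already owned by another entity (for
 * example a fail-safe sub-device) are left untouched.
 */
static size_t
claim_unowned(uint16_t *port_owner, size_t nb_ports, uint16_t my_id)
{
	size_t taken = 0;
	size_t i;

	for (i = 0; i < nb_ports; i++) {
		if (port_owner[i] == NO_OWNER) {
			port_owner[i] = my_id;
			taken++;
		}
	}
	return taken;
}
```

So the count returned here plays the role of nb_ports in the hunk above: only the ports nobody claimed during EAL init end up owned by the application.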

> 
> >
> > >
> > > Thanks, Matan.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 17:01                                   ` Ananyev, Konstantin
@ 2018-01-18 13:10                                     ` Neil Horman
  2018-01-18 14:00                                       ` Matan Azrad
  2018-01-18 16:27                                     ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-18 13:10 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing, dev,
	Richardson, Bruce

On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Wednesday, January 17, 2018 2:00 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > 
> > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > > Hi Matan,
> > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > > > > AM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > >  <snip>
> > > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > same lock.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > > > > not directly
> > > > > > > > > > > > related.
> > > > > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > > > > for non-related data structures.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > > > > mistakes as much as we can by
> > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > > > > ID which currently is
> > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > > >
> > > > > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > > same atomic
> > > > > > > > > > mechanism.
> > > > > > > > > >
> > > > > > > > > > Sure but all you are doing for checking validity, is  check
> > > > > > > > > > that owner_id > 0 && owner_id < next_owner_id, right?
> > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > > > > > >
> > > > > > > > > It will not protect it, scenario:
> > > > > > > > > - current next_id is X.
> > > > > > > > > - call set ownership of port A with owner id X by thread 0 (by
> > > > > > > > >   user mistake).
> > > > > > > > > - context switch
> > > > > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > > > >   X+1 atomically.
> > > > > > > > > - context switch
> > > > > > > > > - Thread 0 validates X by atomic_read and succeeds to take
> > > > > > > > >   ownership.
> > > > > > > > > - The system lost the port (or it will be managed by two
> > > > > > > > >   entities) - crash.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > >
> > > > > > > The owner set API validation by thread 0 should fail because the
> > > > > > > owner
> > > > > > validation is included in the protected section.
> > > > > >
> > > > > > Then your validation function would fail even if you'll use atomic
> > > > > > ops instead of lock.
> > > > > No.
> > > > > With atomic this specific scenario will cause the validation to pass.
> > > >
> > > > Can you explain to me how?
> > > >
> > > > int
> > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > 	int32_t cur_owner_id =
> > > > 		RTE_MIN(rte_atomic32_read(&next_owner_id), UINT16_MAX);
> > > >
> > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> > > > 	    owner_id > cur_owner_id) {
> > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > 		return 0;
> > > > 	}
> > > > 	return 1;
> > > > }
> > > >
> > > > Let's say your next_owner_id==X, and you invoke
> > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > >
> > > Explanation:
> > > The scenario with locks:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and stuck in the lock.
> > > Context switch.
> > > Thread 0 does owner id validation and fails (Y>=X) - unlocks the lock and returns failure to the user.
> > > Context switch.
> > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > Everything is OK!
> > >
> > > The same scenario with atomics:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > Context switch.
> > > Thread 0 does owner id validation and succeeds (Y<(atomic)X+1) - unlocks the lock and returns success to the user.
> > > Problem!
> > >
> > 
> > 
> > Matan is correct here, there is no way to perform parallel set operations using
> > just an atomic variable here, because multiple reads of next_owner_id need to
> > be performed while it is stable.  That is to say rte_eth_next_owner_id must be
> > compared to RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.  If
> > you were to only use an atomic_read on such a variable, it could be incremented
> > by the owner_new function between the checks and an invalid owner value could
> > become valid because a third thread incremented the next value.  The state of
> > next_owner_id must be kept stable during any validity checks.
> 
> It could still be incremented between the checks - if, let's say, a different thread
> invokes new_owner_id, grabs the lock, updates the counter, releases the lock - all that
> before the check.
I don't see how: all of the contents of rte_eth_dev_owner_set are protected under
rte_eth_dev_ownership_lock, as is rte_eth_dev_owner_new.  Next_owner might
increment between another thread's calls to owner_new and owner_set, but that
will just cause a transition from an ownership id being valid to invalid, and
that's ok, as long as there is consistency in the model that enforces a single
valid owner at a time (in that case the subsequent caller to owner_new).
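
To make the single-lock model concrete, here is a minimal sketch. It is not the patch itself: it uses a C11 atomic_flag spinlock in place of the ethdev ownership spinlock, and `owner_new`, `owner_set`, `port_owner` and `MAX_PORTS` are hypothetical names. The rule that a port may be re-set by its current owner is my assumption, taken from the scenario discussed earlier in the thread.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER  0	/* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */
#define MAX_PORTS 4	/* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* One lock guards both the id counter and the per-port owner table. */
static atomic_flag ownership_lock = ATOMIC_FLAG_INIT;
static uint16_t next_owner_id = 1;	/* ids below this are allocated */
static uint16_t port_owner[MAX_PORTS];	/* NO_OWNER means unowned */

static void
ownership_lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&ownership_lock,
						 memory_order_acquire))
		; /* spin until the holder releases the flag */
}

static void
ownership_lock_release(void)
{
	atomic_flag_clear_explicit(&ownership_lock, memory_order_release);
}

/* Allocate a fresh owner id; 0 on success, -1 when ids are exhausted. */
static int
owner_new(uint16_t *id)
{
	int ret = -1;

	ownership_lock_take();
	if (next_owner_id != UINT16_MAX) {
		*id = next_owner_id++;
		ret = 0;
	}
	ownership_lock_release();
	return ret;
}

/* Validate the id and take the port inside one critical section. */
static int
owner_set(unsigned int port, uint16_t id)
{
	int ret = -1;

	if (port >= MAX_PORTS)
		return -1;
	ownership_lock_take();
	/* next_owner_id cannot move between this check and the store. */
	if (id != NO_OWNER && id < next_owner_id &&
	    (port_owner[port] == NO_OWNER || port_owner[port] == id)) {
		port_owner[port] = id;
		ret = 0;
	}
	ownership_lock_release();
	return ret;
}
```

Because owner_new cannot run while owner_set holds the lock, an id that was unallocated when owner_set started is still unallocated when it is checked. It does nothing for a caller that passes someone else's already-allocated id, which is the case neither locks nor atomics cover.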

Though this confusion does underscore my assertion, I think, that this API is
overly complicated.

Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 17:58                                   ` Matan Azrad
@ 2018-01-18 13:20                                     ` Neil Horman
  2018-01-18 14:52                                       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-18 13:20 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Wed, Jan 17, 2018 at 05:58:07PM +0000, Matan Azrad wrote:
> 
> Hi Neil
> 
>  From: Neil Horman, Wednesday, January 17, 2018 4:00 PM
> > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > > Hi Matan,
> > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018
> > > > > > > > > > > 2:02 AM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January
> > > > > > > > > > > > > > > 10,
> > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > >  <snip>
> > > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > same lock.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[]
> > > > > > > > > > > > > > are not directly
> > > > > > > > > > > > related.
> > > > > > > > > > > > > > You may create new owner_id but it doesn't mean
> > > > > > > > > > > > > > you would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > It is not very good coding practice to use same
> > > > > > > > > > > > > > lock for non-related data structures.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > Since the ownership mechanism synchronization is
> > > > > > > > > > > > > in ethdev responsibility, we must protect against
> > > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > So, if user try to set by invalid owner (exactly
> > > > > > > > > > > > > the ID which currently is
> > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > > >
> > > > > > > > > > > The set ownership API is protected by ownership lock
> > > > > > > > > > > and checks the owner ID validity By reading the next owner
> > ID.
> > > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > > same atomic
> > > > > > > > > > mechanism.
> > > > > > > > > >
> > > > > > > > > > Sure but all you are doing for checking validity, is
> > > > > > > > > > check that owner_id > 0 &&& owner_id < next_ownwe_id,
> > right?
> > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you
> > > > > > > > > > can safely do same check with just
> > atomic_get(&next_owner_id).
> > > > > > > > > >
> > > > > > > > > It will not protect it, scenario:
> > > > > > > > > - current next_id is X.
> > > > > > > > > - call set ownership of port A with owner id X by thread
> > > > > > > > > 0(by user
> > > > > > mistake).
> > > > > > > > > - context switch
> > > > > > > > > - allocate new id by thread 1 and get X and change next_id
> > > > > > > > > to
> > > > > > > > > X+1
> > > > > > > > atomically.
> > > > > > > > > -  context switch
> > > > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > > > ownership.
> > > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > > entities) -
> > > > > > crash.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > >
> > > > > > > The owner set API validation by thread 0 should fail because
> > > > > > > the owner
> > > > > > validation is included in the protected section.
> > > > > >
> > > > > > Then your validation function would fail even if you'll use
> > > > > > atomic ops instead of lock.
> > > > > No.
> > > > > With atomic this specific scenario will cause the validation to pass.
> > > >
> > > > Can you explain to me how?
> > > >
> > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > >               int32_t cur_owner_id =
> > > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > UINT16_MAX);
> > > >
> > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > cur_owner_id) {
> > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > 		return 0;
> > > > 	}
> > > > 	return 1;
> > > > }
> > > >
> > > > Let say your next_owne_id==X, and you invoke
> > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > >
> > > Explanation:
> > > The scenario with locks:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and stuck in the lock.
> > > Context switch.
> > > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and
> > return failure to the user.
> > > Context switch.
> > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > Everything is OK!
> > >
> > > The same scenario with atomics:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > Context switch.
> > > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the
> > lock and return success to the  user.
> > > Problem!
> > >
> > 
> > 
> > Matan is correct here, there is no way to preform parallel set operations
> > using just and atomic variable here, because multiple reads of
> > next_owner_id need to be preformed while it is stable.  That is to say
> > rte_eth_next_owner_id must be compared to RTE_ETH_DEV_NO_OWNER
> > and owner_id in rte_eth_is_valid_owner_id.  If you were to only use an
> > atomic_read on such a variable, it could be incremented by the owner_new
> > function between the checks and an invalid owner value could become valid
> > because  a third thread incremented the next value.  The state of
> > next_owner_id must be kept stable during any validity checks
> > 
> > That said, I really have to wonder why ownership ids are really needed here
> > at all.  It seems this design could be much simpler with the addition of a per-
> > port lock (and optional ownership record).  The API could consist of three
> > operations:
> > 
> > ownership_set
> > ownership_tryset
> > ownership_release
> > ownership_get
> > 
> > 
> > The first call simply tries to take the per-port lock (blocking if its already
> > locked)
> > 
> 
> Per port lock is not good because the ownership mechanism must to be synchronized with the port creation\release.
> So the port creation and port ownership should use the same lock.
> 
In what way do you need to synchronize with port creation?  If a port has not
yet been created, then by definition the owner must be the thread calling the
create function.  If you are concerned about the mechanics of the port data
structure (i.e. the fact that rte_eth_devices is statically allocated), you can
add a lock structure to the rte_eth_dev struct and initialize it statically with
RTE_SPINLOCK_INITIALIZER.


> I didn't find precedence for blocking function in ethdev.
> 
Then perhaps we don't need that api call.  Perhaps ownership_tryset is enough.

> > The second call is a non-blocking version of the first
> > 
> > The third unlocks the port, allowing others to take ownership
> > 
> > The fourth returns whatever ownership record you want to encode with the
> > lock.
> > 
> > The addition of all this id checking seems a bit overcomplicated
> 
> You miss the identification of the owner - we want to allow info of the owner for printing and easy debug.
> And it is makes sense to manage the owner uniqueness by unique ID.
> 
I specifically pointed that out above.  There is no reason an ownership record
couldn't be added to the rte_eth_dev structure.

> The API already discussed a lot in the previous version, Do you really want, now, to open it again?  
>  
What I want is the most useful and elegant ownership API available.  If you
think what you have is that, so be it.  I only bring this up because the amount
of debate you and Konstantin have had over lock safety causes me to wonder if
this isn't an overly complex design.
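
For what it's worth, the tryset/release/get trio could be as small as the
sketch below.  This is an illustration only, not a proposal for the final
code: it uses a C11 compare-and-swap on the ownership record itself rather
than a separate spinlock (which amounts to a trylock), and struct port,
ports[], MAX_PORTS and the integer "who" token are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER  0	/* record value meaning "nobody owns this port" */
#define MAX_PORTS 4	/* hypothetical, small table for illustration */

/* Hypothetical per-port ownership record; zero-initialized means unowned. */
struct port {
	_Atomic uint32_t owner;
};
static struct port ports[MAX_PORTS];

/* Non-blocking claim; returns 0 on success, -1 if owned or invalid. */
static int ownership_tryset(unsigned int port, uint32_t who)
{
	uint32_t expected = NO_OWNER;

	if (port >= MAX_PORTS || who == NO_OWNER)
		return -1;
	/* Succeeds only if the record still says NO_OWNER. */
	return atomic_compare_exchange_strong(&ports[port].owner,
					      &expected, who) ? 0 : -1;
}

/* Release; only the current owner can clear the record. */
static int ownership_release(unsigned int port, uint32_t who)
{
	uint32_t expected = who;

	if (port >= MAX_PORTS || who == NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&ports[port].owner,
					      &expected, NO_OWNER) ? 0 : -1;
}

/* Read the current record (NO_OWNER for a free or invalid port). */
static uint32_t ownership_get(unsigned int port)
{
	return port < MAX_PORTS ? atomic_load(&ports[port].owner) : NO_OWNER;
}

/* Usage: a second claimant is refused until the first one releases. */
static void ownership_demo(void)
{
	assert(ownership_tryset(0, 7) == 0);	/* port 0 was free */
	assert(ownership_tryset(0, 9) == -1);	/* already owned by 7 */
	assert(ownership_get(0) == 7);
	assert(ownership_release(0, 9) == -1);	/* not the owner */
	assert(ownership_release(0, 7) == 0);
}
```

A blocking ownership_set would just be a loop around tryset, if we decide we
want one at all.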

Neil


> > Neil
> 
> 


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 13:10                                     ` Neil Horman
@ 2018-01-18 14:00                                       ` Matan Azrad
  2018-01-18 16:54                                         ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 14:00 UTC (permalink / raw)
  To: Neil Horman, Ananyev, Konstantin
  Cc: Thomas Monjalon, Gaetan Rivet, Wu, Jingjing, dev, Richardson, Bruce

Hi Neil

From: Neil Horman, Thursday, January 18, 2018 3:10 PM
> On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > Sent: Wednesday, January 17, 2018 2:00 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> > > dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > >
> > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > > >
> > > > Hi Konstantin
> > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24
> > > > PM
> > > > > Hi Matan,
> > > > >
> > > > > > Hi Konstantin
> > > > > >
> > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > > Hi Matan,
> > > > > > >
> > > > > > > >
> > > > > > > > Hi Konstantin
> > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > 1:45 PM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > 2018 2:02 AM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January
> > > > > > > > > > > > > > 11, 2018
> > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[] is
> > > > > > > > > > > > > > > > > lock protected, but it might be not very
> > > > > > > > > > > > > > > > > plausible to protect both data[] and
> > > > > > > > > > > > > > > > > next_owner_id using the
> > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > You may create new owner_id but it doesn't
> > > > > > > > > > > > > > > mean you would update rte_eth_dev_data[]
> immediately.
> > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > update rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > > It is not very good coding practice to use
> > > > > > > > > > > > > > > same lock for non-related data structures.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > Since the ownership mechanism synchronization
> > > > > > > > > > > > > > is in ethdev responsibility, we must protect
> > > > > > > > > > > > > > against user mistakes as much as we can by
> > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > >
> > > > > > > > > > > > The set ownership API is protected by ownership
> > > > > > > > > > > > lock and checks the owner ID validity By reading the next
> owner ID.
> > > > > > > > > > > > So, the owner ID allocation and set API should use
> > > > > > > > > > > > the same atomic
> > > > > > > > > > > mechanism.
> > > > > > > > > > >
> > > > > > > > > > > Sure but all you are doing for checking validity, is
> > > > > > > > > > > check that owner_id > 0 &&& owner_id < next_ownwe_id,
> right?
> > > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits)
> > > > > > > > > > > you can safely do same check with just
> atomic_get(&next_owner_id).
> > > > > > > > > > >
> > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > - current next_id is X.
> > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > > thread 0(by user
> > > > > > > mistake).
> > > > > > > > > > - context switch
> > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > > next_id to
> > > > > > > > > > X+1
> > > > > > > > > atomically.
> > > > > > > > > > -  context switch
> > > > > > > > > > - Thread 0 validate X by atomic_read and succeed to
> > > > > > > > > > take
> > > > > ownership.
> > > > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > > > entities) -
> > > > > > > crash.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > >
> > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > because the owner
> > > > > > > validation is included in the protected section.
> > > > > > >
> > > > > > > Then your validation function would fail even if you'll use
> > > > > > > atomic ops instead of lock.
> > > > > > No.
> > > > > > With atomic this specific scenario will cause the validation to pass.
> > > > >
> > > > > Can you explain to me how?
> > > > >
> > > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > >               int32_t cur_owner_id =
> > > > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > > UINT16_MAX);
> > > > >
> > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > > cur_owner_id) {
> > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > 		return 0;
> > > > > 	}
> > > > > 	return 1;
> > > > > }
> > > > >
> > > > > Let say your next_owne_id==X, and you invoke
> > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > >
> > > > Explanation:
> > > > The scenario with locks:
> > > > next_owner_id = X.
> > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > Context switch.
> > > > Thread 1 call to owner_new and stuck in the lock.
> > > > Context switch.
> > > > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and
> return failure to the user.
> > > > Context switch.
> > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > Everything is OK!
> > > >
> > > > The same scenario with atomics:
> > > > next_owner_id = X.
> > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > Context switch.
> > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > Context switch.
> > > > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock
> the lock and return success to the  user.
> > > > Problem!
> > > >
> > >
> > >
> > > Matan is correct here, there is no way to preform parallel set
> > > operations using just and atomic variable here, because multiple
> > > reads of next_owner_id need to be preformed while it is stable.
> > > That is to say rte_eth_next_owner_id must be compared to
> > > RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.
> If
> > > you were to only use an atomic_read on such a variable, it could be
> > > incremented by the owner_new function between the checks and an
> > > invalid owner value could become valid because  a third thread
> > > incremented the next value.  The state of next_owner_id must be kept
> > > stable during any validity checks
> >
> > It could still be incremented between the checks - if let say
> > different thread will invoke new_onwer_id, grab the lock update
> > counter, release the lock - all that before the check.
> I don't see how all of the contents of rte_eth_dev_owner_set is protected
> under rte_eth_dev_ownership_lock, as is rte_eth_dev_owner_new.
> Next_owner might increment between another threads calls to owner_new
> and owner_set, but that will just cause a transition from an ownership id
> being valid to invalid, and thats ok, as long as there is consistency in the
> model that enforces a single valid owner at a time (in that case the
> subsequent caller to owner_new).
> 

I'm not sure I fully understand you, but note that we can't protect against
every user mistake (such as using the wrong owner ID).
We do as much as we can, though.


> Though this confusion does underscore my assertion I think that this API is
> overly complicated
> 

I really don't think it is complicated: just take ownership of a port (using the owner ID allocation and set APIs) and manage the port as you want.

> Neil


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 20:34                                       ` Matan Azrad
@ 2018-01-18 14:17                                         ` Ananyev, Konstantin
  2018-01-18 14:26                                           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-18 14:17 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce


Hi Matan,

> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Another thing - you'll probably need
> > > > > > > > > > > > > > > > > > > to
> > > > > grab/release
> > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > It is a public function used by
> > > > > > > > > > > > > > > > > > > drivers, so need to be protected
> > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, I thought about it, but decided not
> > > > > > > > > > > > > > > > > > to use lock in
> > > > > > > next:
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocated rte_eth_dev_count
> > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > As I can see in patch #3 you protect by
> > > > > > > > > > > > > > > > > lock access to rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > > (which seems like a good
> > > > > > > thing).
> > > > > > > > > > > > > > > > > So I think any other public function that
> > > > > > > > > > > > > > > > > access rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > protected by the
> > > > > > > same
> > > > > > > > > lock.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I don't think so, I can understand to use
> > > > > > > > > > > > > > > > the ownership lock here(as in port
> > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > Don't you think it is just timing?(ask in
> > > > > > > > > > > > > > > > the next moment and you may get another
> > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > crash.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > As I understand rte_eth_dev_data[].name unique
> > > > > identifies
> > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > allocation/release/find
> > > > > functions.
> > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > > > To me it means that ethdev layer has to make
> > > > > > > > > > > > > > > sure that all accesses to
> > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > atomic.
> > > > > > > > > > > > > > > Otherwise what would prevent the situation
> > > > > > > > > > > > > > > when one
> > > > > > > process
> > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name, ...) ?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Under race condition - in the worst case it might
> > > > > > > > > > > > > crash, though for that you'll have to be really unlucky.
> > > > > > > > > > > > > Though in most cases as you said it would just not
> > > > > > > > > > > > > operate
> > > > > > > correctly.
> > > > > > > > > > > > > I think if we start to protect dev->name by lock
> > > > > > > > > > > > > we need to do it for all instances (both read and write).
> > > > > > > > > > > > >
> > > > > > > > > > > > Since under the ownership rules, the user must take
> > > > > > > > > > > > ownership
> > > > > of a
> > > > > > > > > > > > port
> > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > >
> > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > >
> > > > > > > > > > So? The user still should take ownership of a device
> > > > > > > > > > before using it
> > > > > (by
> > > > > > > > > name or by port id).
> > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > >
> > > > > > > > > > > > Please, Can you describe specific crash scenario and
> > > > > > > > > > > > explain how could the
> > > > > > > > > > > locking fix it?
> > > > > > > > > > >
> > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1
> > > > > > > > > > > >doing
> > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> > > >strcmp().
> > > > > > > > > > > And because of race condition -
> > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > return
> > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > Which wrong device do you mean? I guess it is the device
> > > > > > > > > > which
> > > > > > > currently is
> > > > > > > > > being created by thread 0.
> > > > > > > > > > > Then rte_pmd_ring_remove() will call rte_free() for
> > > > > > > > > > > related resources, while It can still be in use by someone
> > else.
> > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity) must
> > > > > > > > > > take
> > > > > > > ownership
> > > > > > > > > > (or validate that he is the owner) of a port before
> > > > > > > > > > doing it(free,
> > > > > > > release), so
> > > > > > > > > no issue here.
> > > > > > > > >
> > > > > > > > > Forget about ownership for a second.
> > > > > > > > > Suppose we have a process it created ring port for itself
> > > > > > > > > (without
> > > > > setting
> > > > > > > any
> > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > rte_pmd_ring_remove()
> > > for it.
> > > > > > > > > At the same time second process decides to call
> > > > > rte_eth_dev_allocate()
> > > > > > > (let
> > > > > > > > > say for anither ring port).
> > > > > > > > > They could collide trying to read (process 0) and modify
> > > > > > > > > (process 1)
> > > > > same
> > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > >
> > > > > > > > Do you mean that process 0 will compare successfully the
> > > > > > > > process 1
> > > > > new
> > > > > > > port name?
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > The state are in local process memory - so process 0 will
> > > > > > > > not compare
> > > > > the
> > > > > > > process 1 port, from its point of view this port is in UNUSED
> > > > > > > > state.
> > > > > > > >
> > > > > > >
> > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > >
> > > > > > Someone in process 0 should attach it using protected
> > > > > > attach_secondary
> > > > > somewhere in your scenario.
> > > > >
> > > > > Yes, process 0 can have this port attached too, why not?
> > > > See the function with inline comments:
> > > >
> > > > struct rte_eth_dev *
> > > > rte_eth_dev_allocated(const char *name) {
> > > > 	unsigned i;
> > > >
> > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > >
> > > > 	    	The below state are in local process memory,
> > > > 		So, if here process 1 will allocate a new port (the current i),
> > > update its local state to ATTACHED and write the name,
> > > > 		the state is not visible by process 0 until someone in process
> > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > 		So, to use rte_eth_dev_attach_secondary process 0 must
> > > take the lock
> > > > and it can't, because it is currently locked by process 1.
> > >
> > > Ok I see.
> > > Thanks for your patience.
> > > BTW, that means that if let say process 0 will call
> > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > rte_eth_dev_allocate("yyy") we can endup with same port_id be used for
> > > different devices and 2 processes will overwrite the same
> > rte_eth_dev_data[port_id]?
> >
> > No, contrary to the state, the lock itself is in shared memory, so 2 processes
> > cannot allocate port in the same time.(you can see it in the next patch of this
> > series).

I am not talking about racing here.
Let's say process 0 calls rte_pmd_ring_probe()->....->rte_eth_dev_allocate("xxx").
rte_eth_dev_allocate() finds that port N is 'free', i.e.
local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED,
so it assigns the new dev ("xxx") to port N.
Then after some time process 1 calls rte_pmd_ring_probe()->....->rte_eth_dev_allocate("yyy").
From its perspective port N is still free: rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED,
so it will assign the new dev ("yyy") to the same port.
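
A toy model of that sequence, in plain C rather than DPDK code: the two
"processes" are just two structs in one program, shared_name[] stands in for
the shared rte_eth_dev_data[].name, and local_state[] for each process's
private rte_eth_devices[].state.  All names here are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4	/* hypothetical, small table for illustration */
#define NAME_LEN  16

enum state { UNUSED = 0, ATTACHED };

/* Shared between processes (stand-in for rte_eth_dev_data[].name). */
static char shared_name[MAX_PORTS][NAME_LEN];

/* Private to each process (stand-in for its local rte_eth_devices[].state). */
struct process {
	enum state local_state[MAX_PORTS];
};

/* Allocation that consults only the caller's local state. */
static int dev_allocate(struct process *p, const char *name)
{
	int i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (p->local_state[i] == UNUSED) {
			p->local_state[i] = ATTACHED;
			/* Writes shared data without knowing who else uses it. */
			snprintf(shared_name[i], NAME_LEN, "%s", name);
			return i;
		}
	}
	return -1;
}

/* No race needed: the calls are strictly one after the other. */
static void collision_demo(void)
{
	struct process p0 = { { UNUSED } };
	struct process p1 = { { UNUSED } };
	int a = dev_allocate(&p0, "xxx");
	int b = dev_allocate(&p1, "yyy");

	assert(a == b);					/* same port chosen twice */
	assert(strcmp(shared_name[a], "yyy") == 0);	/* "xxx" was overwritten */
}
```

Holding a shared lock during allocation does not help here, since the
decision is based on state the other process cannot see.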
Konstantin
  

> >
> 
> Actually I think only one process(primary) should allocate ports, the others should attach them.
> The race of port allocation is only between the threads of the primary process.
> 
> 
> > > Konstantin
> > >
> > > >
> > > > 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED)
> > > &&
> > > > 		strcmp(rte_eth_devices[i].data->name, name) == 0)
> > > > 			return &rte_eth_devices[i];
> > > > 	}
> > > > 	return NULL;
> > > >
> > > >


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:17                                         ` Ananyev, Konstantin
@ 2018-01-18 14:26                                           ` Matan Azrad
  2018-01-18 14:41                                             ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 14:26 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

> Hi Matan,
> 
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Another thing - you'll probably
> > > > > > > > > > > > > > > > > > > > need to
> > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > It is a public function used by
> > > > > > > > > > > > > > > > > > > > drivers, so need to be protected
> > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Yes, I thought about it, but decided
> > > > > > > > > > > > > > > > > > > not to use lock in
> > > > > > > > next:
> > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > As I can see in patch #3 you protect
> > > > > > > > > > > > > > > > > > by lock access to
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which seems
> > > > > > > > > > > > > > > > > > like a good
> > > > > > > > thing).
> > > > > > > > > > > > > > > > > > So I think any other public function
> > > > > > > > > > > > > > > > > > that access rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > > > should be protected by the
> > > > > > > > same
> > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I don't think so, I can understand to
> > > > > > > > > > > > > > > > > use the ownership lock here(as in port
> > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > Don't you think it is just timing?(ask
> > > > > > > > > > > > > > > > > in the next moment and you may get
> > > > > > > > > > > > > > > > > another
> > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > crash.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > As I understand rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > unique
> > > > > > identifies
> > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > allocation/release/find
> > > > > > functions.
> > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > > > > To me it means that ethdev layer has to
> > > > > > > > > > > > > > > > make sure that all accesses to
> > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > atomic.
> > > > > > > > > > > > > > > > Otherwise what would prevent the situation
> > > > > > > > > > > > > > > > when one
> > > > > > > > process
> > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name,
> ...) ?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Under race condition - in the worst case it
> > > > > > > > > > > > > > might crash, though for that you'll have to be really
> unlucky.
> > > > > > > > > > > > > > Though in most cases as you said it would just
> > > > > > > > > > > > > > not operate
> > > > > > > > correctly.
> > > > > > > > > > > > > > I think if we start to protect dev->name by
> > > > > > > > > > > > > > lock we need to do it for all instances (both read and
> write).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Since under the ownership rules, the user must
> > > > > > > > > > > > > take ownership
> > > > > > of a
> > > > > > > > > > > > > port
> > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > >
> > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > >
> > > > > > > > > > > So? The user still should take ownership of a device
> > > > > > > > > > > before using it
> > > > > > (by
> > > > > > > > > > name or by port id).
> > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > >
> > > > > > > > > > > > > Please, Can you describe specific crash scenario
> > > > > > > > > > > > > and explain how could the
> > > > > > > > > > > > locking fix it?
> > > > > > > > > > > >
> > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1
> > > > > > > > > > > > >doing
> > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> > > > >strcmp().
> > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > return
> > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > Which wrong device do you mean? I guess it is the
> > > > > > > > > > > device which
> > > > > > > > currently is
> > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > Then rte_pmd_ring_remove() will call rte_free()
> > > > > > > > > > > > for related resources, while It can still be in
> > > > > > > > > > > > use by someone
> > > else.
> > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > must take
> > > > > > > > ownership
> > > > > > > > > > > (or validate that he is the owner) of a port before
> > > > > > > > > > > doing it(free,
> > > > > > > > release), so
> > > > > > > > > > no issue here.
> > > > > > > > > >
> > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > itself (without
> > > > > > setting
> > > > > > > > any
> > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > rte_pmd_ring_remove()
> > > > for it.
> > > > > > > > > > At the same time second process decides to call
> > > > > > rte_eth_dev_allocate()
> > > > > > > > (let
> > > > > > > > > > say for anither ring port).
> > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > modify (process 1)
> > > > > > same
> > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > >
> > > > > > > > > Do you mean that process 0 will compare successfully the
> > > > > > > > > process 1
> > > > > > new
> > > > > > > > port name?
> > > > > > > >
> > > > > > > > Yes.
> > > > > > > >
> > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > will not compare
> > > > > > the
> > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > UNUSED
> > > > > > > > > state.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > >
> > > > > > > Someone in process 0 should attach it using protected
> > > > > > > attach_secondary
> > > > > > somewhere in your scenario.
> > > > > >
> > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > See the function with inline comments:
> > > > >
> > > > > struct rte_eth_dev *
> > > > > rte_eth_dev_allocated(const char *name) {
> > > > > 	unsigned i;
> > > > >
> > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > >
> > > > > 	    	The below state are in local process memory,
> > > > > 		So, if here process 1 will allocate a new port (the current
> > > > > i),
> > > > update its local state to ATTACHED and write the name,
> > > > > 		the state is not visible by process 0 until someone in process
> > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > 		So, to use rte_eth_dev_attach_secondary process 0 must
> > > > take the lock
> > > > > and it can't, because it is currently locked by process 1.
> > > >
> > > > Ok I see.
> > > > Thanks for your patience.
> > > > BTW, that means that if let say process 0 will call
> > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be used
> > > > for different devices and 2 processes will overwrite the same
> > > rte_eth_dev_data[port_id]?
> > >
> > > No, contrary to the state, the lock itself is in shared memory, so 2
> > > processes cannot allocate port in the same time.(you can see it in
> > > the next patch of this series).
> 
> I am not talking about racing here.
> Let say process 0 calls rte_pmd_ring_probe()->....-
> >rte_eth_dev_allocate("xxx")
> rte_eth_dev_allocate() finds that port N is 'free', i.e.
> local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns new
> dev ("xxx") to port N.
> Then after some time process 1 calls rte_pmd_ring_probe()->....-
> >rte_eth_dev_allocate("yyy").
> From its perspective port N is still free:  rte_eth_devices[N].state ==
> RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same port.
> 

Yes, you are right, this is a problem (not actually related to port ownership), but look:
As I understand it, secondary processes are not allowed to create ports and must use the attach_secondary API, but there is no hardcoded check that prevents them from doing so.
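
To make the scenario above concrete, here is a toy C model of it (not DPDK code): each process keeps its own copy of the port state, while the name array is shared. All names (`toy_shared`, `toy_proc`, `toy_alloc_local`, `TOY_MAX_PORTS`) are made up for illustration; the real arrays are rte_eth_devices[] and rte_eth_dev_data[].

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_PORTS 4
#define TOY_NAME_LEN  32

enum toy_state { TOY_UNUSED = 0, TOY_ATTACHED };

/* Stands in for rte_eth_dev_data[]: visible to every process. */
struct toy_shared {
	char name[TOY_MAX_PORTS][TOY_NAME_LEN];
};

/* Stands in for the per-process rte_eth_devices[].state. */
struct toy_proc {
	enum toy_state state[TOY_MAX_PORTS];
};

/*
 * Pick the first port that looks free in the caller's LOCAL state and
 * write the name into the SHARED data. Returns the port id, or -1 if
 * no port looks free.
 */
static int
toy_alloc_local(struct toy_proc *p, struct toy_shared *s, const char *name)
{
	int i;

	assert(p != NULL && s != NULL && name != NULL);

	for (i = 0; i < TOY_MAX_PORTS; i++) {
		if (p->state[i] != TOY_UNUSED)
			continue;
		/* Overwrites whatever another process stored here. */
		snprintf(s->name[i], TOY_NAME_LEN, "%s", name);
		p->state[i] = TOY_ATTACHED;
		return i;
	}
	return -1;
}
```

With two zero-initialized `toy_proc` instances sharing one `toy_shared`, allocating "xxx" from the first and then "yyy" from the second returns port 0 both times, and the shared name for port 0 ends up as "yyy". No lock around the call changes that, because the second process never sees the first one's state.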


Konstantin
> 
> 
> > >
> >
> > Actually I think only one process(primary) should allocate ports, the others
> should attach them.
> > The race of port allocation is only between the threads of the primary
> process.
> >
> >
> > > > Konstantin
> > > >
> > > > >
> > > > > 		if ((rte_eth_devices[i].state == RTE_ETH_DEV_ATTACHED)
> > > > &&
> > > > > 		strcmp(rte_eth_devices[i].data->name, name) == 0)
> > > > > 			return &rte_eth_devices[i];
> > > > > 	}
> > > > > 	return NULL;
> > > > >
> > > > >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:26                                           ` Matan Azrad
@ 2018-01-18 14:41                                             ` Ananyev, Konstantin
  2018-01-18 14:45                                               ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-18 14:41 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

> 
> Hi Konstantine
> 
> > Hi Matan,
> >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Another thing - you'll probably
> > > > > > > > > > > > > > > > > > > > > need to
> > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > It is a public function used by
> > > > > > > > > > > > > > > > > > > > > drivers, so need to be protected
> > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but decided
> > > > > > > > > > > > > > > > > > > > not to use lock in
> > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > As I can see in patch #3 you protect
> > > > > > > > > > > > > > > > > > > by lock access to
> > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which seems
> > > > > > > > > > > > > > > > > > > like a good
> > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > So I think any other public function
> > > > > > > > > > > > > > > > > > > that access rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > > > > should be protected by the
> > > > > > > > > same
> > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I don't think so, I can understand to
> > > > > > > > > > > > > > > > > > use the ownership lock here(as in port
> > > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > Don't you think it is just timing?(ask
> > > > > > > > > > > > > > > > > > in the next moment and you may get
> > > > > > > > > > > > > > > > > > another
> > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > crash.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > As I understand rte_eth_dev_data[].name
> > > > > > > > > > > > > > > > > unique
> > > > > > > identifies
> > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > functions.
> > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > "1. The port allocation and port release
> > > > > > > > > > > > > > > > > synchronization will be managed by ethdev."
> > > > > > > > > > > > > > > > > To me it means that ethdev layer has to
> > > > > > > > > > > > > > > > > make sure that all accesses to
> > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > atomic.
> > > > > > > > > > > > > > > > > Otherwise what would prevent the situation
> > > > > > > > > > > > > > > > > when one
> > > > > > > > > process
> > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].name,
> > ...) ?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Under race condition - in the worst case it
> > > > > > > > > > > > > > > might crash, though for that you'll have to be really
> > unlucky.
> > > > > > > > > > > > > > > Though in most cases as you said it would just
> > > > > > > > > > > > > > > not operate
> > > > > > > > > correctly.
> > > > > > > > > > > > > > > I think if we start to protect dev->name by
> > > > > > > > > > > > > > > lock we need to do it for all instances (both read and
> > write).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > Since under the ownership rules, the user must
> > > > > > > > > > > > > > take ownership
> > > > > > > of a
> > > > > > > > > > > > > > port
> > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > >
> > > > > > > > > > > > So? The user still should take ownership of a device
> > > > > > > > > > > > before using it
> > > > > > > (by
> > > > > > > > > > > name or by port id).
> > > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > > >
> > > > > > > > > > > > > > Please, Can you describe specific crash scenario
> > > > > > > > > > > > > > and explain how could the
> > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...), thread 1
> > > > > > > > > > > > > >doing
> > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()-
> > > > > >strcmp().
> > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > return
> > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > Which wrong device do you mean? I guess it is the
> > > > > > > > > > > > device which
> > > > > > > > > currently is
> > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > Then rte_pmd_ring_remove() will call rte_free()
> > > > > > > > > > > > > for related resources, while It can still be in
> > > > > > > > > > > > > use by someone
> > > > else.
> > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > must take
> > > > > > > > > ownership
> > > > > > > > > > > > (or validate that he is the owner) of a port before
> > > > > > > > > > > > doing it(free,
> > > > > > > > > release), so
> > > > > > > > > > > no issue here.
> > > > > > > > > > >
> > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > itself (without
> > > > > > > setting
> > > > > > > > > any
> > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > rte_pmd_ring_remove()
> > > > > for it.
> > > > > > > > > > > At the same time second process decides to call
> > > > > > > rte_eth_dev_allocate()
> > > > > > > > > (let
> > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > modify (process 1)
> > > > > > > same
> > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > >
> > > > > > > > > > Do you mean that process 0 will compare successfully the
> > > > > > > > > > process 1
> > > > > > > new
> > > > > > > > > port name?
> > > > > > > > >
> > > > > > > > > Yes.
> > > > > > > > >
> > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > will not compare
> > > > > > > the
> > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > UNUSED
> > > > > > > > > > state.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > >
> > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > attach_secondary
> > > > > > > somewhere in your scenario.
> > > > > > >
> > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > See the function with inline comments:
> > > > > >
> > > > > > struct rte_eth_dev *
> > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > 	unsigned i;
> > > > > >
> > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > >
> > > > > > 	    	The below state are in local process memory,
> > > > > > 		So, if here process 1 will allocate a new port (the current
> > > > > > i),
> > > > > update its local state to ATTACHED and write the name,
> > > > > > 		the state is not visible by process 0 until someone in process
> > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > 		So, to use rte_eth_dev_attach_secondary process 0 must
> > > > > take the lock
> > > > > > and it can't, because it is currently locked by process 1.
> > > > >
> > > > > Ok I see.
> > > > > Thanks for your patience.
> > > > > BTW, that means that if let say process 0 will call
> > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be used
> > > > > for different devices and 2 processes will overwrite the same
> > > > rte_eth_dev_data[port_id]?
> > > >
> > > > No, contrary to the state, the lock itself is in shared memory, so 2
> > > > processes cannot allocate port in the same time.(you can see it in
> > > > the next patch of this series).
> >
> > I am not talking about racing here.
> > Let say process 0 calls rte_pmd_ring_probe()->....-
> > >rte_eth_dev_allocate("xxx")
> > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns new
> > dev ("xxx") to port N.
> > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > >rte_eth_dev_allocate("yyy").
> > From its perspective port N is still free:  rte_eth_devices[N].state ==
> > RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same port.
> >
> 
> Yes, you are right, this is a problem (not actually related to port ownership),

Yep that's true - it was there before your patches.

> but look:
> As I understand it, secondary processes are not allowed to create ports and must use the attach_secondary API, but there is no
> hardcoded check that prevents them from doing so.

Secondary processes have the ability to allocate their own vdevs, and it should probably stay like that.
I just thought it was a good opportunity to fix it while you are working on these changes anyway,
but OK, we can leave it for now.
 
Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:41                                             ` Ananyev, Konstantin
@ 2018-01-18 14:45                                               ` Matan Azrad
  2018-01-18 14:51                                                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 14:45 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi

From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > Hi Konstantine
> >
> > > Hi Matan,
> > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > It is a public function used
> > > > > > > > > > > > > > > > > > > > > > by drivers, so need to be
> > > > > > > > > > > > > > > > > > > > > > protected
> > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > > > > protected by the
> > > > > > > > > > same
> > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I don't think so, I can understand
> > > > > > > > > > > > > > > > > > > to use the ownership lock here(as in
> > > > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > timing?(ask in the next moment and
> > > > > > > > > > > > > > > > > > > you may get another
> > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > crash.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > identifies
> > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > functions.
> > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > release synchronization will be managed by
> ethdev."
> > > > > > > > > > > > > > > > > > To me it means that ethdev layer has
> > > > > > > > > > > > > > > > > > to make sure that all accesses to
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > > atomic.
> > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > process
> > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > name,
> > > ...) ?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Under race condition - in the worst case
> > > > > > > > > > > > > > > > it might crash, though for that you'll
> > > > > > > > > > > > > > > > have to be really
> > > unlucky.
> > > > > > > > > > > > > > > > Though in most cases as you said it would
> > > > > > > > > > > > > > > > just not operate
> > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > I think if we start to protect dev->name
> > > > > > > > > > > > > > > > by lock we need to do it for all instances
> > > > > > > > > > > > > > > > (both read and
> > > write).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Since under the ownership rules, the user
> > > > > > > > > > > > > > > must take ownership
> > > > > > > > of a
> > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > So? The user still should take ownership of a
> > > > > > > > > > > > > device before using it
> > > > > > > > (by
> > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()
> > > > > > > > > > > > > > -
> > > > > > >strcmp().
> > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > return
> > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > Which wrong device do you mean? I guess it is
> > > > > > > > > > > > > the device which
> > > > > > > > > > currently is
> > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > rte_free() for related resources, while It can
> > > > > > > > > > > > > > still be in use by someone
> > > > > else.
> > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > > must take
> > > > > > > > > > ownership
> > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > release), so
> > > > > > > > > > > > no issue here.
> > > > > > > > > > > >
> > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > > itself (without
> > > > > > > > setting
> > > > > > > > > > any
> > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > for it.
> > > > > > > > > > > > At the same time second process decides to call
> > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > (let
> > > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > > modify (process 1)
> > > > > > > > same
> > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > >
> > > > > > > > > > > Do you mean that process 0 will compare successfully
> > > > > > > > > > > the process 1
> > > > > > > > new
> > > > > > > > > > port name?
> > > > > > > > > >
> > > > > > > > > > Yes.
> > > > > > > > > >
> > > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > > will not compare
> > > > > > > > the
> > > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > > UNUSED
> > > > > > > > > > > state.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > >
> > > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > > attach_secondary
> > > > > > > > somewhere in your scenario.
> > > > > > > >
> > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > See the function with inline comments:
> > > > > > >
> > > > > > > struct rte_eth_dev *
> > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > 	unsigned i;
> > > > > > >
> > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > >
> > > > > > > 	    	The below state are in local process memory,
> > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > current i),
> > > > > > update its local state to ATTACHED and write the name,
> > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > process
> > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> must
> > > > > > take the lock
> > > > > > > and it can't, because it is currently locked by process 1.
> > > > > >
> > > > > > Ok I see.
> > > > > > Thanks for your patience.
> > > > > > BTW, that means that if let say process 0 will call
> > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be
> > > > > > used for different devices and 2 processes will overwrite the
> > > > > > same
> > > > > rte_eth_dev_data[port_id]?
> > > > >
> > > > > No, contrary to the state, the lock itself is in shared memory,
> > > > > so 2 processes cannot allocate port in the same time.(you can
> > > > > see it in the next patch of this series).
> > >
> > > I am not talking about racing here.
> > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > >rte_eth_dev_allocate("xxx")
> > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns
> > > new dev ("xxx") to port N.
> > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > >rte_eth_dev_allocate("yyy").
> > > From its perspective port N is still free:  rte_eth_devices[N].state
> > > == RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same
> port.
> > >
> >
> > Yes, you are right, this is a problem (not actually related to port
> > ownership),
> 
> Yep that's true - it was there before your patches.
> 
> > but look:
> > As I understand it, secondary processes are not allowed to create
> > ports and must use the attach_secondary API, but there is no
> > hardcoded check that prevents them from doing so.
> 
> Secondary processes have the ability to allocate their own vdevs, and it
> should probably stay like that.
> I just thought it was a good opportunity to fix it while you are working on
> these changes anyway, but OK, we can leave it for now.
> 
It looks like the fix would break the ABI (moving the state to shared memory), so let's try to fix it in the next version :)
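
As a rough illustration of that direction, here is a toy C model (not DPDK code, and not a proposed patch) in which the state sits next to the name in the shared area, so every process scans the same flags. All names (`toy2_shared`, `toy2_port`, `toy2_alloc_shared`, `TOY2_MAX_PORTS`) are hypothetical, and the caller is assumed to already hold the shared allocation lock discussed in this thread.

```c
#include <assert.h>
#include <stdio.h>

#define TOY2_MAX_PORTS 4
#define TOY2_NAME_LEN  32

enum toy2_state { TOY2_UNUSED = 0, TOY2_ATTACHED };

/* One shared slot: state and name live together. */
struct toy2_port {
	enum toy2_state state;
	char name[TOY2_NAME_LEN];
};

/* Stands in for the shared-memory area mapped by every process. */
struct toy2_shared {
	struct toy2_port port[TOY2_MAX_PORTS];
};

/*
 * Allocate the first free slot, judged by the SHARED state.
 * Assumption: the caller holds the shared lock, so the scan and the
 * update are not interleaved with another allocator.
 * Returns the port id, or -1 if all slots are taken.
 */
static int
toy2_alloc_shared(struct toy2_shared *s, const char *name)
{
	int i;

	assert(s != NULL && name != NULL);

	for (i = 0; i < TOY2_MAX_PORTS; i++) {
		if (s->port[i].state != TOY2_UNUSED)
			continue;
		snprintf(s->port[i].name, TOY2_NAME_LEN, "%s", name);
		s->port[i].state = TOY2_ATTACHED;
		return i;
	}
	return -1;
}
```

In this model, allocating "xxx" and then "yyy" against the same zero-initialized `toy2_shared` yields ports 0 and 1, whichever process makes each call. Moving the field in the real structures changes their layout, which is the ABI concern mentioned above.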

> Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:45                                               ` Matan Azrad
@ 2018-01-18 14:51                                                 ` Ananyev, Konstantin
  2018-01-18 15:00                                                   ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-18 14:51 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 2:45 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH v2 2/6] ethdev: add port ownership
> 
> Hi
> 
> From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > > Hi Konstantine
> > >
> > > > Hi Matan,
> > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > > It is a public function used
> > > > > > > > > > > > > > > > > > > > > > > by drivers, so need to be
> > > > > > > > > > > > > > > > > > > > > > > protected
> > > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > > > > > protected by the
> > > > > > > > > > > same
> > > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I don't think so, I can understand
> > > > > > > > > > > > > > > > > > > > to use the ownership lock here(as in
> > > > > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > > > > creation) but I don't think it is necessary too.
> > > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > > timing?(ask in the next moment and
> > > > > > > > > > > > > > > > > > > > you may get another
> > > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > > crash.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Not sure what you mean here by timing...
> > > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > > identifies
> > > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > > release synchronization will be managed by
> > ethdev."
> > > > > > > > > > > > > > > > > > > To me it means that ethdev layer has
> > > > > > > > > > > > > > > > > > > to make sure that all accesses to
> > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > > > atomic.
> > > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > > process
> > > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > > name,
> > > > ...) ?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The second will get True or False and that is it.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Under race condition - in the worst case
> > > > > > > > > > > > > > > > > it might crash, though for that you'll
> > > > > > > > > > > > > > > > > have to be really
> > > > unlucky.
> > > > > > > > > > > > > > > > > Though in most cases as you said it would
> > > > > > > > > > > > > > > > > just not operate
> > > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > > I think if we start to protect dev->name
> > > > > > > > > > > > > > > > > by lock we need to do it for all instances
> > > > > > > > > > > > > > > > > (both read and
> > > > write).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Since under the ownership rules, the user
> > > > > > > > > > > > > > > > must take ownership
> > > > > > > > > of a
> > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > before using it, I still don't see a problem here.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > So? The user still should take ownership of a
> > > > > > > > > > > > > > device before using it
> > > > > > > > > (by
> > > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > > It can just read it without owning it, but no managing it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()
> > > > > > > > > > > > > > > -
> > > > > > > >strcmp().
> > > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > > return
> > > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > > Which wrong device do you mean? I guess it is
> > > > > > > > > > > > > > the device which
> > > > > > > > > > > currently is
> > > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > > rte_free() for related resources, while It can
> > > > > > > > > > > > > > > still be in use by someone
> > > > > > else.
> > > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > > > must take
> > > > > > > > > > > ownership
> > > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > > release), so
> > > > > > > > > > > > > no issue here.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > > > itself (without
> > > > > > > > > setting
> > > > > > > > > > > any
> > > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > > for it.
> > > > > > > > > > > > > At the same time second process decides to call
> > > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > > (let
> > > > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > > > modify (process 1)
> > > > > > > > > same
> > > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > > >
> > > > > > > > > > > > Do you mean that process 0 will compare successfully
> > > > > > > > > > > > the process 1
> > > > > > > > > new
> > > > > > > > > > > port name?
> > > > > > > > > > >
> > > > > > > > > > > Yes.
> > > > > > > > > > >
> > > > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > > > will not compare
> > > > > > > > > the
> > > > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > > > UNUSED
> > > > > > > > > > > > state.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ok, and why it can't be in attached state in process 0 too?
> > > > > > > > > >
> > > > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > > > attach_secondary
> > > > > > > > > somewhere in your scenario.
> > > > > > > > >
> > > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > > See the function with inline comments:
> > > > > > > >
> > > > > > > > struct rte_eth_dev *
> > > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > > 	unsigned i;
> > > > > > > >
> > > > > > > > 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > > >
> > > > > > > > 	    	The below state are in local process memory,
> > > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > > current i),
> > > > > > > update its local state to ATTACHED and write the name,
> > > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > > process
> > > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> > must
> > > > > > > take the lock
> > > > > > > > and it can't, because it is currently locked by process 1.
> > > > > > >
> > > > > > > Ok I see.
> > > > > > > Thanks for your patience.
> > > > > > > BTW, that means that if let say process 0 will call
> > > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be
> > > > > > > used for different devices and 2 processes will overwrite the
> > > > > > > same
> > > > > > rte_eth_dev_data[port_id]?
> > > > > >
> > > > > > No, contrary to the state, the lock itself is in shared memory,
> > > > > > so 2 processes cannot allocate port in the same time.(you can
> > > > > > see it in the next patch of this series).
> > > >
> > > > I am not talking about racing here.
> > > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > > >rte_eth_dev_allocate("xxx")
> > > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it assigns
> > > > new dev ("xxx") to port N.
> > > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > > >rte_eth_dev_allocate("yyy").
> > > > From its perspective port N is still free:  rte_eth_devices[N].state
> > > > == RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the same
> > port.
> > > >
> > >
> > > Yes, you are right, this is a problem (not actually related to port
> > > ownership)
> >
> > Yep that's true - it was there before your patches.
> >
> > > but look:
> > > As I understand, secondary processes are not allowed to create
> > > ports and must use the attach_secondary API, but there is no
> > > hardcoded check which prevents them from doing it.
> >
> > Secondary processes have the ability to allocate their own vdevs and probably it
> > should stay like that.
> > I just thought it is a good opportunity to fix it while you are on these changes
> > anyway, but ok we can leave it for now.
> >
> It looks like the fix would break the ABI (moving the state to shared memory), so let's try to fix it in the next version :)

Not necessarily - I think we can just add a check inside rte_eth_dev_find_free_port() that
rte_eth_dev_data[port_id].name is an empty string.
Konstantin
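
A minimal sketch of the check suggested above, for illustration only. It
assumes a per-process state array and a shared name array; the types and
names used here (dev_local, dev_shared, find_free_port, port_allocate,
MAX_PORTS) are hypothetical stand-ins, not the real ethdev code, and the
caller is assumed to hold the shared ethdev lock while allocating.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN  64  /* hypothetical stand-in for the port name length */

enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED };

/* Per-process view: every process has its own copy of this array. */
struct dev_local { enum dev_state state; };

/* Shared view: a single copy that every process can see. */
struct dev_shared { char name[NAME_LEN]; };

/*
 * Return the first port that is free both locally (state) and globally
 * (empty shared name), or MAX_PORTS when no port is available.
 */
static uint16_t
find_free_port(const struct dev_local *local, const struct dev_shared *shared)
{
	uint16_t i;

	assert(local != NULL && shared != NULL);
	for (i = 0; i < MAX_PORTS; i++) {
		/* The local state alone cannot see ports taken by other processes. */
		if (local[i].state == DEV_UNUSED && shared[i].name[0] == '\0')
			return i;
	}
	return MAX_PORTS;
}

/* Claim a free port for "name"; returns the port id, or -1 when full. */
static int
port_allocate(struct dev_local *local, struct dev_shared *shared,
	      const char *name)
{
	uint16_t id = find_free_port(local, shared);

	if (id == MAX_PORTS)
		return -1;
	snprintf(shared[id].name, sizeof(shared[id].name), "%s", name);
	local[id].state = DEV_ATTACHED;
	return id;
}
```

With two independent local arrays standing in for two processes, the
second allocation skips port 0 because its shared name is already set,
even though the second process still sees the local state as unused.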


> 
> > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 13:20                                     ` Neil Horman
@ 2018-01-18 14:52                                       ` Matan Azrad
  2018-01-19 13:57                                         ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 14:52 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

Hi Neil

From: Neil Horman, Thursday, January 18, 2018 3:21 PM
> On Wed, Jan 17, 2018 at 05:58:07PM +0000, Matan Azrad wrote:
> >
> > Hi Neil
> >
> >  From: Neil Horman, Wednesday, January 17, 2018 4:00 PM
> > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
<snip>
> > > Matan is correct here, there is no way to perform parallel set
> > > operations using just an atomic variable here, because multiple
> > > reads of next_owner_id need to be performed while it is stable.
> > > That is to say rte_eth_next_owner_id must be compared to
> > > RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id. If
> > > you were to only use an atomic_read on such a variable, it could be
> > > incremented by the owner_new function between the checks and an
> > > invalid owner value could become valid because a third thread
> > > incremented the next value.  The state of next_owner_id must be kept
> > > stable during any validity checks.
> > >
> > > That said, I really have to wonder why ownership ids are really
> > > needed here at all.  It seems this design could be much simpler with
> > > the addition of a per-port lock (and optional ownership record).
> > > The API could consist of four
> > > operations:
> > >
> > > ownership_set
> > > ownership_tryset
> > > ownership_release
> > > ownership_get
> > >
> > >
> > > The first call simply tries to take the per-port lock (blocking if
> > > it's already locked)
> > >
> >
> > A per-port lock is not good because the ownership mechanism must be
> > synchronized with port creation/release.
> > So the port creation and port ownership should use the same lock.
> >
> In what way do you need to synchronize with port creation?

The port release zeroes the data field of the port owner, so it should be synchronized with the ownership APIs.
The port creation should be synchronized with the port release.
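
A minimal sketch of that synchronization, for illustration only: one lock
guards the owner-id counter, the per-port owner records and the port
release path. All names here (shared_data, owner_new, owner_set,
port_release, NO_OWNER) are hypothetical stand-ins, the lock is a plain
C11 atomic_flag spinlock rather than the real ethdev lock, and the error
codes are an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MAX_PORTS 4  /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER  0  /* hypothetical stand-in for RTE_ETH_DEV_NO_OWNER */

struct owner { uint64_t id; char name[32]; };

struct shared_data {
	atomic_flag lock;               /* one lock for everything below */
	uint64_t next_owner_id;         /* ids below this value were issued */
	struct owner owners[MAX_PORTS]; /* per-port owner records */
};

static void
shared_init(struct shared_data *sd)
{
	memset(sd, 0, sizeof(*sd));
	atomic_flag_clear(&sd->lock);
	sd->next_owner_id = NO_OWNER + 1;
}

static void
sd_lock(struct shared_data *sd)
{
	while (atomic_flag_test_and_set_explicit(&sd->lock, memory_order_acquire))
		; /* spin until the holder releases the lock */
}

static void
sd_unlock(struct shared_data *sd)
{
	atomic_flag_clear_explicit(&sd->lock, memory_order_release);
}

/* Issue a fresh owner id. */
static uint64_t
owner_new(struct shared_data *sd)
{
	uint64_t id;

	sd_lock(sd);
	id = sd->next_owner_id++;
	sd_unlock(sd);
	return id;
}

/* Take ownership of a port; fails if the id is invalid or the port is owned. */
static int
owner_set(struct shared_data *sd, uint16_t port, const struct owner *o)
{
	int ret = 0;

	assert(port < MAX_PORTS);
	sd_lock(sd);
	/* Both comparisons see the same next_owner_id because the lock is held. */
	if (o->id == NO_OWNER || o->id >= sd->next_owner_id)
		ret = -EINVAL;
	else if (sd->owners[port].id != NO_OWNER)
		ret = -EPERM;
	else
		sd->owners[port] = *o;
	sd_unlock(sd);
	return ret;
}

/* Release a port: zero its owner record under the same lock. */
static void
port_release(struct shared_data *sd, uint16_t port)
{
	assert(port < MAX_PORTS);
	sd_lock(sd);
	memset(&sd->owners[port], 0, sizeof(sd->owners[port]));
	sd_unlock(sd);
}
```

Because port_release() and owner_set() take the same lock, an owner can
never be written into a record that is being zeroed at the same moment.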


>  If a port has not
> yet been created, then by definition the owner must be the thread calling
> the create function.

No, the owner can be any DPDK entity (an application with single or multiple threads/processes, a PMD, a library).
So the port allocation (usually done by the port's PMD, from one thread in one process) should just allocate a port.


>  If you are concerned about the mechanics of the port
> data structure (i.e. the fact that rte_eth_devices is statically allocated), you
> can add a lock structure to the rte_eth_dev struct and initialize it statically
> with RTE_SPINLOCK_INITIALIZER.
> 

The lock should be in shared memory so that entities in secondary processes can take ownership safely.
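
A minimal sketch of why the lock placement matters, for illustration
only. It uses a POSIX anonymous shared mapping and a C11 atomic_flag as
stand-ins for DPDK shared memory and rte_spinlock; the names shared_lock,
shared_lock_create and child_can_take_lock are hypothetical. A forked
child plays the role of a secondary process.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared_lock { atomic_flag flag; };

/* Place the lock in memory that stays shared across fork(). */
static struct shared_lock *
shared_lock_create(void)
{
	struct shared_lock *l = mmap(NULL, sizeof(*l), PROT_READ | PROT_WRITE,
				     MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (l == MAP_FAILED)
		return NULL;
	atomic_flag_clear(&l->flag);
	return l;
}

/* Non-blocking acquire: returns 1 on success, 0 if the lock is busy. */
static int
shared_trylock(struct shared_lock *l)
{
	assert(l != NULL);
	return !atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire);
}

static void
shared_unlock(struct shared_lock *l)
{
	assert(l != NULL);
	atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/*
 * Fork a child and report whether it could take (and then release) the
 * lock: 1 if it could, 0 if the lock was busy, -1 on error.
 */
static int
child_can_take_lock(struct shared_lock *l)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		if (!shared_trylock(l))
			_exit(1);
		shared_unlock(l);
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) == 0;
}
```

If the lock lived in ordinary process-local memory instead, the child
would operate on its own private copy after fork() and would always
succeed, which is the situation the shared placement avoids.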
 
> > I didn't find precedence for blocking function in ethdev.
> >
> Then perhaps we don't need that api call.  Perhaps ownership_tryset is
> enough.
>

As I already did :)
 
> > > The second call is a non-blocking version of the first
> > >
> > > The third unlocks the port, allowing others to take ownership
> > >
> > > The fourth returns whatever ownership record you want to encode with
> > > the lock.
> > >
> > > The addition of all this id checking seems a bit overcomplicated
> >
> > You miss the identification of the owner - we want to expose info about the
> > owner for printing and easy debugging.
> > And it makes sense to manage owner uniqueness by a unique ID.
> >
> I specifically pointed that out above.  There is no reason an ownership record
> couldn't be added to the rte_eth_dev structure.
> 

Sorry, I don't understand why.

> > The API was already discussed a lot in the previous version. Do you really want
> > to open it again now?
> >
> What I want is the most useful and elegant ownership API available.  If you
> think what you have is that, so be it.  I only bring this up because the amount
> of debate you and Konstantin have had over lock safety causes me to
> wonder if this isn't an overly complex design.

I think the complexity lies in the secondary/primary process model, not in the current port ownership design.
There is some work to do there regardless of port ownership,
and I think some of it is already in progress.

Thanks a lot.

> 
> Neil
> 
> 
> > > Neil
> >
> >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:51                                                 ` Ananyev, Konstantin
@ 2018-01-18 15:00                                                   ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 15:00 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



From: Ananyev, Konstantin, Thursday, January 18, 2018 4:52 PM
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Thursday, January 18, 2018 2:45 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Subject: RE: [PATCH v2 2/6] ethdev: add port ownership
> >
> > HI
> >
> > From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > > > Hi Konstantine
> > > >
> > > > > Hi Matan,
> > > > >
> > > > > > > > > <snip>
> > > > > > > >
> > > > > > > > Ok I see.
> > > > > > > > Thanks for your patience.
> > > > > > > > BTW, that means that if let say process 0 will call
> > > > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id
> > > > > > > > be used for different devices and 2 processes will
> > > > > > > > overwrite the same
> > > > > > > rte_eth_dev_data[port_id]?
> > > > > > >
> > > > > > > No, contrary to the state, the lock itself is in shared
> > > > > > > memory, so 2 processes cannot allocate port in the same
> > > > > > > time.(you can see it in the next patch of this series).
> > > > >
> > > > > I am not talking about racing here.
> > > > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("xxx")
> > > > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > > > local rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED so it
> > > > > assigns new dev ("xxx") to port N.
> > > > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > > > >rte_eth_dev_allocate("yyy").
> > > > > From its perspective port N is still free:
> > > > > rte_eth_devices[N].state == RTE_ETH_DEV_UNUSED, so it will
> > > > > assign new dev ("yyy") to the same
> > > port.
> > > > >
> > > >
> > > > Yes, you are right, this is a problem (not actually related to port
> > > > ownership)
> > >
> > > Yep that's true - it was there before your patches.
> > >
> > > > but look:
> > > > As I understand, secondary processes are not allowed to create
> > > > ports and must use the attach_secondary API, but there is no
> > > > hardcoded check which prevents them from doing it.
> > >
> > > Secondary processes have the ability to allocate their own vdevs and
> > > probably it should stay like that.
> > > I just thought it is a good opportunity to fix it while you are on
> > > these changes anyway, but ok we can leave it for now.
> > >
> > It looks like the fix would break the ABI (moving the state to shared
> > memory), so let's try to fix it in the next version :)
> 
> Not necessarily - I think we can just add a check inside
> rte_eth_dev_find_free_port() that rte_eth_dev_data[port_id].name is an
> empty string.

Good idea, I will add it (actually the first patch in this series allows it).

Thanks.


> Konstantin
> 
> 
> >
> > > Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-17 17:01                                   ` Ananyev, Konstantin
  2018-01-18 13:10                                     ` Neil Horman
@ 2018-01-18 16:27                                     ` Neil Horman
  1 sibling, 0 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-18 16:27 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing, dev,
	Richardson, Bruce

On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Wednesday, January 17, 2018 2:00 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > 
> > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > >
> > > Hi Konstantin
> > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24 PM
> > > > Hi Matan,
> > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44 PM
> > > > > > > > Hi Matan,
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 1:45 PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12, 2018 2:02
> > > > > > > > > > > AM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January 11, 2018
> > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday, January 10,
> > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > >  <snip>
> > > > > > > > > > > > > > > > It is good to see that now scanning/updating
> > > > > > > > > > > > > > > > rte_eth_dev_data[] is lock protected, but it
> > > > > > > > > > > > > > > > might be not very plausible to protect both
> > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > same lock.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > The next_owner_id is read by ownership APIs(for
> > > > > > > > > > > > > > > owner validation), so it
> > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Well to me next_owner_id and rte_eth_dev_data[] are
> > > > > > > > > > > > > > not directly
> > > > > > > > > > > > related.
> > > > > > > > > > > > > > You may create new owner_id but it doesn't mean you
> > > > > > > > > > > > > > would update rte_eth_dev_data[] immediately.
> > > > > > > > > > > > > > And visa-versa - you might just want to update
> > > > > > > > > > > > > > rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > It is not very good coding practice to use same lock
> > > > > > > > > > > > > > for non-related data structures.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > Since the ownership mechanism synchronization is in
> > > > > > > > > > > > > ethdev responsibility, we must protect against user
> > > > > > > > > > > > > mistakes as much as we can by
> > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > So, if user try to set by invalid owner (exactly the
> > > > > > > > > > > > > ID which currently is
> > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > >
> > > > > > > > > > > > Hmm, not sure why you can't do same checking with
> > > > > > > > > > > > different lock or atomic variable?
> > > > > > > > > > > >
> > > > > > > > > > > The set ownership API is protected by ownership lock and
> > > > > > > > > > > checks the owner ID validity By reading the next owner ID.
> > > > > > > > > > > So, the owner ID allocation and set API should use the
> > > > > > > > > > > same atomic
> > > > > > > > > > mechanism.
> > > > > > > > > >
> > > > > > > > > > Sure but all you are doing for checking validity, is  check
> > > > > > > > > > that owner_id > 0 &&& owner_id < next_ownwe_id, right?
> > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits) you can
> > > > > > > > > > safely do same check with just atomic_get(&next_owner_id).
> > > > > > > > > >
> > > > > > > > > It will not protect it, scenario:
> > > > > > > > > - current next_id is X.
> > > > > > > > > - call set ownership of port A with owner id X by thread 0(by
> > > > > > > > > user
> > > > > > mistake).
> > > > > > > > > - context switch
> > > > > > > > > - allocate new id by thread 1 and get X and change next_id to
> > > > > > > > > X+1
> > > > > > > > atomically.
> > > > > > > > > -  context switch
> > > > > > > > > - Thread 0 validate X by atomic_read and succeed to take
> > > > ownership.
> > > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > > entities) -
> > > > > > crash.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > >
> > > > > > > The owner set API validation by thread 0 should fail because the
> > > > > > > owner
> > > > > > validation is included in the protected section.
> > > > > >
> > > > > > Then your validation function would fail even if you'll use atomic
> > > > > > ops instead of lock.
> > > > > No.
> > > > > With atomic this specific scenario will cause the validation to pass.
> > > >
> > > > Can you explain to me how?
> > > >
> > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > >               int32_t cur_owner_id = RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > UINT16_MAX);
> > > >
> > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > cur_owner_id) {
> > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > 		return 0;
> > > > 	}
> > > > 	return 1;
> > > > }
> > > >
> > > > Let say your next_owne_id==X, and you invoke
> > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > >
> > > Explanation:
> > > The scenario with locks:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and stuck in the lock.
> > > Context switch.
> > > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and return failure to the user.
> > > Context switch.
> > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > Everything is OK!
> > >
> > > The same scenario with atomics:
> > > next_owner_id = X.
> > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > Context switch.
> > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > Context switch.
> > > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock the lock and return success to the  user.
> > > Problem!
> > >
> > 
> > 
> > Matan is correct here, there is no way to perform parallel set operations using
> > just an atomic variable here, because multiple reads of next_owner_id need to
> > be performed while it is stable.  That is to say rte_eth_next_owner_id must be
> > compared to RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.  If
> > you were to only use an atomic_read on such a variable, it could be incremented
> > by the owner_new function between the checks and an invalid owner value could
> > become valid because  a third thread incremented the next value.  The state of
> > next_owner_id must be kept stable during any validity checks
> 
> It could still be incremented between the checks - if let say different thread will
> invoke new_onwer_id, grab the lock update counter, release the lock - all that
> before the check.
Yes, as I mentioned previously, that's an artifact of this implementation, and
arguably ok, because the state of next is still kept steady during the check
process.  There's no guarantee that, once you call new, you will be able to take
ownership. The result of the set operation determines that.  If you want to
ensure that you claim ownership on set, then you need to make the allocation of
an owner object atomic with its acquisition of the port, the way my proposed API
below does.
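
To make the point about keeping next stable during the check concrete, here
is a minimal, self-contained sketch. It is not the patch code: owner_new(),
owner_set(), NO_OWNER, MAX_PORTS and the C11 atomic_flag spinlock are
stand-ins chosen for illustration (the patch uses rte_spinlock_t and the
rte_eth_dev_owner_* functions). What it shows is only that the id counter is
incremented and read under the same lock, so the validity check and the
store in owner_set() see one consistent value of next_owner_id.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER  0u	/* plays the role of RTE_ETH_DEV_NO_OWNER */
#define MAX_PORTS 4u	/* hypothetical port count for the sketch */

/* One lock covers both the id counter and the per-port owner ids. */
static atomic_flag ownership_lock = ATOMIC_FLAG_INIT;
static uint64_t next_owner_id = NO_OWNER + 1;
static uint64_t port_owner[MAX_PORTS];

static void
lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&ownership_lock,
						 memory_order_acquire))
		; /* spin until the flag is clear */
}

static void
lock_release(void)
{
	atomic_flag_clear_explicit(&ownership_lock, memory_order_release);
}

/* Allocate a fresh owner id. */
uint64_t
owner_new(void)
{
	uint64_t id;

	lock_take();
	id = next_owner_id++;
	lock_release();
	return id;
}

/* Claim an ownerless port; returns 0 on success, -1 on failure. */
int
owner_set(unsigned int port, uint64_t id)
{
	int ret = -1;

	if (port >= MAX_PORTS)
		return -1;

	lock_take();
	/* next_owner_id cannot move while the id is validated and stored. */
	if (id != NO_OWNER && id < next_owner_id &&
	    port_owner[port] == NO_OWNER) {
		port_owner[port] = id;
		ret = 0;
	}
	lock_release();
	return ret;
}
```

As in the discussion, an id that has not been handed out yet is rejected,
and calling owner_new() does not by itself guarantee that a later
owner_set() succeeds: the result of the set decides that.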

> But ok, there is probably no point to argue on that one any longer -
> let's keep the lock here, nothing will be broken with it for sure.
> 
Agree.

> > 
> > That said, I really have to wonder why ownership ids are really needed here at
> > all.  It seems this design could be much simpler with the addition of a per-port
> > lock (and optional ownership record).  The API could consist of four
> > operations:
> > 
> > ownership_set
> > ownership_tryset
> > ownership_release
> > ownership_get
> > 
> 
> Ok, but how to distinguish who is the current owner of the port?
> To make sure that only owner is allowed to perform control ops?
> Konstantin
> 
As I said above, if you want to have an ownership record, there's no reason you
can't (that's what ownership_get is intended to return to you).  Perhaps a better
API would be an is_owner(owner_record) call, which can atomically compare a
passed in owner record with the current ownership and return true/false if they
match

Neil
> > 
> > The first call simply tries to take the per-port lock (blocking if its already
> > locked)
> > 
> > The second call is a non-blocking version of the first
> > 
> > The third unlocks the port, allowing others to take ownership
> > 
> > The fourth returns whatever ownership record you want to encode with the lock.
> > 
> > The addition of all this id checking seems a bit overcomplicated
> > 
> > Neil
> 
> 
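
A minimal sketch of the per-port alternative described in this message may
help when comparing the two designs. The names ownership_set,
ownership_tryset, ownership_release, ownership_get and is_owner come from
the proposal; everything else is an assumption made for illustration: the
owner record is a plain uint64_t, PORT_UNOWNED and MAX_PORTS are
hypothetical, the return types are invented, release is restricted to the
current holder, and C11 atomics stand in for whatever lock a real
implementation would use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PORT_UNOWNED 0u	/* hypothetical "no owner" record */
#define MAX_PORTS    4u	/* hypothetical port count */

/* One atomic word per port acts as both the lock and the owner record. */
static _Atomic uint64_t port_owner[MAX_PORTS];

/* Non-blocking: succeeds only if the port is currently unowned. */
bool
ownership_tryset(unsigned int port, uint64_t record)
{
	uint64_t expected = PORT_UNOWNED;

	if (port >= MAX_PORTS || record == PORT_UNOWNED)
		return false;
	return atomic_compare_exchange_strong(&port_owner[port],
					      &expected, record);
}

/* Blocking: spins until the port can be taken; -1 on bad arguments. */
int
ownership_set(unsigned int port, uint64_t record)
{
	if (port >= MAX_PORTS || record == PORT_UNOWNED)
		return -1;
	while (!ownership_tryset(port, record))
		; /* wait for the current owner to release */
	return 0;
}

/* Release the port; only the current holder succeeds (an assumption). */
bool
ownership_release(unsigned int port, uint64_t record)
{
	uint64_t expected = record;

	if (port >= MAX_PORTS || record == PORT_UNOWNED)
		return false;
	return atomic_compare_exchange_strong(&port_owner[port],
					      &expected, PORT_UNOWNED);
}

/* Return the current owner record, PORT_UNOWNED if there is none. */
uint64_t
ownership_get(unsigned int port)
{
	if (port >= MAX_PORTS)
		return PORT_UNOWNED;
	return atomic_load(&port_owner[port]);
}

/* Compare a caller-supplied record with the current owner. */
bool
is_owner(unsigned int port, uint64_t record)
{
	return record != PORT_UNOWNED && ownership_get(port) == record;
}
```

In this shape there is no separate id allocation step, so taking a port
and recording its owner happen in one atomic operation.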

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization
  2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
                     ` (5 preceding siblings ...)
  2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-18 16:35   ` Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
                       ` (8 more replies)
  6 siblings, 9 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Add an ownership mechanism to DPDK Ethernet devices so that a device is not managed by several DPDK entities at once.
The port ownership mechanism is a good opportunity to redefine the synchronization rules in ethdev:

	1. The port allocation and port release synchronization will be managed by ethdev.
	2. The port usage synchronization will be managed by the port owner.
	3. The port ownership synchronization will be managed by ethdev.
	4. A DPDK entity which wants to use a port safely must take ownership of it first.


V2:  
Synchronize ethdev port creation.
Synchronize port ownership mechanism.
Rename owner remove API to rte_eth_dev_owner_unset.
Remove "ethdev: free a port by a dedicated API" patch - passed to another series.
Add "ethdev: fix port data reset timing" patch.
Change owner get API to return int value and to pass copy of the owner structure.
Adjust testpmd to the improved owner get API.
Adjust documentation.

V3:
Change RTE_ETH_FOREACH_DEV iterator to skip owned ports(Gaetan suggestion).
Prevent goto in set\unset APIs by adding internal API - this also adds reuse of code(Konstantin suggestion).
Group all the shared processes variables in one struct to allow easy allocation of it(Konstantin suggestion).
Take owner name truncation as warning and not as error(Konstantin suggestion).
Mark the new APIs as EXPERIMENTAL.
Rebase on top of master_net_mlx.
Rebase on top of "[PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack" series.
Rebase on top of "[PATCH v5 0/8] Introduce virtual driver for Hyper-V/Azure platforms" series.
Add "ethdev: fix used portid allocation" patch suggested by Konstantin.



Matan Azrad (7):
  ethdev: fix port data reset timing
  ethdev: fix used portid allocation
  ethdev: add port ownership
  ethdev: synchronize port allocation
  net/failsafe: free an eth port by a dedicated API
  net/failsafe: use ownership mechanism to own ports
  app/testpmd: adjust ethdev port ownership

 app/test-pmd/cmdline.c                  |  89 +++++-------
 app/test-pmd/cmdline_flow.c             |   2 +-
 app/test-pmd/config.c                   |  37 ++---
 app/test-pmd/parameters.c               |   4 +-
 app/test-pmd/testpmd.c                  |  63 ++++++---
 app/test-pmd/testpmd.h                  |   3 +
 doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
 drivers/net/failsafe/failsafe.c         |   7 +
 drivers/net/failsafe/failsafe_eal.c     |   6 +
 drivers/net/failsafe/failsafe_ether.c   |   2 +-
 drivers/net/failsafe/failsafe_private.h |   2 +
 lib/librte_ether/rte_ethdev.c           | 241 +++++++++++++++++++++++++++-----
 lib/librte_ether/rte_ethdev.h           | 115 ++++++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   6 +
 14 files changed, 454 insertions(+), 137 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 17:00       ` Thomas Monjalon
                         ` (2 more replies)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation Matan Azrad
                       ` (7 subsequent siblings)
  8 siblings, 3 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

An rte_eth_dev_data structure is allocated per ethdev port and is used
internally to hold the data of the port.

rte_eth_dev_attach_secondary looks up the port identifier by comparing
the rte_eth_dev_data name field, so it may return the identifier of an
invalid port if the primary process has already released that port,
because the port release API does not reset the port data.

It is therefore better to reset the port data at release time rather
than at allocation time.

Move the port data reset to the port release API.
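
The effect can be sketched outside of DPDK as follows. This is a
simplified stand-in, not the ethdev code: struct port_data, shared_data,
port_allocate(), port_release() and find_port_by_name() are hypothetical
names, and a static array replaces the shared memzone. It only shows why a
name-based lookup stops returning a released port once the release path
clears the data.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4
#define NAME_LEN  32

struct port_data {
	char name[NAME_LEN];
	unsigned int port_id;
};

/* Stand-in for the port data shared between processes. */
static struct port_data shared_data[MAX_PORTS];

/* Name-based lookup, as done when a secondary process attaches. */
unsigned int
find_port_by_name(const char *name)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; i++)
		if (strcmp(shared_data[i].name, name) == 0)
			return i;
	return MAX_PORTS; /* not found */
}

void
port_allocate(unsigned int port_id, const char *name)
{
	if (port_id >= MAX_PORTS)
		return;
	snprintf(shared_data[port_id].name, NAME_LEN, "%s", name);
	shared_data[port_id].port_id = port_id;
}

/* Reset at release time, so no stale name survives the port. */
void
port_release(unsigned int port_id)
{
	if (port_id >= MAX_PORTS)
		return;
	memset(&shared_data[port_id], 0, sizeof(shared_data[port_id]));
}
```

Without the memset in port_release(), the lookup would keep matching the
old name until the slot happened to be allocated again.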

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7044159..156231c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -204,7 +204,6 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
 	eth_dev = eth_dev_get(port_id);
 	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
 	eth_dev->data->port_id = port_id;
@@ -252,6 +251,7 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 17:00       ` Thomas Monjalon
  2018-01-19 12:40       ` Ananyev, Konstantin
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership Matan Azrad
                       ` (6 subsequent siblings)
  8 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

rte_eth_dev_find_free_port() finds a free port by checking the port state.
The state field is in process-local memory, so other DPDK processes
may get the same port ID because their local states may differ.

Replace the state check with a check of the ethdev port name, which is
kept in the shared port data: if the name is an empty string, the port
ID is detected as unused.
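
A small sketch of the difference, under stated assumptions: struct
process_view models the per-process state array, shared_name models the
name field in shared memory, and all function names are hypothetical.
It is meant only to show why two processes can pick the same ID when they
consult local state, and why the shared name avoids that.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4
#define NAME_LEN  32

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED };

/* Stand-in for the name field kept in memory shared by all processes. */
static char shared_name[MAX_PORTS][NAME_LEN];

/* Each process has its own copy of the state array. */
struct process_view {
	enum port_state state[MAX_PORTS];
};

/* Old behaviour: only the caller's local state is consulted. */
unsigned int
find_free_by_state(const struct process_view *p)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; i++)
		if (p->state[i] == PORT_UNUSED)
			return i;
	return MAX_PORTS;
}

/* New behaviour: an empty shared name marks the ID as unused. */
unsigned int
find_free_by_name(void)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; i++)
		if (shared_name[i][0] == '\0')
			return i;
	return MAX_PORTS;
}

/* Record a port as used in both the shared name and the local state. */
void
claim_port(struct process_view *p, unsigned int id, const char *name)
{
	if (id >= MAX_PORTS)
		return;
	snprintf(shared_name[id], NAME_LEN, "%s", name);
	p->state[id] = PORT_ATTACHED;
}
```

After one process claims ID 0, a second process that has not attached the
port still sees ID 0 as free through its local state, while the name-based
scan moves on to ID 1.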

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 156231c..5d87f72 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -164,7 +164,7 @@ struct rte_eth_dev *
 	unsigned i;
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
+		if (rte_eth_dev_share_data->data[i].name[0] == '\0')
 			return i;
 	}
 	return RTE_MAX_ETHPORTS;
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 21:11       ` Thomas Monjalon
  2018-01-19 12:41       ` Ananyev, Konstantin
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation Matan Azrad
                       ` (5 subsequent siblings)
  8 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

The ownership of a port is implicit in DPDK.
Making it explicit is better for the following reasons:
1. It clearly defines who is in charge of the port usage synchronization.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also in the fail-safe case, an issue has been met in testpmd.
We need to check that the application is not trying to use a port which
is already managed by fail-safe.

A port owner is built from an owner id (number) and an owner name (string).
The owner id must be unique, to distinguish between two identical entity
instances; the owner name can be any name.
The name helps DPDK entities to recognize the owner logically and
eases debugging.
Each DPDK entity can allocate a unique owner identifier and use it,
with its preferred name, to own valid ethdev ports.
Each DPDK entity can get the owner status of any port to decide whether
it can manage the port.

The mechanism is synchronized for both the primary process threads and
the secondary process threads, to allow a secondary process entity to be
a port owner.

Add a synchronized ownership mechanism to DPDK Ethernet devices to
avoid multiple management of a device by different DPDK entities.

The current ethdev internal port management is not affected by this
feature.
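
The owner-filtered iteration that this patch introduces can be sketched
on its own. The following is a simplified model, not the ethdev code:
struct port, ports, find_next_owned_by(), FOREACH_PORT_OWNED_BY and
count_owned_by() are hypothetical stand-ins for rte_eth_devices,
rte_eth_find_next_owned_by() and RTE_ETH_FOREACH_DEV_OWNED_BY, and a
single "attached" flag replaces the device state checks. Passing NO_OWNER
iterates over ownerless ports, which is how the reworked
RTE_ETH_FOREACH_DEV skips owned ports.

```c
#include <stdint.h>

#define MAX_PORTS 8
#define NO_OWNER  0u	/* plays the role of RTE_ETH_DEV_NO_OWNER */

struct port {
	int attached;		/* nonzero when the port is in use */
	uint64_t owner_id;	/* NO_OWNER when nobody owns the port */
};

static struct port ports[MAX_PORTS];

/* Return the first attached port >= port_id owned by owner_id. */
uint16_t
find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (!ports[port_id].attached ||
		ports[port_id].owner_id != owner_id))
		port_id++;

	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

/* Visit every attached port owned by o, storing each id in p. */
#define FOREACH_PORT_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (unsigned int)(p) < (unsigned int)MAX_PORTS; \
	     (p) = find_next_owned_by((uint16_t)((p) + 1), (o)))

/* Example use of the iterator: count the ports of one owner. */
unsigned int
count_owned_by(uint64_t owner_id)
{
	unsigned int n = 0;
	uint16_t p;

	FOREACH_PORT_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

An entity that owns ports would iterate with its own id, while generic
code iterating with NO_OWNER no longer sees ports that something else has
claimed.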

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
 lib/librte_ether/rte_ethdev.c           | 200 ++++++++++++++++++++++++++++----
 lib/librte_ether/rte_ethdev.h           | 115 +++++++++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   6 +
 4 files changed, 305 insertions(+), 30 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index d1d4b1c..d513ee3 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,8 +156,8 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
----------------------------------------
+Device Identification, Ownership and Configuration
+--------------------------------------------------
 
 Device Identification
 ~~~~~~~~~~~~~~~~~~~~~
@@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+The Ethernet device ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
+The ownership mechanism is controlled by ethdev APIs and allows DPDK entities to set, remove and get a port owner.
+This prevents management of the same Ethernet port by several entities.
+
+.. note::
+
+    It is the responsibility of the DPDK entity to set the port owner before using the port and to manage the port usage synchronization between different threads or processes.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 5d87f72..b740370 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -41,7 +41,6 @@
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
-static struct rte_eth_dev_data *rte_eth_dev_data;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -59,6 +58,13 @@ struct rte_eth_xstats_name_off {
 	unsigned offset;
 };
 
+/* Shared memory between primary and secondary processes. */
+static struct {
+	uint64_t next_owner_id;
+	rte_spinlock_t ownership_lock;
+	struct rte_eth_dev_data data[RTE_MAX_ETHPORTS];
+} *rte_eth_dev_share_data;
+
 static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_good_packets", offsetof(struct rte_eth_stats, ipackets)},
 	{"tx_good_packets", offsetof(struct rte_eth_stats, opackets)},
@@ -125,24 +131,29 @@ enum {
 }
 
 static void
-rte_eth_dev_data_alloc(void)
+rte_eth_dev_share_data_alloc(void)
 {
 	const unsigned flags = 0;
 	const struct rte_memzone *mz;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* Allocate shared memory for port data and ownership. */
 		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
-				rte_socket_id(), flags);
+					 sizeof(*rte_eth_dev_share_data),
+					 rte_socket_id(), flags);
 	} else
 		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
 	if (mz == NULL)
 		rte_panic("Cannot allocate memzone for ethernet port data\n");
 
-	rte_eth_dev_data = mz->addr;
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		memset(rte_eth_dev_data, 0,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
+	rte_eth_dev_share_data = mz->addr;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		rte_eth_dev_share_data->next_owner_id =
+				RTE_ETH_DEV_NO_OWNER + 1;
+		rte_spinlock_init(&rte_eth_dev_share_data->ownership_lock);
+		memset(rte_eth_dev_share_data->data, 0,
+		       sizeof(rte_eth_dev_share_data->data));
+	}
 }
 
 struct rte_eth_dev *
@@ -175,7 +186,7 @@ struct rte_eth_dev *
 {
 	struct rte_eth_dev *eth_dev = &rte_eth_devices[port_id];
 
-	eth_dev->data = &rte_eth_dev_data[port_id];
+	eth_dev->data = &rte_eth_dev_share_data->data[port_id];
 	eth_dev->state = RTE_ETH_DEV_ATTACHED;
 
 	eth_dev_last_created_port = port_id;
@@ -195,8 +206,8 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	if (rte_eth_dev_share_data == NULL)
+		rte_eth_dev_share_data_alloc();
 
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
@@ -225,11 +236,11 @@ struct rte_eth_dev *
 	uint16_t i;
 	struct rte_eth_dev *eth_dev;
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	if (rte_eth_dev_share_data == NULL)
+		rte_eth_dev_share_data_alloc();
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
+		if (strcmp(rte_eth_dev_share_data->data[i].name, name) == 0)
 			break;
 	}
 	if (i == RTE_MAX_ETHPORTS) {
@@ -251,9 +262,14 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
-	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
 
 	return 0;
@@ -269,6 +285,144 @@ struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint64_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    rte_eth_dev_share_data->next_owner_id <= owner_id) {
+		RTE_LOG(ERR, EAL, "Invalid owner_id=%016lX.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint64_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       ((rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED) ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint64_t *owner_id)
+{
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
+	*owner_id = rte_eth_dev_share_data->next_owner_id++;
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+	return 0;
+}
+
+static int
+_rte_eth_dev_owner_set(const uint16_t port_id, const uint64_t old_owner_id,
+		       const struct rte_eth_dev_owner *new_owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int sret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	if (!rte_eth_is_valid_owner_id(new_owner->id) &&
+	    !rte_eth_is_valid_owner_id(old_owner_id))
+		return -EINVAL;
+
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != old_owner_id) {
+		RTE_LOG(ERR, EAL, "Cannot set owner to port %d already owned"
+			" by %s_%016lX.\n", port_id, port_owner->name,
+			port_owner->id);
+		return -EPERM;
+	}
+
+	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+			new_owner->name);
+	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN)
+		RTE_LOG(WARNING, EAL, "Port %d owner name was truncated.\n",
+			port_id);
+
+	port_owner->id = new_owner->id;
+
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%016lX.\n", port_id,
+			    new_owner->name, new_owner->id);
+
+	return 0;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	int ret;
+
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, RTE_ETH_DEV_NO_OWNER, owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+	return ret;
+}
+
+int
+rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id)
+{
+	const struct rte_eth_dev_owner new_owner = (struct rte_eth_dev_owner)
+			{.id = RTE_ETH_DEV_NO_OWNER, .name = ""};
+	int ret;
+
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, owner_id, &new_owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+	return ret;
+}
+
+void
+rte_eth_dev_owner_delete(const uint64_t owner_id)
+{
+	uint16_t port_id;
+
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
+	if (rte_eth_is_valid_owner_id(owner_id)) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
+			memset(&rte_eth_devices[port_id].data->owner, 0,
+			       sizeof(struct rte_eth_dev_owner));
+		RTE_PMD_DEBUG_TRACE("All ports owned by the %016lX identifier"
+				    " have been released.\n", owner_id);
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+}
+
+int
+rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
+{
+	int ret = 0;
+
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+	} else {
+		rte_memcpy(owner, &rte_eth_devices[port_id].data->owner,
+			   sizeof(*owner));
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+	return ret;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
@@ -311,7 +465,7 @@ struct rte_eth_dev *
 
 	/* shouldn't check 'rte_eth_devices[i].data',
 	 * because it might be overwritten by VDEV PMD */
-	tmp = rte_eth_dev_data[port_id].name;
+	tmp = rte_eth_dev_share_data->data[port_id].name;
 	strcpy(name, tmp);
 	return 0;
 }
@@ -319,22 +473,22 @@ struct rte_eth_dev *
 int
 rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id)
 {
-	int i;
+	uint32_t pid;
 
 	if (name == NULL) {
 		RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
 		return -EINVAL;
 	}
 
-	RTE_ETH_FOREACH_DEV(i) {
-		if (!strncmp(name,
-			rte_eth_dev_data[i].name, strlen(name))) {
-
-			*port_id = i;
-
+	for (pid = 0; pid < RTE_MAX_ETHPORTS; pid++) {
+		if (rte_eth_devices[pid].state != RTE_ETH_DEV_UNUSED &&
+		    !strncmp(name, rte_eth_dev_share_data->data[pid].name,
+			     strlen(name))) {
+			*port_id = pid;
 			return 0;
 		}
 	}
+
 	return -ENODEV;
 }
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index cf4defb..f021139 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1739,6 +1739,15 @@ struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint64_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /**
  * @internal
  * The data part, with no function pointers, associated with each ethernet device.
@@ -1789,6 +1798,7 @@ struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /** Device supports link state interrupt */
@@ -1806,6 +1816,30 @@ struct rte_eth_dev_data {
 extern struct rte_eth_dev rte_eth_devices[];
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint64_t rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
  * Iterates over valid ethdev ports.
  *
  * @param port_id
@@ -1816,13 +1850,84 @@ struct rte_eth_dev_data {
 uint16_t rte_eth_find_next(uint16_t port_id);
 
 /**
- * Macro to iterate over all enabled ethdev ports.
+ * Macro to iterate over all enabled and ownerless ethdev ports.
+ */
+#define RTE_ETH_FOREACH_DEV(p) \
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, RTE_ETH_DEV_NO_OWNER)
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get a new unique owner identifier.
+ * An owner identifier lets a single DPDK entity own Ethernet devices,
+ * to avoid multiple management of a device by different entities.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
  */
-#define RTE_ETH_FOREACH_DEV(p)					\
-	for (p = rte_eth_find_next(0);				\
-	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS;	\
-	     p = rte_eth_find_next(p + 1))
+int rte_eth_dev_owner_new(uint64_t *owner_id);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Unset Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @param	owner
+ *  The owner structure pointer to fill.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_get(const uint16_t port_id,
+			  struct rte_eth_dev_owner *owner);
 
 /**
  * Get the total number of Ethernet devices that have been successfully
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index 88b7908..545f7a6 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -202,6 +202,12 @@ EXPERIMENTAL {
 	global:
 
 	rte_eth_dev_is_removed;
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_unset;
+	rte_eth_find_next_owned_by;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (2 preceding siblings ...)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 20:43       ` Thomas Monjalon
  2018-01-19 12:47       ` Ananyev, Konstantin
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
                       ` (4 subsequent siblings)
  8 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Ethernet port allocation was not thread safe: two threads that tried
to allocate a new port at the same time might get the same port
identifier and overwrite each other's memory.
In fact, none of the port configuration was thread safe from the ethdev
point of view.

The port ownership mechanism added to ethdev is a good opportunity to
redefine the synchronization rules in ethdev:

1. The port allocation and port release synchronization will be
   managed by ethdev.
2. The port usage synchronization will be managed by the port owner.
3. The port ownership synchronization will be managed by ethdev.

Add port allocation synchronization to complete the new rules.
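
As an illustration only, the allocation rule above can be sketched in
plain C. This is a simplified model, not the ethdev code: a pthread
mutex stands in for the shared-data spinlock, and MAX_PORTS, struct
port, the shared table and port_allocate() are hypothetical names
invented for this sketch. The point it shows is that the duplicate-name
check and the free-slot search happen under one lock, with a single
unlock path.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN  32  /* hypothetical name buffer size */

struct port {
	int used;
	char name[NAME_LEN];
};

/* Hypothetical stand-in for the shared data guarded by one lock. */
static struct {
	pthread_mutex_t lock;
	struct port ports[MAX_PORTS];
} shared = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Return the new port id, or -1 if the name is taken or the table is full. */
int port_allocate(const char *name)
{
	int id = -1;
	int i;

	assert(name != NULL);
	pthread_mutex_lock(&shared.lock);

	/* Reject a second allocation with the same name. */
	for (i = 0; i < MAX_PORTS; i++) {
		if (shared.ports[i].used &&
		    strcmp(shared.ports[i].name, name) == 0)
			goto unlock;
	}

	/* Claim the first free slot while still holding the lock. */
	for (i = 0; i < MAX_PORTS; i++) {
		if (!shared.ports[i].used) {
			shared.ports[i].used = 1;
			snprintf(shared.ports[i].name, NAME_LEN, "%s", name);
			id = i;
			break;
		}
	}

unlock:
	pthread_mutex_unlock(&shared.lock);
	return id;
}
```

As in the patch, anything that may call back into user code (the
RTE_ETH_EVENT_NEW callback there) would run only after the lock has
been released.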

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 lib/librte_ether/rte_ethdev.c | 43 +++++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index b740370..b75cbb2 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -52,6 +52,9 @@
 /* spinlock for add/remove tx callbacks */
 static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for shared data allocation */
+static rte_spinlock_t rte_eth_share_data_alloc = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -198,21 +201,27 @@ struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
 	uint16_t port_id;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	/* Synchronize share data one time allocation between local threads. */
+	rte_spinlock_lock(&rte_eth_share_data_alloc);
+	if (rte_eth_dev_share_data == NULL)
+		rte_eth_dev_share_data_alloc();
+	rte_spinlock_unlock(&rte_eth_share_data_alloc);
+
+	/* Synchronize port creation between primary and secondary threads. */
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
 
 	port_id = rte_eth_dev_find_free_port();
 	if (port_id == RTE_MAX_ETHPORTS) {
 		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
-		return NULL;
+		goto unlock;
 	}
 
-	if (rte_eth_dev_share_data == NULL)
-		rte_eth_dev_share_data_alloc();
-
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
 				name);
-		return NULL;
+		goto unlock;
 	}
 
 	eth_dev = eth_dev_get(port_id);
@@ -220,7 +229,11 @@ struct rte_eth_dev *
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = ETHER_MTU;
 
-	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
+unlock:
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
+
+	if (eth_dev != NULL)
+		_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
 
 	return eth_dev;
 }
@@ -234,10 +247,16 @@ struct rte_eth_dev *
 rte_eth_dev_attach_secondary(const char *name)
 {
 	uint16_t i;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
 
+	/* Synchronize share data one time attachment between local threads. */
+	rte_spinlock_lock(&rte_eth_share_data_alloc);
 	if (rte_eth_dev_share_data == NULL)
 		rte_eth_dev_share_data_alloc();
+	rte_spinlock_unlock(&rte_eth_share_data_alloc);
+
+	/* Synchronize port attachment to primary port creation and release. */
+	rte_spinlock_lock(&rte_eth_dev_share_data->ownership_lock);
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		if (strcmp(rte_eth_dev_share_data->data[i].name, name) == 0)
@@ -247,12 +266,12 @@ struct rte_eth_dev *
 		RTE_PMD_DEBUG_TRACE(
 			"device %s is not driven by the primary process\n",
 			name);
-		return NULL;
+	} else {
+		eth_dev = eth_dev_get(i);
+		RTE_ASSERT(eth_dev->data->port_id == i);
 	}
 
-	eth_dev = eth_dev_get(i);
-	RTE_ASSERT(eth_dev->data->port_id == i);
-
+	rte_spinlock_unlock(&rte_eth_dev_share_data->ownership_lock);
 	return eth_dev;
 }
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 5/7] net/failsafe: free an eth port by a dedicated API
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (3 preceding siblings ...)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
                       ` (3 subsequent siblings)
  8 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Call the dedicated ethdev API to free a port at removal time, as is
already done elsewhere in fail-safe.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..e9b0cfe 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -297,7 +297,7 @@
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {
-			ETH(sdev)->state = RTE_ETH_DEV_UNUSED;
+			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 6/7] net/failsafe: use ownership mechanism to own ports
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (4 preceding siblings ...)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
                       ` (2 subsequent siblings)
  8 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Fail-safe PMD sub-device management is based on the ethdev port
mechanism, so the sub-device management structures are exposed to other
DPDK entities, which may use them in parallel with the fail-safe PMD.

Use the new port ownership mechanism to prevent fail-safe PMD
sub-devices from being managed by more than one entity.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe.c         | 7 +++++++
 drivers/net/failsafe/failsafe_eal.c     | 6 ++++++
 drivers/net/failsafe/failsafe_private.h | 2 ++
 3 files changed, 15 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index b767352..a1e1c7a 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -196,6 +196,13 @@
 	ret = failsafe_args_parse(dev, params);
 	if (ret)
 		goto free_subs;
+	ret = rte_eth_dev_owner_new(&priv->my_owner.id);
+	if (ret) {
+		ERROR("Failed to get unique owner identifier");
+		goto free_args;
+	}
+	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
+		 FAILSAFE_OWNER_NAME);
 	ret = failsafe_eal_init(dev);
 	if (ret)
 		goto free_args;
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 33a5adf..5f3da06 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -106,6 +106,12 @@
 			INFO("Taking control of a probed sub device"
 			      " %d named %s", i, da->name);
 		}
+		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
+		if (ret) {
+			INFO("sub_device %d owner set failed (%s),"
+			     " will try again later", i, strerror(-ret));
+			continue;
+		}
 		ETH(sdev) = &rte_eth_devices[pid];
 		SUB_ID(sdev) = i;
 		sdev->fs_dev = dev;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 4916365..b377046 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -42,6 +42,7 @@
 #include <rte_devargs.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
+#define FAILSAFE_OWNER_NAME "Fail-safe"
 
 #define PMD_FAILSAFE_MAC_KVARG "mac"
 #define PMD_FAILSAFE_HOTPLUG_POLL_KVARG "hotplug_poll"
@@ -145,6 +146,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_eth_dev_owner my_owner; /* Unique owner. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (5 preceding siblings ...)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2018-01-18 16:35     ` Matan Azrad
  2018-01-19 12:37       ` Ananyev, Konstantin
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
  2018-01-25 14:35     ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Ferruh Yigit
  8 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 16:35 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Testpmd should not use ethdev ports that are managed by other DPDK
entities.

Make Testpmd the owner of every port that is not used by another entity,
and prevent any use of ethdev ports that Testpmd does not own.
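
The iteration and validity check this relies on can be modelled in a
few lines of C. This is a hedged sketch, not the testpmd or ethdev
code: find_next_owned_by(), FOREACH_OWNED_BY, port_is_mine(),
count_owned() and the port_owner table are hypothetical names, and the
model ignores whether a port is attached at all. It shows an
owner-filtered loop that skips ports held by someone else, and a
per-port check that treats a foreign port as invalid.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

/* Hypothetical per-port owner table; 0 means "no owner". */
static uint64_t port_owner[MAX_PORTS];

/* Return the first port >= port_id owned by owner_id, or MAX_PORTS. */
uint16_t find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < MAX_PORTS && port_owner[port_id] != owner_id)
		port_id++;
	return port_id;
}

/* Visit only the ports held by owner o. */
#define FOREACH_OWNED_BY(p, o) \
	for (p = find_next_owned_by(0, o); p < MAX_PORTS; \
	     p = find_next_owned_by((uint16_t)(p + 1), o))

/* Non-zero only if the port exists in the table and belongs to owner_id. */
int port_is_mine(uint16_t port_id, uint64_t owner_id)
{
	assert(owner_id != 0);
	return port_id < MAX_PORTS && port_owner[port_id] == owner_id;
}

/* Count the ports held by owner_id using the filtered loop. */
int count_owned(uint64_t owner_id)
{
	uint16_t p;
	int n = 0;

	FOREACH_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

Counting ports through the filtered loop, rather than asking for the
global device count, matches the way the patch keeps nb_ports limited
to the ports Testpmd actually owns.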

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
 app/test-pmd/cmdline_flow.c |  2 +-
 app/test-pmd/config.c       | 37 ++++++++++---------
 app/test-pmd/parameters.c   |  4 +-
 app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
 app/test-pmd/testpmd.h      |  3 ++
 6 files changed, 103 insertions(+), 95 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 31919ba..6199c64 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
 			&link_speed) < 0)
 		return;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		ports[pid].dev_conf.link_speeds = link_speed;
 	}
 
@@ -1902,7 +1902,7 @@ struct cmd_config_rss {
 	struct cmd_config_rss *res = parsed_result;
 	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
 	int diag;
-	uint8_t i;
+	uint16_t pid;
 
 	if (!strcmp(res->value, "all"))
 		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
@@ -1936,12 +1936,12 @@ struct cmd_config_rss {
 		return;
 	}
 	rss_conf.rss_key = NULL;
-	for (i = 0; i < rte_eth_dev_count(); i++) {
-		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
+		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
 		if (diag < 0)
 			printf("Configuration of RSS hash at ethernet port %d "
 				"failed with error (%d): %s.\n",
-				i, -diag, strerror(-diag));
+				pid, -diag, strerror(-diag));
 	}
 }
 
@@ -3686,10 +3686,9 @@ struct cmd_csum_result {
 	uint64_t csum_offloads = 0;
 	struct rte_eth_dev_info dev_info;
 
-	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
-		printf("invalid port %d\n", res->port_id);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (!port_is_stopped(res->port_id)) {
 		printf("Please stop port %d first\n", res->port_id);
 		return;
@@ -4364,8 +4363,8 @@ struct cmd_gso_show_result {
 {
 	struct cmd_gso_show_result *res = parsed_result;
 
-	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
-		printf("invalid port id %u\n", res->cmd_pid);
+	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
+		printf("invalid/not owned port id %u\n", res->cmd_pid);
 		return;
 	}
 	if (!strcmp(res->cmd_keyword, "gso")) {
@@ -5375,7 +5374,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 				port_id);
 
 		/* Update number of ports */
-		nb_ports = rte_eth_dev_count();
+		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
+			printf("Error: cannot own new attached port %d\n",
+			       port_id);
+			return;
+		}
+		nb_ports++;
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 	}
@@ -5484,10 +5488,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
 	struct cmd_set_bond_mon_period_result *res = parsed_result;
 	int ret;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
 
@@ -5545,11 +5547,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
 	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
 	uint8_t policy = AGG_BANDWIDTH;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n",
-				res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	if (!strcmp(res->policy, "bandwidth"))
 		policy = AGG_BANDWIDTH;
@@ -5808,7 +5807,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_promiscuous_enable(i);
 			else
@@ -5888,7 +5887,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_allmulticast_enable(i);
 			else
@@ -6622,31 +6621,31 @@ static void cmd_showportall_parsed(void *parsed_result,
 	struct cmd_showportall_result *res = parsed_result;
 	if (!strcmp(res->show, "clear")) {
 		if (!strcmp(res->what, "stats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_stats_clear(i);
 		else if (!strcmp(res->what, "xstats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_xstats_clear(i);
 	} else if (!strcmp(res->what, "info"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_infos_display(i);
 	else if (!strcmp(res->what, "stats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_display(i);
 	else if (!strcmp(res->what, "xstats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_xstats_display(i);
 	else if (!strcmp(res->what, "fdir"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			fdir_get_infos(i);
 	else if (!strcmp(res->what, "stat_qmap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_mapping_display(i);
 	else if (!strcmp(res->what, "dcb_tc"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_dcb_info_display(i);
 	else if (!strcmp(res->what, "cap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_offload_cap_display(i);
 }
 
@@ -10698,10 +10697,8 @@ struct cmd_flow_director_mask_result {
 	struct rte_eth_fdir_masks *mask;
 	struct rte_port *port;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -10899,10 +10896,8 @@ struct cmd_flow_director_flex_mask_result {
 	uint16_t i;
 	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11053,12 +11048,10 @@ struct cmd_flow_director_flexpayload_result {
 	struct cmd_flow_director_flexpayload_result *res = parsed_result;
 	struct rte_eth_flex_payload_cfg flex_cfg;
 	struct rte_port *port;
-	int ret = 0;
+	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11774,7 +11767,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
 	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
 	entry.ether_type = res->eth_type_val;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
 	}
 }
@@ -11890,7 +11883,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
 	else
 		en = 0;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_offload_set(pid,
 						  &entry,
 						  ETH_L2_TUNNEL_ENABLE_MASK,
@@ -14440,10 +14433,8 @@ struct cmd_ddp_add_result {
 	int file_num;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14522,10 +14513,8 @@ struct cmd_ddp_del_result {
 	uint32_t size;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14837,10 +14826,8 @@ struct cmd_ddp_get_list_result {
 #endif
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 #ifdef RTE_LIBRTE_I40E_PMD
 	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
@@ -16296,7 +16283,7 @@ struct cmd_cmdfile_result {
 	if (id == (portid_t)RTE_PORT_ALL) {
 		portid_t pid;
 
-		RTE_ETH_FOREACH_DEV(pid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 			/* check if need_reconfig has been set to 1 */
 			if (ports[pid].need_reconfig == 0)
 				ports[pid].need_reconfig = dev;
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 561e057..e55490f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 
 	(void)ctx;
 	(void)token;
-	RTE_ETH_FOREACH_DEV(p) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
 		if (buf && i == ent)
 			return snprintf(buf, size, "%u", p);
 		++i;
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 957b820..43b9a7d 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -156,7 +156,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -236,7 +236,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -253,10 +253,9 @@ struct rss_type_info {
 	struct rte_eth_xstat_name *xstats_names;
 
 	printf("###### NIC extended statistics for port %-2d\n", port_id);
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Error: Invalid port number %i\n", port_id);
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
 
 	/* Get count */
 	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
@@ -321,7 +320,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -439,7 +438,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -756,10 +755,15 @@ struct rss_type_info {
 int
 port_id_is_invalid(portid_t port_id, enum print_warning warning)
 {
+	struct rte_eth_dev_owner owner;
+	int ret;
+
 	if (port_id == (portid_t)RTE_PORT_ALL)
 		return 0;
 
-	if (rte_eth_dev_is_valid_port(port_id))
+	ret = rte_eth_dev_owner_get(port_id, &owner);
+
+	if (ret == 0 && owner.id == my_owner.id)
 		return 0;
 
 	if (warning == ENABLED_WARN)
@@ -2373,7 +2377,7 @@ struct igb_ring_desc_16_bytes {
 		return;
 	}
 	nb_pt = 0;
-	RTE_ETH_FOREACH_DEV(i) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 		if (! ((uint64_t)(1ULL << i) & portmask))
 			continue;
 		portlist[nb_pt++] = i;
@@ -2512,10 +2516,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gro(const char *onoff, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (test_done == 0) {
 		printf("Before enable/disable GRO,"
 				" please stop forwarding first\n");
@@ -2574,10 +2577,9 @@ struct igb_ring_desc_16_bytes {
 
 	param = &gro_ports[port_id].param;
 
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Invalid port id %u.\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (gro_ports[port_id].enable) {
 		printf("GRO type: TCP/IPv4\n");
 		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
@@ -2595,10 +2597,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gso(const char *mode, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (strcmp(mode, "on") == 0) {
 		if (test_done == 0) {
 			printf("before enabling GSO,"
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 878c112..0e57b46 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -398,7 +398,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
@@ -459,7 +459,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index c066cf9..83f5e84 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -108,6 +108,11 @@
 lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
 
 /*
+ * My port owner structure used to own Ethernet ports.
+ */
+struct rte_eth_dev_owner my_owner; /**< Unique owner. */
+
+/*
  * Test Forwarding Configuration.
  *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
  *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
@@ -449,7 +454,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pt_id;
 	int i = 0;
 
-	RTE_ETH_FOREACH_DEV(pt_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
 		fwd_ports_ids[i++] = pt_id;
 
 	nb_cfg_ports = nb_ports;
@@ -573,7 +578,7 @@ static int eth_event_callback(portid_t port_id,
 		fwd_lcores[lc_id]->cpuid_idx = lc_id;
 	}
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		/* Apply default Tx configuration for all ports */
 		port->dev_conf.txmode = tx_mode;
@@ -706,7 +711,7 @@ static int eth_event_callback(portid_t port_id,
 	queueid_t q;
 
 	/* set socket id according to numa or not */
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		if (nb_rxq > port->dev_info.max_rx_queues) {
 			printf("Fail: nb_rxq(%d) is greater than "
@@ -1000,9 +1005,8 @@ static int eth_event_callback(portid_t port_id,
 	uint64_t tics_per_1sec;
 	uint64_t tics_datum;
 	uint64_t tics_current;
-	uint8_t idx_port, cnt_ports;
+	uint16_t idx_port;
 
-	cnt_ports = rte_eth_dev_count();
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
@@ -1017,11 +1021,10 @@ static int eth_event_callback(portid_t port_id,
 			tics_current = rte_rdtsc();
 			if (tics_current - tics_datum >= tics_per_1sec) {
 				/* Periodic bitrate calculation */
-				for (idx_port = 0;
-						idx_port < cnt_ports;
-						idx_port++)
+				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
+							     my_owner.id)
 					rte_stats_bitrate_calc(bitrate_data,
-						idx_port);
+							       idx_port);
 				tics_datum = tics_current;
 			}
 		}
@@ -1359,7 +1362,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		/* Check if there is a port which is not started */
 		if ((port->port_status != RTE_PORT_STARTED) &&
@@ -1387,7 +1390,7 @@ static int eth_event_callback(portid_t port_id,
 {
 	portid_t pi;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (!port_is_stopped(pi))
 			return 0;
 	}
@@ -1434,7 +1437,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if(dcb_config)
 		dcb_test = 1;
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1620,7 +1623,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Stopping ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1663,7 +1666,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Closing ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1714,7 +1717,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Resetting ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1759,6 +1762,12 @@ static int eth_event_callback(portid_t port_id,
 	if (rte_eth_dev_attach(identifier, &pi))
 		return;
 
+	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
+		printf("Error: cannot own new attached port %d\n", pi);
+		return;
+	}
+	nb_ports++;
+
 	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
 	/* if socket_id is invalid, set to 0 */
 	if (check_socket_id(socket_id) < 0)
@@ -1766,8 +1775,6 @@ static int eth_event_callback(portid_t port_id,
 	reconfig(pi, socket_id);
 	rte_eth_promiscuous_enable(pi);
 
-	nb_ports = rte_eth_dev_count();
-
 	ports[pi].port_status = RTE_PORT_STOPPED;
 
 	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
@@ -1781,6 +1788,9 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Detaching a port...\n");
 
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
 	if (!port_is_closed(port_id)) {
 		printf("Please close port first\n");
 		return;
@@ -1794,7 +1804,7 @@ static int eth_event_callback(portid_t port_id,
 		return;
 	}
 
-	nb_ports = rte_eth_dev_count();
+	nb_ports--;
 
 	printf("Port '%s' is detached. Now total ports is %d\n",
 			name, nb_ports);
@@ -1812,7 +1822,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if (ports != NULL) {
 		no_link_check = 1;
-		RTE_ETH_FOREACH_DEV(pt_id) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
@@ -1844,7 +1854,7 @@ struct pmd_test_command {
 	fflush(stdout);
 	for (count = 0; count <= MAX_CHECK_TIME; count++) {
 		all_ports_up = 1;
-		RTE_ETH_FOREACH_DEV(portid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
 			if ((port_mask & (1 << portid)) == 0)
 				continue;
 			memset(&link, 0, sizeof(link));
@@ -1936,6 +1946,8 @@ struct pmd_test_command {
 
 	switch (type) {
 	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, ENABLED_WARN))
+			break;
 		if (rte_eal_alarm_set(100000,
 				rmv_event_callback, (void *)(intptr_t)port_id))
 			fprintf(stderr, "Could not set up deferred device removal\n");
@@ -2068,7 +2080,7 @@ struct pmd_test_command {
 	portid_t pid;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		port->dev_conf.fdir_conf = fdir_conf;
 		if (nb_rxq > 1) {
@@ -2383,7 +2395,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV(port_id)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		TESTPMD_LOG(WARNING, "No probed ethernet devices\n");
 
@@ -2431,7 +2448,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
 
 	/* set all ports to promiscuous mode by default */
-	RTE_ETH_FOREACH_DEV(port_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
 		rte_eth_promiscuous_enable(port_id);
 
 	/* Init metrics library */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c739e5..2d253b9 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -50,6 +50,8 @@
 #define NUMA_NO_CONFIG 0xFF
 #define UMA_NO_CONFIG  0xFF
 
+#define TESTPMD_OWNER_NAME "TestPMD"
+
 typedef uint8_t  lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
@@ -361,6 +363,7 @@ struct queue_stats_mappings {
  * nb_fwd_ports <= nb_cfg_ports <= nb_ports
  */
 extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
+extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
 extern portid_t nb_cfg_ports; /**< Number of configured ports. */
 extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
 extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
-- 
1.8.3.1


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:00                                       ` Matan Azrad
@ 2018-01-18 16:54                                         ` Neil Horman
  2018-01-18 17:20                                           ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-18 16:54 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Thu, Jan 18, 2018 at 02:00:23PM +0000, Matan Azrad wrote:
> Hi Neil
> 
> From: Neil Horman, Thursday, January 18, 2018 3:10 PM
> > On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > Sent: Wednesday, January 17, 2018 2:00 PM
> > > > To: Matan Azrad <matan@mellanox.com>
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> > > > dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > > > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > > >
> > > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > > > >
> > > > > Hi Konstantin
> > > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018 1:24
> > > > > PM
> > > > > > Hi Matan,
> > > > > >
> > > > > > > Hi Konstantin
> > > > > > >
> > > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11 PM
> > > > > > > > Hi Matan,
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Konstantin
> > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018 8:44
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > > 1:45 PM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > > 2018 2:02 AM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday, January
> > > > > > > > > > > > > > > 11, 2018
> > > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[] is
> > > > > > > > > > > > > > > > > > lock protected, but it might be not very
> > > > > > > > > > > > > > > > > > plausible to protect both data[] and
> > > > > > > > > > > > > > > > > > next_owner_id using the
> > > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I guess you mean to the owner structure in
> > > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > > You may create new owner_id but it doesn't
> > > > > > > > > > > > > > > > mean you would update rte_eth_dev_data[]
> > immediately.
> > > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > > update rte_eth_dev_data[].name or .owner_id.
> > > > > > > > > > > > > > > > It is not very good coding practice to use
> > > > > > > > > > > > > > > > same lock for non-related data structures.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > > Since the ownership mechanism synchronization
> > > > > > > > > > > > > > > is in ethdev responsibility, we must protect
> > > > > > > > > > > > > > > against user mistakes as much as we can by
> > > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > The set ownership API is protected by ownership
> > > > > > > > > > > > > lock and checks the owner ID validity By reading the next
> > owner ID.
> > > > > > > > > > > > > So, the owner ID allocation and set API should use
> > > > > > > > > > > > > the same atomic
> > > > > > > > > > > > mechanism.
> > > > > > > > > > > >
> > > > > > > > > > > > Sure but all you are doing for checking validity, is
> > > > > > > > > > > > check that owner_id > 0 &&& owner_id < next_ownwe_id,
> > right?
> > > > > > > > > > > > As you don't allow owner_id overlap (16/3248 bits)
> > > > > > > > > > > > you can safely do same check with just
> > atomic_get(&next_owner_id).
> > > > > > > > > > > >
> > > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > > - current next_id is X.
> > > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > > > thread 0(by user
> > > > > > > > mistake).
> > > > > > > > > > > - context switch
> > > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > > > next_id to
> > > > > > > > > > > X+1
> > > > > > > > > > atomically.
> > > > > > > > > > > -  context switch
> > > > > > > > > > > - Thread 0 validate X by atomic_read and succeed to
> > > > > > > > > > > take
> > > > > > ownership.
> > > > > > > > > > > - The system loosed the port(or will be managed by two
> > > > > > > > > > > entities) -
> > > > > > > > crash.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > > >
> > > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > > because the owner
> > > > > > > > validation is included in the protected section.
> > > > > > > >
> > > > > > > > Then your validation function would fail even if you'll use
> > > > > > > > atomic ops instead of lock.
> > > > > > > No.
> > > > > > > With atomic this specific scenario will cause the validation to pass.
> > > > > >
> > > > > > Can you explain to me how?
> > > > > >
> > > > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > >               int32_t cur_owner_id =
> > > > > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > > > UINT16_MAX);
> > > > > >
> > > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > > > cur_owner_id) {
> > > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n", owner_id);
> > > > > > 		return 0;
> > > > > > 	}
> > > > > > 	return 1;
> > > > > > }
> > > > > >
> > > > > > Let say your next_owne_id==X, and you invoke
> > > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > > >
> > > > > Explanation:
> > > > > The scenario with locks:
> > > > > next_owner_id = X.
> > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > Context switch.
> > > > > Thread 1 call to owner_new and stuck in the lock.
> > > > > Context switch.
> > > > > Thread 0 does owner id validation and failed(Y>=X) - unlock the lock and
> > return failure to the user.
> > > > > Context switch.
> > > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > > Everything is OK!
> > > > >
> > > > > The same scenario with atomics:
> > > > > next_owner_id = X.
> > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > Context switch.
> > > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > > Context switch.
> > > > > Thread 0 does owner id validation and success(Y<(atomic)X+1) - unlock
> > the lock and return success to the  user.
> > > > > Problem!
> > > > >
> > > >
> > > >
> > > > Matan is correct here, there is no way to preform parallel set
> > > > operations using just and atomic variable here, because multiple
> > > > reads of next_owner_id need to be preformed while it is stable.
> > > > That is to say rte_eth_next_owner_id must be compared to
> > > > RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.
> > If
> > > > you were to only use an atomic_read on such a variable, it could be
> > > > incremented by the owner_new function between the checks and an
> > > > invalid owner value could become valid because  a third thread
> > > > incremented the next value.  The state of next_owner_id must be kept
> > > > stable during any validity checks
> > >
> > > It could still be incremented between the checks - if let say
> > > different thread will invoke new_onwer_id, grab the lock update
> > > counter, release the lock - all that before the check.
> > I don't see how all of the contents of rte_eth_dev_owner_set is protected
> > under rte_eth_dev_ownership_lock, as is rte_eth_dev_owner_new.
> > Next_owner might increment between another threads calls to owner_new
> > and owner_set, but that will just cause a transition from an ownership id
> > being valid to invalid, and thats ok, as long as there is consistency in the
> > model that enforces a single valid owner at a time (in that case the
> > subsequent caller to owner_new).
> > 
> 
> I'm not sure I fully understand you, but see:
> we can't protect all of the user mistakes(using the wrong owner id).
> But we are doing the maximum for it.
> 
Yeah, my writing was atrocious, apologies.  All I meant to say was that the
locking you have is ok, in that it keeps the data being read stable for as long
as it is being read.  The fact that a given set operation may fail because
someone else created an ownership record is an artifact of the API, not a bug in
its implementation.  I think we're basically in agreement on the semantics here,
but this goes to my argument about complexity (more below).

> 
> > Though this confusion does underscore my assertion I think that this API is
> > overly complicated
> > 
> 
> I really don't think it is complicated. - just take ownership of a port(by owner id allocation and set APIs) and manage the port as you want. 
> 
But that's not all.  The determination of success or failure in claiming
ownership is largely dependent on the behavior of other threads' actions, not a
function of the state of the system at the moment ownership is requested.  That
is to say, if you have N threads, and they all create ownership objects
identified as X, X+1, X+2...X+N, only the thread with id X+N will be able to
claim ownership of any port, because they all will have incremented the shared
next_id variable.  Determination of ownership by the programmer will have to be
done via debugging, and errors will likely be transient, dependent on the order
in which threads execute (subject to scheduling jitter).

Rather than making ownership success dependent on any data contained within the
ownership record, ownership should be entirely dependent on the state of port
ownership at the time that it was requested.  That is to say, port ownership
should succeed if and only if the port is unowned at the time that a given
thread requests ownership.  Any ancillary data regarding which context owns the
port should be exactly that, ancillary, and have no impact on whether or not
the port ownership request succeeds.
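
To make the "succeed if and only if the port is unowned" semantics concrete,
here is a minimal sketch.  It is not code from the patch: it assumes C11
atomics rather than any ethdev lock, and every name in it (claim_port_owner[],
claim_port_take(), claim_port_release(), CLAIM_NO_OWNER, CLAIM_MAX_PORTS) is
hypothetical.  The only thing that decides the outcome is the owner slot of the
port at the moment of the request.

```c
#include <stdatomic.h>
#include <stdint.h>

#define CLAIM_NO_OWNER  0u	/* hypothetical "unowned" marker */
#define CLAIM_MAX_PORTS 32	/* hypothetical table size */

/* Hypothetical per-port owner slots; zero-initialized, so all unowned. */
static _Atomic uint64_t claim_port_owner[CLAIM_MAX_PORTS];

/*
 * Take a port only if it is currently unowned.
 * Returns 0 on success, -1 on a bad argument or if the port has an owner.
 */
int
claim_port_take(unsigned int port, uint64_t owner_id)
{
	uint64_t expected = CLAIM_NO_OWNER;

	if (port >= CLAIM_MAX_PORTS || owner_id == CLAIM_NO_OWNER)
		return -1;
	/* One atomic step: swap in owner_id only if the slot is unowned. */
	return atomic_compare_exchange_strong(&claim_port_owner[port],
					      &expected, owner_id) ? 0 : -1;
}

/*
 * Release a port, but only if the caller is the current owner.
 * Returns 0 on success, -1 otherwise.
 */
int
claim_port_release(unsigned int port, uint64_t owner_id)
{
	uint64_t expected = owner_id;

	if (port >= CLAIM_MAX_PORTS || owner_id == CLAIM_NO_OWNER)
		return -1;
	return atomic_compare_exchange_strong(&claim_port_owner[port],
					      &expected,
					      (uint64_t)CLAIM_NO_OWNER) ? 0 : -1;
}
```

In this sketch the owner id is purely descriptive: a second caller fails
because the slot is taken, not because of anything stored in its own record.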

Regards
Neil

> > Neil
> 
> 


* Re: [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
@ 2018-01-18 17:00       ` Thomas Monjalon
  2018-01-19 12:38       ` Ananyev, Konstantin
  2018-03-05 11:24       ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
  2 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-18 17:00 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev, stable

18/01/2018 17:35, Matan Azrad:
> rte_eth_dev_data structure is allocated per ethdev port and can be
> used to get a data of the port internally.
> 
> rte_eth_dev_attach_secondary tries to find the port identifier using
> rte_eth_dev_data name field comparison and may get an identifier of
> invalid port in case of this port was released by the primary process
> because the port release API doesn't reset the port data.
> 
> So, it will be better to reset the port data in release time instead of
> allocation time.
> 
> Move the port data reset to the port release API.
> 
> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>


* Re: [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation Matan Azrad
@ 2018-01-18 17:00       ` Thomas Monjalon
  2018-01-19 12:40       ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-18 17:00 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev, stable

18/01/2018 17:35, Matan Azrad:
> rte_eth_dev_find_free_port() found a free port by state checking.
> The state field are in local process memory, so other DPDK processes
> may get the same port ID because their local states may be different.
> 
> Replace the state checking by the ethdev port name checking,
> so, if the name is an empty string the port ID will be detected as
> unused.
> 
> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
> Cc: stable@dpdk.org
> 
> Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> Signed-off-by: Matan Azrad <matan@mellanox.com>

Acked-by: Thomas Monjalon <thomas@monjalon.net>


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 16:54                                         ` Neil Horman
@ 2018-01-18 17:20                                           ` Matan Azrad
  2018-01-18 18:41                                             ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 17:20 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

Hi Neil

From: Neil Horman, Thursday, January 18, 2018 6:55 PM
> On Thu, Jan 18, 2018 at 02:00:23PM +0000, Matan Azrad wrote:
> > Hi Neil
> >
> > From: Neil Horman, Thursday, January 18, 2018 3:10 PM
> > > On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > Sent: Wednesday, January 17, 2018 2:00 PM
> > > > > To: Matan Azrad <matan@mellanox.com>
> > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> > > > > dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > > > > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > > > >
> > > > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > > > > >
> > > > > > Hi Konstantin
> > > > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018
> > > > > > 1:24 PM
> > > > > > > Hi Matan,
> > > > > > >
> > > > > > > > Hi Konstantin
> > > > > > > >
> > > > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11
> > > > > > > > PM
> > > > > > > > > Hi Matan,
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Hi Konstantin
> > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > 8:44 PM
> > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15,
> > > > > > > > > > > > 2018
> > > > > > > > > > > > 1:45 PM
> > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > > > 2018 2:02 AM
> > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday,
> > > > > > > > > > > > > > > > January 11, 2018
> > > > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[]
> > > > > > > > > > > > > > > > > > > is lock protected, but it might be
> > > > > > > > > > > > > > > > > > > not very plausible to protect both
> > > > > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I guess you mean to the owner
> > > > > > > > > > > > > > > > > > structure in
> > > > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > > > You may create new owner_id but it
> > > > > > > > > > > > > > > > > doesn't mean you would update
> > > > > > > > > > > > > > > > > rte_eth_dev_data[]
> > > immediately.
> > > > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > > > update rte_eth_dev_data[].name or
> .owner_id.
> > > > > > > > > > > > > > > > > It is not very good coding practice to
> > > > > > > > > > > > > > > > > use same lock for non-related data structures.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > > > Since the ownership mechanism
> > > > > > > > > > > > > > > > synchronization is in ethdev
> > > > > > > > > > > > > > > > responsibility, we must protect against
> > > > > > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > The set ownership API is protected by
> > > > > > > > > > > > > > ownership lock and checks the owner ID
> > > > > > > > > > > > > > validity By reading the next
> > > owner ID.
> > > > > > > > > > > > > > So, the owner ID allocation and set API should
> > > > > > > > > > > > > > use the same atomic
> > > > > > > > > > > > > mechanism.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sure but all you are doing for checking
> > > > > > > > > > > > > validity, is check that owner_id > 0 &&&
> > > > > > > > > > > > > owner_id < next_ownwe_id,
> > > right?
> > > > > > > > > > > > > As you don't allow owner_id overlap (16/3248
> > > > > > > > > > > > > bits) you can safely do same check with just
> > > atomic_get(&next_owner_id).
> > > > > > > > > > > > >
> > > > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > > > - current next_id is X.
> > > > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > > > > thread 0(by user
> > > > > > > > > mistake).
> > > > > > > > > > > > - context switch
> > > > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > > > > next_id to
> > > > > > > > > > > > X+1
> > > > > > > > > > > atomically.
> > > > > > > > > > > > -  context switch
> > > > > > > > > > > > - Thread 0 validate X by atomic_read and succeed
> > > > > > > > > > > > to take
> > > > > > > ownership.
> > > > > > > > > > > > - The system loosed the port(or will be managed by
> > > > > > > > > > > > two
> > > > > > > > > > > > entities) -
> > > > > > > > > crash.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > > > >
> > > > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > > > because the owner
> > > > > > > > > validation is included in the protected section.
> > > > > > > > >
> > > > > > > > > Then your validation function would fail even if you'll
> > > > > > > > > use atomic ops instead of lock.
> > > > > > > > No.
> > > > > > > > With atomic this specific scenario will cause the validation to
> pass.
> > > > > > >
> > > > > > > Can you explain to me how?
> > > > > > >
> > > > > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > >               int32_t cur_owner_id =
> > > > > > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > > > > UINT16_MAX);
> > > > > > >
> > > > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > > > > cur_owner_id) {
> > > > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> owner_id);
> > > > > > > 		return 0;
> > > > > > > 	}
> > > > > > > 	return 1;
> > > > > > > }
> > > > > > >
> > > > > > > Let say your next_owne_id==X, and you invoke
> > > > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > > > >
> > > > > > Explanation:
> > > > > > The scenario with locks:
> > > > > > next_owner_id = X.
> > > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > > Context switch.
> > > > > > Thread 1 call to owner_new and stuck in the lock.
> > > > > > Context switch.
> > > > > > Thread 0 does owner id validation and failed(Y>=X) - unlock
> > > > > > the lock and
> > > return failure to the user.
> > > > > > Context switch.
> > > > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > > > Everything is OK!
> > > > > >
> > > > > > The same scenario with atomics:
> > > > > > next_owner_id = X.
> > > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > > Context switch.
> > > > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > > > Context switch.
> > > > > > Thread 0 does owner id validation and success(Y<(atomic)X+1) -
> > > > > > unlock
> > > the lock and return success to the  user.
> > > > > > Problem!
> > > > > >
> > > > >
> > > > >
> > > > > Matan is correct here, there is no way to preform parallel set
> > > > > operations using just and atomic variable here, because multiple
> > > > > reads of next_owner_id need to be preformed while it is stable.
> > > > > That is to say rte_eth_next_owner_id must be compared to
> > > > > RTE_ETH_DEV_NO_OWNER and owner_id in
> rte_eth_is_valid_owner_id.
> > > If
> > > > > you were to only use an atomic_read on such a variable, it could
> > > > > be incremented by the owner_new function between the checks and
> > > > > an invalid owner value could become valid because  a third
> > > > > thread incremented the next value.  The state of next_owner_id
> > > > > must be kept stable during any validity checks
> > > >
> > > > It could still be incremented between the checks - if let say
> > > > different thread will invoke new_onwer_id, grab the lock update
> > > > counter, release the lock - all that before the check.
> > > I don't see how all of the contents of rte_eth_dev_owner_set is
> > > protected under rte_eth_dev_ownership_lock, as is
> rte_eth_dev_owner_new.
> > > Next_owner might increment between another threads calls to
> > > owner_new and owner_set, but that will just cause a transition from
> > > an ownership id being valid to invalid, and thats ok, as long as
> > > there is consistency in the model that enforces a single valid owner
> > > at a time (in that case the subsequent caller to owner_new).
> > >
> >
> > I'm not sure I fully understand you, but see:
> > we can't protect all of the user mistakes(using the wrong owner id).
> > But we are doing the maximum for it.
> >
> Yeah, my writing was atrocious, apologies.  All I meant to say was that the
> locking you have is ok, in that it maintains a steady state for the data being
> read during the period its being read.  The fact that a given set operation may
> fail because someone else created an ownership record is an artifact of the
> api, not a bug in its implementation.  I think we're basically in agreement on
> the semantics here, but this goes to my argument about complexity (more
> below).
> 
> >
> > > Though this confusion does underscore my assertion I think that this
> > > API is overly complicated
> > >
> >
> > I really don't think it is complicated. - just take ownership of a port(by
> owner id allocation and set APIs) and manage the port as you want.
> >
> But thats not all.  The determination of success or failure in claiming
> ownership is largely dependent on the behavior of other threads actions, not
> a function of the state of the system at the moment ownership is requested.
> That is to say, if you have N threads, and they all create ownership objects
> identified as X, x+1, X+2...X+N, only the thread with id X+N will be able to
> claim ownership of any port, because they all will have incremented the
> shared nex_id variable.

Why? Each one will get its owner ID in some order (the critical section is
protected by a spinlock).
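
A minimal sketch of that point, not the code in the patch: it assumes a pthread
mutex in place of the ethdev spinlock, and all names (lk_owner_new(),
lk_owner_set(), lk_next_owner_id, lk_port_owner[], LK_NO_OWNER, LK_MAX_PORTS)
are hypothetical.  Allocation and validation share one lock, and an ID counts
as valid when it is non-zero and below the next ID to be allocated.

```c
#include <pthread.h>
#include <stdint.h>

#define LK_NO_OWNER  0u		/* hypothetical "unowned" marker */
#define LK_MAX_PORTS 32		/* hypothetical table size */

/* One lock covers both the ID counter and the per-port owner slots. */
static pthread_mutex_t lk_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t lk_next_owner_id = 1;	/* first ID to be handed out */
static uint64_t lk_port_owner[LK_MAX_PORTS];

/* Hand out a fresh owner ID; concurrent callers are serialized. */
uint64_t
lk_owner_new(void)
{
	uint64_t id;

	pthread_mutex_lock(&lk_lock);
	id = lk_next_owner_id++;
	pthread_mutex_unlock(&lk_lock);
	return id;
}

/*
 * Validate the ID and set the owner inside the same critical section,
 * so the counter cannot move between the check and the update.
 * Returns 0 on success, -1 on an invalid ID or an already owned port.
 */
int
lk_owner_set(unsigned int port, uint64_t owner_id)
{
	int ret = -1;

	if (port >= LK_MAX_PORTS)
		return -1;
	pthread_mutex_lock(&lk_lock);
	if (owner_id != LK_NO_OWNER &&
	    owner_id < lk_next_owner_id &&
	    lk_port_owner[port] == LK_NO_OWNER) {
		lk_port_owner[port] = owner_id;
		ret = 0;
	}
	pthread_mutex_unlock(&lk_lock);
	return ret;
}
```

Under these assumptions, IDs allocated earlier stay below the counter after
later allocations, so each holder can still take an unowned port.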

>  Determination of ownership by the programmer will
> have to be done via debugging, and errors will likely be transient dependent
> on the order in which threads execute (subject to scheduling jitter).
> 
Yes.

> Rather than making ownership success dependent on any data contained
> within the ownership record, ownership should be entirely dependent on
> the state of port ownership at the time that it was requested.  That is to say,
> port ownership should succede if and only if the port is unowned at the time
> that a given thread requets ownership.

Yes.

>  Any ancilliary data regarding which
> context owns the port should be exactly that, ancilliary, and have no impact
> on weather or not the port ownership request succedes.
> 

Yes, I understand what you are saying: there is no deterministic state for
ownership set success.
Actually, I think it will be very hard to reach determinism in DPDK regarding
port ownership when multiple threads are involved, especially since it depends
on the implementation of many DPDK entities.
But the current non-deterministic approach still brings good order to it.



> Regards
> Neil
> 
> > > Neil
> >
> >


* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 17:20                                           ` Matan Azrad
@ 2018-01-18 18:41                                             ` Neil Horman
  2018-01-18 20:21                                               ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-18 18:41 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Thu, Jan 18, 2018 at 05:20:31PM +0000, Matan Azrad wrote:
> Hi Neil
> 
> From: Neil Horman, Thursday, January 18, 2018 6:55 PM
> > On Thu, Jan 18, 2018 at 02:00:23PM +0000, Matan Azrad wrote:
> > > Hi Neil
> > >
> > > From: Neil Horman, Thursday, January 18, 2018 3:10 PM
> > > > On Wed, Jan 17, 2018 at 05:01:10PM +0000, Ananyev, Konstantin wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > > > > > Sent: Wednesday, January 17, 2018 2:00 PM
> > > > > > To: Matan Azrad <matan@mellanox.com>
> > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> > > > > > dev@dpdk.org; Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > Subject: Re: [PATCH v2 2/6] ethdev: add port ownership
> > > > > >
> > > > > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > > From: Ananyev, Konstantin, Sent: Wednesday, January 17, 2018
> > > > > > > 1:24 PM
> > > > > > > > Hi Matan,
> > > > > > > >
> > > > > > > > > Hi Konstantin
> > > > > > > > >
> > > > > > > > > From: Ananyev, Konstantin, Tuesday, January 16, 2018 9:11
> > > > > > > > > PM
> > > > > > > > > > Hi Matan,
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15, 2018
> > > > > > > > > > > 8:44 PM
> > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > From: Ananyev, Konstantin, Monday, January 15,
> > > > > > > > > > > > > 2018
> > > > > > > > > > > > > 1:45 PM
> > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > From: Ananyev, Konstantin, Friday, January 12,
> > > > > > > > > > > > > > > 2018 2:02 AM
> > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Thursday,
> > > > > > > > > > > > > > > > > January 11, 2018
> > > > > > > > > > > > > > > > > 2:40 PM
> > > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > > > > > > > > > > Hi Konstantin
> > > > > > > > > > > > > > > > > > > From: Ananyev, Konstantin, Wednesday,
> > > > > > > > > > > > > > > > > > > January 10,
> > > > > > > > > > > > > > > > > > > 2018
> > > > > > > > > > > > > > > > > > > 3:36 PM
> > > > > > > > > > > > > > > > > > > > Hi Matan,
> > > > > > > > > > >  <snip>
> > > > > > > > > > > > > > > > > > > > It is good to see that now
> > > > > > > > > > > > > > > > > > > > scanning/updating rte_eth_dev_data[]
> > > > > > > > > > > > > > > > > > > > is lock protected, but it might be
> > > > > > > > > > > > > > > > > > > > not very plausible to protect both
> > > > > > > > > > > > > > > > > > > > data[] and next_owner_id using the
> > > > > > > > > > > > > > same lock.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I guess you mean to the owner
> > > > > > > > > > > > > > > > > > > structure in
> > > > > > > > > > > > > > rte_eth_dev_data[port_id].
> > > > > > > > > > > > > > > > > > > The next_owner_id is read by ownership
> > > > > > > > > > > > > > > > > > > APIs(for owner validation), so it
> > > > > > > > > > > > > > > > > > makes sense to use the same lock.
> > > > > > > > > > > > > > > > > > > Actually, why not?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Well to me next_owner_id and
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[] are not directly
> > > > > > > > > > > > > > > > related.
> > > > > > > > > > > > > > > > > > You may create new owner_id but it
> > > > > > > > > > > > > > > > > > doesn't mean you would update
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[]
> > > > immediately.
> > > > > > > > > > > > > > > > > > And visa-versa - you might just want to
> > > > > > > > > > > > > > > > > > update rte_eth_dev_data[].name or
> > .owner_id.
> > > > > > > > > > > > > > > > > > It is not very good coding practice to
> > > > > > > > > > > > > > > > > > use same lock for non-related data structures.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I see the relation like next:
> > > > > > > > > > > > > > > > > Since the ownership mechanism
> > > > > > > > > > > > > > > > > synchronization is in ethdev
> > > > > > > > > > > > > > > > > responsibility, we must protect against
> > > > > > > > > > > > > > > > > user mistakes as much as we can by
> > > > > > > > > > > > > > > > using the same lock.
> > > > > > > > > > > > > > > > > So, if user try to set by invalid owner
> > > > > > > > > > > > > > > > > (exactly the ID which currently is
> > > > > > > > > > > > > > > > allocated) we can protect on it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hmm, not sure why you can't do same checking
> > > > > > > > > > > > > > > > with different lock or atomic variable?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The set ownership API is protected by
> > > > > > > > > > > > > > > ownership lock and checks the owner ID
> > > > > > > > > > > > > > > validity By reading the next
> > > > owner ID.
> > > > > > > > > > > > > > > So, the owner ID allocation and set API should
> > > > > > > > > > > > > > > use the same atomic
> > > > > > > > > > > > > > mechanism.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sure but all you are doing for checking
> > > > > > > > > > > > > > validity, is check that owner_id > 0 &&&
> > > > > > > > > > > > > > owner_id < next_ownwe_id,
> > > > right?
> > > > > > > > > > > > > > As you don't allow owner_id overlap (16/3248
> > > > > > > > > > > > > > bits) you can safely do same check with just
> > > > atomic_get(&next_owner_id).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > It will not protect it, scenario:
> > > > > > > > > > > > > - current next_id is X.
> > > > > > > > > > > > > - call set ownership of port A with owner id X by
> > > > > > > > > > > > > thread 0(by user
> > > > > > > > > > mistake).
> > > > > > > > > > > > > - context switch
> > > > > > > > > > > > > - allocate new id by thread 1 and get X and change
> > > > > > > > > > > > > next_id to
> > > > > > > > > > > > > X+1
> > > > > > > > > > > > atomically.
> > > > > > > > > > > > > -  context switch
> > > > > > > > > > > > > - Thread 0 validate X by atomic_read and succeed
> > > > > > > > > > > > > to take
> > > > > > > > ownership.
> > > > > > > > > > > > > - The system loosed the port(or will be managed by
> > > > > > > > > > > > > two
> > > > > > > > > > > > > entities) -
> > > > > > > > > > crash.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Ok, and how using lock will protect you with such scenario?
> > > > > > > > > > >
> > > > > > > > > > > The owner set API validation by thread 0 should fail
> > > > > > > > > > > because the owner
> > > > > > > > > > validation is included in the protected section.
> > > > > > > > > >
> > > > > > > > > > Then your validation function would fail even if you'll
> > > > > > > > > > use atomic ops instead of lock.
> > > > > > > > > No.
> > > > > > > > > With atomic this specific scenario will cause the validation to
> > pass.
> > > > > > > >
> > > > > > > > Can you explain to me how?
> > > > > > > >
> > > > > > > > rte_eth_is_valid_owner_id(uint16_t owner_id) {
> > > > > > > >               int32_t cur_owner_id =
> > > > > > > > RTE_MIN(rte_atomic32_get(next_owner_id),
> > > > > > > > UINT16_MAX);
> > > > > > > >
> > > > > > > > 	if (owner_id == RTE_ETH_DEV_NO_OWNER || owner >
> > > > > > > > cur_owner_id) {
> > > > > > > > 		RTE_LOG(ERR, EAL, "Invalid owner_id=%d.\n",
> > owner_id);
> > > > > > > > 		return 0;
> > > > > > > > 	}
> > > > > > > > 	return 1;
> > > > > > > > }
> > > > > > > >
> > > > > > > > Let's say your next_owner_id==X, and you invoke
> > > > > > > > rte_eth_is_valid_owner_id(owner_id=X+1)  - it would fail.
> > > > > > >
> > > > > > > Explanation:
> > > > > > > The scenario with locks:
> > > > > > > next_owner_id = X.
> > > > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > > > Context switch.
> > > > > > > Thread 1 call to owner_new and stuck in the lock.
> > > > > > > Context switch.
> > > > > > > Thread 0 does owner id validation and failed(Y>=X) - unlock
> > > > > > > the lock and
> > > > return failure to the user.
> > > > > > > Context switch.
> > > > > > > Thread 1 take the lock and update X to X+1, then, unlock the lock.
> > > > > > > Everything is OK!
> > > > > > >
> > > > > > > The same scenario with atomics:
> > > > > > > next_owner_id = X.
> > > > > > > Thread 0 call to set API(with invalid owner Y=X) and take lock.
> > > > > > > Context switch.
> > > > > > > Thread 1 call to owner_new and change X to X+1(atomically).
> > > > > > > Context switch.
> > > > > > > Thread 0 does owner id validation and success(Y<(atomic)X+1) -
> > > > > > > unlock
> > > > the lock and return success to the  user.
> > > > > > > Problem!
> > > > > > >
> > > > > >
> > > > > >
> > > > > > Matan is correct here, there is no way to perform parallel set
> > > > > > operations using just an atomic variable here, because multiple
> > > > > > reads of next_owner_id need to be performed while it is stable.
> > > > > > That is to say rte_eth_next_owner_id must be compared to
> > > > > > RTE_ETH_DEV_NO_OWNER and owner_id in
> > rte_eth_is_valid_owner_id.
> > > > If
> > > > > > you were to only use an atomic_read on such a variable, it could
> > > > > > be incremented by the owner_new function between the checks and
> > > > > > an invalid owner value could become valid because  a third
> > > > > > thread incremented the next value.  The state of next_owner_id
> > > > > > must be kept stable during any validity checks
> > > > >
> > > > > It could still be incremented between the checks - if, let's say, a
> > > > > different thread invokes new_owner_id, grabs the lock, updates the
> > > > > counter, and releases the lock - all that before the check.
> > > > I don't see how all of the contents of rte_eth_dev_owner_set is
> > > > protected under rte_eth_dev_ownership_lock, as is
> > rte_eth_dev_owner_new.
> > > > Next_owner might increment between another threads calls to
> > > > owner_new and owner_set, but that will just cause a transition from
> > > > an ownership id being valid to invalid, and thats ok, as long as
> > > > there is consistency in the model that enforces a single valid owner
> > > > at a time (in that case the subsequent caller to owner_new).
> > > >
> > >
> > > I'm not sure I fully understand you, but see:
> > > we can't protect all of the user mistakes(using the wrong owner id).
> > > But we are doing the maximum for it.
> > >
> > Yeah, my writing was atrocious, apologies.  All I meant to say was that the
> > locking you have is ok, in that it maintains a steady state for the data being
> > read during the period its being read.  The fact that a given set operation may
> > fail because someone else created an ownership record is an artifact of the
> > api, not a bug in its implementation.  I think we're basically in agreement on
> > the semantics here, but this goes to my argument about complexity (more
> > below).
> > 
> > >
> > > > Though this confusion does underscore my assertion I think that this
> > > > API is overly complicated
> > > >
> > >
> > > I really don't think it is complicated. - just take ownership of a port(by
> > owner id allocation and set APIs) and manage the port as you want.
> > >
> > But thats not all.  The determination of success or failure in claiming
> > ownership is largely dependent on the behavior of other threads actions, not
> > a function of the state of the system at the moment ownership is requested.
> > That is to say, if you have N threads, and they all create ownership objects
> > identified as X, x+1, X+2...X+N, only the thread with id X+N will be able to
> > claim ownership of any port, because they all will have incremented the
> > shared nex_id variable.
> 
> Why? Each one will get its owner id according to some order(The critical section is protected by spinlock).
> 
Yes, and that's my issue here, the ordering.  Perhaps my issue is one of
perception.  When I consider an ownership library, what I really think about is
mutual exclusion (i.e. guaranteeing that only one entity is capable of access to
a resource at any one time).  The semantics of this library don't really
conform to any semantics that you usually see with other mutual exclusion
mechanisms.  That is to say, a spinlock or a mutex succeeds in locking if its
prior state is unlocked.  This library succeeds in acquiring the resource it
protects if and only if allocation of ownership records occurs in a particular
order relative to one another.  That just seems odd to me.  What advantage do
these new semantics have over more traditional established semantics?

 
> >  Determination of ownership by the programmer will
> > have to be done via debugging, and errors will likely be transient dependent
> > on the order in which threads execute (subject to scheduling jitter).
> > 
> Yes.
> 
But why put yourself through that pain?  Traditional semantics are far simpler
to comprehend, with and without a debugger.

> > Rather than making ownership success dependent on any data contained
> > within the ownership record, ownership should be entirely dependent on
> > the state of port ownership at the time that it was requested.  That is to say,
> > port ownership should succede if and only if the port is unowned at the time
> > that a given thread requets ownership.
> 
> Yes.
> 
So we agree?  Then why do your ownership semantics require a check for the
highest allocated owner id?

> >  Any ancilliary data regarding which
> > context owns the port should be exactly that, ancilliary, and have no impact
> > on weather or not the port ownership request succedes.
> > 
> 
> Yes, I understand what you say - there is no deterministic state for ownership set success.
I think we agree here.  To be clear, I'm not saying that acquisition success or
failure should be deterministic in the sense that you should know which thread
can claim ownership, but only that you should be able to determine the success
or failure of ownership acquisition based on data in the locking mechanism,
rather than both data in the lock mechanism and data held by the requesting
context.

> Actually I think it will be very hard to arrive to determination in DPDK regarding port ownership when multi-thread is in the game,
> Especially it depend in a lot of DPDK entities implementation..
Why?  A simple spinlock is sufficient for what I'm talking about.  If it's
locked you don't get ownership; if it isn't, you do.
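
To make that concrete, here is a minimal sketch of a claim whose outcome
depends only on the lock state.  It is an illustration under assumptions, not
ethdev code: it uses a C11 atomic_flag instead of rte_spinlock_t so that it
builds without DPDK, and struct port_claim, port_try_claim() and
port_release() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port claim state; not the real struct rte_eth_dev. */
struct port_claim {
	atomic_flag claimed;	/* set while some context owns the port */
};

#define PORT_CLAIM_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the caller got the port, false if it was already claimed. */
static bool port_try_claim(struct port_claim *p)
{
	assert(p != NULL);
	/* test_and_set returns the previous value: false means it was free. */
	return !atomic_flag_test_and_set_explicit(&p->claimed,
						  memory_order_acquire);
}

/* Gives the port back; the caller is assumed to be the current holder. */
static void port_release(struct port_claim *p)
{
	assert(p != NULL);
	atomic_flag_clear_explicit(&p->claimed, memory_order_release);
}
```

Whether a claim succeeds is decided by the flag alone; nothing held by the
caller takes part in the decision.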


> But the current non-deterministic approach makes good order in the game. 
Can you explain to me why the ordering is valuable?  Perhaps that would help me
out here, because currently, I don't see how the order is valuable, especially
given that the allocating contexts have no real control over the order in which
those objects are allocated.

Neil

> 
> 
> 
> > Regards
> > Neil
> > 
> > > > Neil
> > >
> > >
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 18:41                                             ` Neil Horman
@ 2018-01-18 20:21                                               ` Matan Azrad
  2018-01-19  1:41                                                 ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 20:21 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

Hi Neil.

From: Neil Horman, Thursday, January 18, 2018 8:42 PM

<snip>
> > > But thats not all.  The determination of success or failure in
> > > claiming ownership is largely dependent on the behavior of other
> > > threads actions, not a function of the state of the system at the moment
> ownership is requested.
> > > That is to say, if you have N threads, and they all create ownership
> > > objects identified as X, x+1, X+2...X+N, only the thread with id X+N
> > > will be able to claim ownership of any port, because they all will
> > > have incremented the shared nex_id variable.
> >
> > Why? Each one will get its owner id according to some order(The critical
> section is protected by spinlock).
> >
> Yes, and thats my issue here, the ordering.  Perhaps my issue is one of
> perception.  When I consider an ownership library, what I really think about is
> mutual exclusion (i.e. guaranteing that only one entity is capable of access to
> a resource at any one time).  This semantics of this library don't really
> conform to any semantics that you usually see with other mutual exclusion
> mechanisms.  That is to say a spinlock or a mutex succedes locking if its prior
> state is unlocked.  This library succeeds aqusition of the resource it protects if
> and only if allocation of ownership records occurs in a particular order relative
> to one another.  That just seems odd to me.  What advantage do these new
> semantics have over more traditional established semantics?
> 
> 
> > >  Determination of ownership by the programmer will have to be done
> > > via debugging, and errors will likely be transient dependent on the
> > > order in which threads execute (subject to scheduling jitter).
> > >
> > Yes.
> >
> But why put yourself through that pain?  Traditional semantics are far simpler
> to comprehend, with and without a debugger.
> 

It looks like I misunderstood you, sorry.
Please describe the following:

1. What exactly do you want to improve (in detail)?
2. Which API (or part of the code) specifically do you want to change?
3. What is missing in the current code (you can answer on the V3 I sent, if you want) that should be fixed?


<snip> Sorry for that; I think it is not useful to continue the discussion if we do not fully understand each other. So let's start from the beginning "with good order :)" by answering the above questions.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-18 20:43       ` Thomas Monjalon
  2018-01-18 20:52         ` Matan Azrad
  2018-01-19 12:47       ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-18 20:43 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

18/01/2018 17:35, Matan Azrad:
>  rte_eth_dev_allocate(const char *name)
>  {
>         uint16_t port_id;
> -       struct rte_eth_dev *eth_dev;
> +       struct rte_eth_dev *eth_dev = NULL;
> +
> +       /* Synchronize share data one time allocation between local threads. */

I don't understand the "one time" part of this comment.
Please could you try to rephrase it?

> +       rte_spinlock_lock(&rte_eth_share_data_alloc);
> +       if (rte_eth_dev_share_data == NULL)
> +               rte_eth_dev_share_data_alloc();
> +       rte_spinlock_unlock(&rte_eth_share_data_alloc);

I think the correct wording is "shared data", instead of "share data".

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation
  2018-01-18 20:43       ` Thomas Monjalon
@ 2018-01-18 20:52         ` Matan Azrad
  2018-01-18 21:17           ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-18 20:52 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev



From: Thomas Monjalon, Thursday, January 18, 2018 10:44 PM
> 18/01/2018 17:35, Matan Azrad:
> >  rte_eth_dev_allocate(const char *name)  {
> >         uint16_t port_id;
> > -       struct rte_eth_dev *eth_dev;
> > +       struct rte_eth_dev *eth_dev = NULL;
> > +
> > +       /* Synchronize share data one time allocation between local
> > + threads. */
> 
> I don't understand the "one time" part of this comment.
> Please could you try to rephrase it?
> 

"One time" means this allocation runs only once.

After the first allocation the pointer is no longer NULL, so the allocation function is not called again.
  
> > +       rte_spinlock_lock(&rte_eth_share_data_alloc);
> > +       if (rte_eth_dev_share_data == NULL)
> > +               rte_eth_dev_share_data_alloc();
> > +       rte_spinlock_unlock(&rte_eth_share_data_alloc);
> 
> I think the correct wording is "shared data", instead of "share data".

Yes, you're right - will change it.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership Matan Azrad
@ 2018-01-18 21:11       ` Thomas Monjalon
  2018-01-19 12:41       ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-18 21:11 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

18/01/2018 17:35, Matan Azrad:
> The ownership of a port is implicit in DPDK.
> Making it explicit is better from the next reasons:
> 1. It will define well who is in charge of the port usage synchronization.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the application is not trying to use a port which
> is already managed by fail-safe.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to owns valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The mechanism is synchronized for both the primary process threads and
> the secondary processes threads to allow secondary process entity to be
> a port owner.
> 
> Add a sinchronized ownership mechanism to DPDK Ethernet devices to

s/sinchronized/synchronized/

> avoid multiple management of a device by different DPDK entities.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>

I think it is a good compromise between application and
ethdev responsibilities.
The application is still responsible for thread safety per port,
and it is consistent with the checkless Rx/Tx design (for performance).

Except the wording (see below),
Acked-by: Thomas Monjalon <thomas@monjalon.net>

> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> +/* Shared memory between primary and secondary processes. */
> +static struct {
> +	uint64_t next_owner_id;
> +	rte_spinlock_t ownership_lock;
> +	struct rte_eth_dev_data data[RTE_MAX_ETHPORTS];
> +} *rte_eth_dev_share_data;

Should be rte_eth_dev_shared_data.

> -rte_eth_dev_data_alloc(void)
> +rte_eth_dev_share_data_alloc(void)

rte_eth_dev_shared_data_alloc

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation
  2018-01-18 20:52         ` Matan Azrad
@ 2018-01-18 21:17           ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-18 21:17 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaetan Rivet, Jingjing Wu, dev, Neil Horman, Bruce Richardson,
	Konstantin Ananyev

18/01/2018 21:52, Matan Azrad:
> From: Thomas Monjalon, Thursday, January 18, 2018 10:44 PM
> > 18/01/2018 17:35, Matan Azrad:
> > >  rte_eth_dev_allocate(const char *name)  {
> > >         uint16_t port_id;
> > > -       struct rte_eth_dev *eth_dev;
> > > +       struct rte_eth_dev *eth_dev = NULL;
> > > +
> > > +       /* Synchronize share data one time allocation between local
> > > + threads. */
> > 
> > I don't understand the "one time" part of this comment.
> > Please could you try to rephrase it?
> 
> One-time means this allocation will run only 1 time.
> 
> After the first allocation the pointer is not null, so no calling anymore to this function.

Suggestion:
	Synchronize local threads to allocate shared data only once.
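
For illustration only, the pattern that comment describes can be sketched
outside DPDK as below.  This is a sketch under assumptions: a C11 atomic_flag
stands in for the rte_spinlock_t used in the patch, calloc() stands in for the
real shared memory reservation, and struct shared_data and shared_data_get()
are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the ethdev shared data block. */
struct shared_data {
	unsigned long next_owner_id;
};

static struct shared_data *shared_data;		/* NULL until allocated */
static atomic_flag shared_data_lock = ATOMIC_FLAG_INIT;

/* Returns the shared block, allocating it on the first call only. */
static struct shared_data *shared_data_get(void)
{
	struct shared_data *sd;

	/* Spin until the flag is ours, so one thread at a time runs the check. */
	while (atomic_flag_test_and_set_explicit(&shared_data_lock,
						 memory_order_acquire))
		;
	if (shared_data == NULL) {
		shared_data = calloc(1, sizeof(*shared_data));
		assert(shared_data != NULL);
		shared_data->next_owner_id = 1;
	}
	sd = shared_data;
	atomic_flag_clear_explicit(&shared_data_lock, memory_order_release);
	return sd;
}
```

Every later caller takes the lock, sees a non-NULL pointer and skips the
allocation.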

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 20:21                                               ` Matan Azrad
@ 2018-01-19  1:41                                                 ` Neil Horman
  2018-01-19  7:14                                                   ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19  1:41 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> Hi Neil.
> 
> From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> 
> <snip>
> > > > But thats not all.  The determination of success or failure in
> > > > claiming ownership is largely dependent on the behavior of other
> > > > threads actions, not a function of the state of the system at the moment
> > ownership is requested.
> > > > That is to say, if you have N threads, and they all create ownership
> > > > objects identified as X, x+1, X+2...X+N, only the thread with id X+N
> > > > will be able to claim ownership of any port, because they all will
> > > > have incremented the shared nex_id variable.
> > >
> > > Why? Each one will get its owner id according to some order(The critical
> > section is protected by spinlock).
> > >
> > Yes, and thats my issue here, the ordering.  Perhaps my issue is one of
> > perception.  When I consider an ownership library, what I really think about is
> > mutual exclusion (i.e. guaranteing that only one entity is capable of access to
> > a resource at any one time).  This semantics of this library don't really
> > conform to any semantics that you usually see with other mutual exclusion
> > mechanisms.  That is to say a spinlock or a mutex succedes locking if its prior
> > state is unlocked.  This library succeeds aqusition of the resource it protects if
> > and only if allocation of ownership records occurs in a particular order relative
> > to one another.  That just seems odd to me.  What advantage do these new
> > semantics have over more traditional established semantics?
> > 
> > 
> > > >  Determination of ownership by the programmer will have to be done
> > > > via debugging, and errors will likely be transient dependent on the
> > > > order in which threads execute (subject to scheduling jitter).
> > > >
> > > Yes.
> > >
> > But why put yourself through that pain?  Traditional semantics are far simpler
> > to comprehend, with and without a debugger.
> > 
> 
> Looks like I missed you, sorry:
> Please describe next:
> 
> 1. What exactly do you want to improve?(in details)
> 2. Which API specifically do you want to change(\ part of code)?
> 3. What is the missing in current code(you can answer it in V3 I sent if you want) which should be fixed?
> 
> 
> <snip> sorry for that, I think it is not relevant continue discussion if we are not fully understand each other. So let's start from the beginning "with good order :)" by answering the above questions.


Sure, this seems like a reasonable way to level set.  

I mentioned in another thread that perhaps some of my issue here is perception
regarding what is meant by ownership.  When I think of an ownership api I think
primarily of mutual exclusion (that is to say, enforcement of a single execution
context having access to a resource at any given time).  In my mind the simplest
form of ownership is a spinlock or a mutex.  A single execution context either
does or does not hold the resource at any one time.  Those contexts that attempt
to gain exclusive access to the resource call an api that (depending on
implementation) either blocks continued execution of that thread until exclusive
access to the resource can be granted, or returns immediately with a success or
error indicator to let the caller know if access is granted.

If I were to codify this port ownership api in pseudo code it would look
something like this:

struct rte_eth_dev {

	< eth dev bits >
	rte_spinlock_t owner_lock;	/* protects locked and owner_pid */
	bool locked;			/* true while some context owns the port */
	pid_t owner_pid;		/* meaningful only while locked is true */
};


/* Claim the port; succeeds only if it is currently unowned. */
bool rte_port_claim_ownership(struct rte_eth_dev *dev)
{
	bool ret = false;

	spin_lock(dev->owner_lock);
	if (dev->locked)
		goto out;
	dev->locked = true;
	dev->owner_pid = getpid();
	ret = true;
out:
	spin_unlock(dev->owner_lock);
	return ret;
}


/* Release the port; only the process that owns it may do so. */
bool rte_port_release_ownership(struct rte_eth_dev *dev)
{

	bool ret = false;
	spin_lock(dev->owner_lock);
	if (!dev->locked)
		goto out;
	if (dev->owner_pid != getpid())
		goto out;
	dev->locked = false;
	dev->owner_pid = 0;
	ret = true;
out:
	spin_unlock(dev->owner_lock);
	return ret;
}

/* With pid == 0, report whether the port is owned by anyone at all. */
bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid)
{
	bool ret = false;

	spin_lock(dev->owner_lock);
	if (pid)
		ret = (dev->locked && (pid == dev->owner_pid));
	else
		ret = dev->locked;
	spin_unlock(dev->owner_lock);
	return ret;
}

The idea here is that lock state is isolated from ownership information.  Any
context has the opportunity to lock the resource (in this case the eth port)
regardless of its ownership object.

In comparison, your api, which is in many ways similar, moves the creation
of ownership objects to a separate api call, and that ownership information
embodies state that is integral to the ability to get exclusive access to the
resource.  I.e. if thread A calls your owner_new call, and then thread B calls
owner_new, thread A will never be able to get access to any port unless it calls
owner_new again.

Does that help clarify my position?

Regards
Neil

}

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19  1:41                                                 ` Neil Horman
@ 2018-01-19  7:14                                                   ` Matan Azrad
  2018-01-19  9:30                                                     ` Bruce Richardson
  2018-01-19 13:52                                                     ` Neil Horman
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-19  7:14 UTC (permalink / raw)
  To: Neil Horman
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce


Hi Neil
From: Neil Horman, Friday, January 19, 2018 3:41 AM
> On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > Hi Neil.
> >
> > From: Neil Horman, Thursday, January 18, 2018 8:42 PM

<snip>
> > 1. What exactly do you want to improve?(in details) 2. Which API
> > specifically do you want to change(\ part of code)?
> > 3. What is the missing in current code(you can answer it in V3 I sent if you
> want) which should be fixed?
> >
> >
> > <snip> sorry for that, I think it is not relevant continue discussion if we are
> not fully understand each other. So let's start from the beginning "with good
> order :)" by answering the above questions.
> 
> 
> Sure, this seems like a reasonable way to level set.
> 
> I mentioned in another thread that perhaps some of my issue here is
> perception regarding what is meant by ownership.  When I think of an
> ownership api I think primarily of mutual exclusion (that is to say,
> enforcement of a single execution context having access to a resource at any
> given time.  In my mind the simplest form of ownership is a spinlock or a
> mutex.  A single execution context either does or does not hold the resource
> at any one time.  Those contexts that attempt to gain excusive access to the
> resource call an api that (depending on
> implementation) either block continued execution of that thread until
> exclusive access to the resource can be granted, or returns immediately with
> a success or error indicator to let the caller know if access is granted.
> 
> If I were to codify this port ownership api in pseudo code it would look
> something like this:
> 
> struct rte_eth_dev {
> 
> 	< eth dev bits >
> 	rte_spinlock_t owner_lock;
> 	bool locked;
> 	pid_t owner_pid;
> }
> 
> 
> bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> 	bool ret = false;
> 
> 	spin_lock(dev->owner_lock);
> 	if (dev->locked)
> 		goto out;
> 	dev->locked = true;
> 	dev->owner_pid = getpid();
> 	ret = true;
> out:
> 	spin_unlock(dev->owner_lock);
> 	return ret;
> }
> 
> 
> bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> 
> 	bool ret = false;
> 	spin_lock(dev->owner_lock);
> 	if (!dev->locked)
> 		goto out;
> 	if (dev->owner_pid != getpid())
> 		goto out;
> 	dev->locked = false;
> 	dev->owner_pid = 0;
> 	ret = true;
> out:
> 	spin_unlock(dev->owner_lock);
> 	return ret;
> }
> 
> bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> 	bool ret = false;
> 
> 	spin_lock(dev->owner_lock);
> 	if (pid)
> 		ret = (dev->locked && (pid == dev->owner_pid));
> 	else
> 		ret = dev->locked;
> 	spin_unlock(dev->owner_lock);
> 	return ret;
> }
> 
> The idea here is that lock state is isolated from ownership information.  Any
> context has the opportunity to lock the resource (in this case the eth port)
> despite its ownership object.
> 
> In comparison, your api, which is in may ways simmilar, separates the
> creation of ownership objects to a separate api call, and that ownership
> information embodies state that is integral to the ability to get exclusive
> access to the resource.  I.E. if thread A calls your owner_new call, and then
> thread B calls owner_new, thread A will never be able to get access to any
> port unless it calls owner_new again.
> 
> Does that help clarify my position?

Now I fully understand you, thanks for your patience.

However, your model misses one of the main ideas behind my port ownership proposal:
there can be X>1 different, uncoordinated owners running in the same thread.

For example:
1. Think about testpmd control commands that call a failsafe port devop, which in turn calls its sub-devices' devops. testpmd is one owner (controlling the failsafe port) and failsafe is a different owner (controlling all of its sub-device ports). Both run control commands in the same thread, and they are uncoordinated!
2. Interrupt callbacks: anyone can register them, and all of them are run by the DPDK host thread.

So not every potential owner actually becomes an owner; it depends on the specific implementation.

So if some "part of code" wants to manage a port exclusively, and wants to take ownership of it to prevent any other "part of code" from using this port, it should:
1. Take ownership.
2. Ask itself: am I running in different threads/processes? If yes, it should synchronize its own port management.
3. Release ownership at the end.

Remember that there may be different "parts of code" running in the same thread(s) or process(es).
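
To make the three steps concrete, here is a minimal sketch in C. It is a toy model only: struct toy_port, toy_owner_set(), toy_port_set_mtu() and toy_owner_unset() are hypothetical names invented for this illustration, not the functions from the patch, and the owner id is assumed to be an opaque non-zero integer allocated once per "part of code". Step 2 (synchronizing the owner's own threads) is deliberately left out, since that remains the owner's responsibility.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_NO_OWNER 0u

/* Hypothetical port record, standing in for an ethdev port. */
struct toy_port {
	uint64_t owner; /* TOY_NO_OWNER when nobody manages the port */
	int mtu;
};

/* Step 1: take ownership; fails with -EBUSY if another owner holds the port. */
static int toy_owner_set(struct toy_port *port, uint64_t id)
{
	assert(id != TOY_NO_OWNER);
	if (port->owner != TOY_NO_OWNER && port->owner != id)
		return -EBUSY;
	port->owner = id;
	return 0;
}

/* A management call that only the current owner may make. */
static int toy_port_set_mtu(struct toy_port *port, uint64_t id, int mtu)
{
	if (id == TOY_NO_OWNER || port->owner != id)
		return -EPERM;
	port->mtu = mtu;
	return 0;
}

/* Step 3: release ownership; only the current owner may do so. */
static int toy_owner_unset(struct toy_port *port, uint64_t id)
{
	if (id == TOY_NO_OWNER || port->owner != id)
		return -EPERM;
	port->owner = TOY_NO_OWNER;
	return 0;
}
```

A caller would run toy_owner_set(), make its management calls, and finish with toy_owner_unset(). Any other "part of code", even one in the same thread, gets -EBUSY or -EPERM in the meantime.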

Thanks, Matan.
> 
> Regards
> Neil
> 
> }

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19  7:14                                                   ` Matan Azrad
@ 2018-01-19  9:30                                                     ` Bruce Richardson
  2018-01-19 10:44                                                       ` Matan Azrad
  2018-01-19 12:55                                                       ` Neil Horman
  2018-01-19 13:52                                                     ` Neil Horman
  1 sibling, 2 replies; 214+ messages in thread
From: Bruce Richardson @ 2018-01-19  9:30 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Neil Horman, Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet,
	Wu, Jingjing, dev

On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> 
> Hi Neil
> From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > Hi Neil.
> > >
> > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> 
> <snip>
> > > 1. What exactly do you want to improve?(in details) 2. Which API
> > > specifically do you want to change(\ part of code)?
> > > 3. What is the missing in current code(you can answer it in V3 I sent if you
> > > want) which should be fixed?
> > >
> > >
> > > <snip> sorry for that, I think it is not relevant continue discussion if we are
> > > not fully understand each other. So let's start from the beginning "with good
> > > order :)" by answering the above questions.
> > 
> > 
> > Sure, this seems like a reasonable way to level set.
> > 
> > I mentioned in another thread that perhaps some of my issue here is
> > perception regarding what is meant by ownership.  When I think of an
> > ownership api I think primarily of mutual exclusion (that is to say,
> > enforcement of a single execution context having access to a resource at any
> > given time).  In my mind the simplest form of ownership is a spinlock or a
> > mutex.  A single execution context either does or does not hold the resource
> > at any one time.  Those contexts that attempt to gain exclusive access to the
> > resource call an api that (depending on implementation) either blocks
> > continued execution of that thread until
> > exclusive access to the resource can be granted, or returns immediately with
> > a success or error indicator to let the caller know if access is granted.
> > 
> > If I were to codify this port ownership api in pseudo code it would look
> > something like this:
> > 
> > struct rte_eth_dev {
> > 
> > 	< eth dev bits >
> > 	rte_spinlock_t owner_lock;
> > 	bool locked;
> > 	pid_t owner_pid;
> > }
> > 
As an aside, if you ensure that both locked (or "owned", I think in this
context) and owner_pid are integer values, you can do away with the lock
and use a compare-and-set to take ownership, by setting both atomically
if unmodified from the originally read values.
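
A minimal sketch of that idea, using C11 atomics rather than any DPDK primitive. The assumptions are mine: the "locked" flag and the owner id are folded into one 64-bit word in which 0 means "not owned", and struct toy_port and the toy_port_*() functions are hypothetical stand-ins for the pseudo code quoted here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NO_OWNER UINT64_C(0)

/* Hypothetical stand-in for the ownership fields of struct rte_eth_dev.
 * 0 means "not owned"; any other value is the owner's identifier. */
struct toy_port {
	_Atomic uint64_t owner;
};

/* Claim the port: succeeds only if the word still reads "not owned". */
static bool toy_port_claim(struct toy_port *port, uint64_t owner_id)
{
	uint64_t expected = TOY_NO_OWNER;

	assert(owner_id != TOY_NO_OWNER);
	return atomic_compare_exchange_strong(&port->owner, &expected, owner_id);
}

/* Release the port: succeeds only if the caller is the current owner. */
static bool toy_port_release(struct toy_port *port, uint64_t owner_id)
{
	uint64_t expected = owner_id;

	assert(owner_id != TOY_NO_OWNER);
	return atomic_compare_exchange_strong(&port->owner, &expected, TOY_NO_OWNER);
}

/* Query: with owner_id == 0, report whether anyone owns the port. */
static bool toy_port_is_owned_by(struct toy_port *port, uint64_t owner_id)
{
	uint64_t cur = atomic_load(&port->owner);

	return owner_id ? cur == owner_id : cur != TOY_NO_OWNER;
}
```

Because claim and release are each a single compare-and-exchange, two contexts racing to claim the same port cannot both succeed, and no separate spinlock is needed for the ownership word itself.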

> > 
> > bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> > 	bool ret = false;
> > 
> > 	spin_lock(dev->owner_lock);
> > 	if (dev->locked)
> > 		goto out;
> > 	dev->locked = true;
> > 	dev->owner_pid = getpid();
> > 	ret = true;
> > out:
> > 	spin_unlock(dev->owner_lock);
> > 	return ret;
> > }
> > 
> > 
> > bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> > 
> > 	bool ret = false;
> > 	spin_lock(dev->owner_lock);
> > 	if (!dev->locked)
> > 		goto out;
> > 	if (dev->owner_pid != getpid())
> > 		goto out;
> > 	dev->locked = false;
> > 	dev->owner_pid = 0;
> > 	ret = true;
> > out:
> > 	spin_unlock(dev->owner_lock);
> > 	return ret;
> > }
> > 
> > bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> > 	bool ret = false;
> > 
> > 	spin_lock(dev->owner_lock);
> > 	if (pid)
> > 		ret = (dev->locked && (pid == dev->owner_pid));
> > 	else
> > 		ret = dev->locked;
> > 	spin_unlock(dev->owner_lock);
> > 	return ret;
> > }
> > 
> > The idea here is that lock state is isolated from ownership information.  Any
> > context has the opportunity to lock the resource (in this case the eth port)
> > despite its ownership object.
> > 
> > In comparison, your api, which is in many ways similar, separates the
> > creation of ownership objects to a separate api call, and that ownership
> > information embodies state that is integral to the ability to get exclusive
> > access to the resource.  I.E. if thread A calls your owner_new call, and then
> > thread B calls owner_new, thread A will never be able to get access to any
> > port unless it calls owner_new again.
> > 
> > Does that help clarify my position?
This would have been my understanding of what was being looked for too,
from my minimal understanding of the problem. Thanks for putting that
forward on behalf of many of us!

> 
> Now I fully understand you, thanks for your patience.
> 
> However, your model misses one of the main ideas behind my port ownership proposal:
> there can be X>1 different, uncoordinated owners running in the same thread.

Thanks Matan for taking time to try and explain how your idea differs,
but I for one am still a little confused. Sorry for the late questions.

Sure, Neil's example above takes the pid or thread id as the owner id
parameter, but there is no reason we can't use the same scheme with
arbitrarily assigned owner ids, so long as they are unique. We can even
have a simple mapping table mapping ids to names of components.
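
To illustrate what I mean, a rough sketch of such an id allocator with a name table. Everything here is an assumption for illustration: toy_owner_new(), toy_owner_name() and the fixed-size table are hypothetical, id 0 is reserved for "no owner", and the code is not thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_NO_OWNER   0u
#define TOY_MAX_OWNERS 16u
#define TOY_NAME_LEN   32u

/* Maps an owner id (the array index) to a component name. */
static char toy_owner_names[TOY_MAX_OWNERS][TOY_NAME_LEN];
static uint32_t toy_next_owner_id = 1; /* id 0 is reserved for "no owner" */

/* Hand out the next unused id and record the component name for it.
 * Returns TOY_NO_OWNER when the table is full. Not thread-safe. */
static uint32_t toy_owner_new(const char *name)
{
	uint32_t id;

	assert(name != NULL);
	if (toy_next_owner_id >= TOY_MAX_OWNERS)
		return TOY_NO_OWNER;
	id = toy_next_owner_id++;
	snprintf(toy_owner_names[id], TOY_NAME_LEN, "%s", name);
	return id;
}

/* Look up the component name behind an id, or NULL if it was never assigned. */
static const char *toy_owner_name(uint32_t id)
{
	if (id == TOY_NO_OWNER || id >= toy_next_owner_id)
		return NULL;
	return toy_owner_names[id];
}
```

Each component would call toy_owner_new() once at start-up and use the returned id in place of getpid() in the claim/release functions quoted here.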
> 
> For example:
> 1. Think about testpmd control commands that call a failsafe port devop, which in turn calls its sub-devices' devops. testpmd is one owner (controlling the failsafe port) and failsafe is a different owner (controlling all of its sub-device ports). Both run control commands in the same thread, and they are uncoordinated!
> 2. Interrupt callbacks: anyone can register them, and all of them are run by the DPDK host thread.

Can you provide a little more detail here: what is the specific issue
or conflict in each of these examples and how does your ownership
proposal fix it, when Neil's simpler approach doesn't?

> 
> So not every potential owner actually becomes an owner; it depends on the specific implementation.
> 
> So if some "part of code" wants to manage a port exclusively, and wants to take ownership of it to prevent any other "part of code" from using this port, it should:
> 1. Take ownership.
> 2. Ask itself: am I running in different threads/processes? If yes, it should synchronize its own port management.
> 3. Release ownership at the end.
> 
> Remember that there may be different "parts of code" running in the same thread(s) or process(es).
> 
> Thanks, Matan.
> > 
> > Regards
> > Neil
> > 
> > }

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19  9:30                                                     ` Bruce Richardson
@ 2018-01-19 10:44                                                       ` Matan Azrad
  2018-01-19 13:30                                                         ` Neil Horman
  2018-01-19 12:55                                                       ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-19 10:44 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Neil Horman, Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet,
	Wu, Jingjing, dev

Hi Bruce
From: Bruce Richardson, Friday, January 19, 2018 11:30 AM
> On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> >
> > Hi Neil
> > From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > > Hi Neil.
> > > >
> > > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> >
> > <snip>
> > > > 1. What exactly do you want to improve?(in details) 2. Which API
> > > > specifically do you want to change(\ part of code)?
> > > > 3. What is the missing in current code(you can answer it in V3 I
> > > > sent if you want) which should be fixed?
> > > >
> > > >
> > > > <snip> sorry for that, I think it is not relevant continue
> > > > discussion if we are not fully understand each other. So let's start
> > > > from the beginning "with good order :)" by answering the above
> > > > questions.
> > >
> > >
> > > Sure, this seems like a reasonable way to level set.
> > >
> > > I mentioned in another thread that perhaps some of my issue here is
> > > perception regarding what is meant by ownership.  When I think of an
> > > ownership api I think primarily of mutual exclusion (that is to say,
> > > enforcement of a single execution context having access to a
> > > resource at any given time).  In my mind the simplest form of
> > > ownership is a spinlock or a mutex.  A single execution context
> > > either does or does not hold the resource at any one time.  Those
> > > contexts that attempt to gain exclusive access to the resource call
> > > an api that (depending on implementation) either blocks continued
> > > execution of that thread until exclusive access to the resource can
> > > be granted, or returns immediately with a success or error indicator
> > > to let the caller know if access is granted.
> > >
> > > If I were to codify this port ownership api in pseudo code it would
> > > look something like this:
> > >
> > > struct rte_eth_dev {
> > >
> > > 	< eth dev bits >
> > > 	rte_spinlock_t owner_lock;
> > > 	bool locked;
> > > 	pid_t owner_pid;
> > > }
> > >
> As an aside, if you ensure that both locked (or "owned", I think in this
> context) and owner_pid are integer values, you can do away with the lock
> and use a compare-and-set to take ownership, by setting both atomically if
> unmodified from the originally read values.
> 
> > >
> > > bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> > > 	bool ret = false;
> > >
> > > 	spin_lock(dev->owner_lock);
> > > 	if (dev->locked)
> > > 		goto out;
> > > 	dev->locked = true;
> > > 	dev->owner_pid = getpid();
> > > 	ret = true;
> > > out:
> > > 	spin_unlock(dev->owner_lock);
> > > 	return ret;
> > > }
> > >
> > >
> > > bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> > >
> > > 	bool ret = false;
> > > 	spin_lock(dev->owner_lock);
> > > 	if (!dev->locked)
> > > 		goto out;
> > > 	if (dev->owner_pid != getpid())
> > > 		goto out;
> > > 	dev->locked = false;
> > > 	dev->owner_pid = 0;
> > > 	ret = true;
> > > out:
> > > 	spin_unlock(dev->owner_lock);
> > > 	return ret;
> > > }
> > >
> > > bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> > > 	bool ret = false;
> > >
> > > 	spin_lock(dev->owner_lock);
> > > 	if (pid)
> > > 		ret = (dev->locked && (pid == dev->owner_pid));
> > > 	else
> > > 		ret = dev->locked;
> > > 	spin_unlock(dev->owner_lock);
> > > 	return ret;
> > > }
> > >
> > > The idea here is that lock state is isolated from ownership
> > > information.  Any context has the opportunity to lock the resource
> > > (in this case the eth port) despite its ownership object.
> > >
> > > In comparison, your api, which is in many ways similar, separates
> > > the creation of ownership objects to a separate api call, and that
> > > ownership information embodies state that is integral to the ability
> > > to get exclusive access to the resource.  I.E. if thread A calls
> > > your owner_new call, and then thread B calls owner_new, thread A
> > > will never be able to get access to any port unless it calls
> > > owner_new again.
> > >
> > > Does that help clarify my position?
> This would have been my understanding of what was being looked for too,
> from my minimal understanding of the problem. Thanks for putting that
> forward on behalf of many of us!
> 
> >
> > Now I fully understand you, thanks for your patience.
> >
> > However, your model misses one of the main ideas behind my port
> > ownership proposal: there can be X>1 different, uncoordinated owners
> > running in the same thread.
> 
> Thanks Matan for taking time to try and explain how your idea differs, but I
> for one am still a little confused. Sorry for the late questions.
> 
> Sure, Neil's example above takes the pid or thread id as the owner id
> parameter, but there is no reason we can't use the same scheme with
> arbitrarily assigned owner ids, so long as they are unique. We can even have
> a simple mapping table mapping ids to names of components.
> >
Sorry, I don't understand your point here.
My approach already allocates a unique ID for any "part of code" that wants to manage/use a port.
What is the problem with that, and how do you suggest fixing it?

Neil's approach (with a process ID/thread ID as the owner) is wrong because 2 different owners can run in the same thread (as I explain at length below).

> > For example:
> > 1. Think about testpmd control commands that call a failsafe port
> > devop, which in turn calls its sub-devices' devops. testpmd is one
> > owner (controlling the failsafe port) and failsafe is a different
> > owner (controlling all of its sub-device ports). Both run control
> > commands in the same thread, and they are uncoordinated!
> > 2. Interrupt callbacks: anyone can register them, and all of them are
> > run by the DPDK host thread.
> 
> Can you provide a little more detail here: what is the specific issue or conflict
> in each of these examples and how does your ownership proposal fix it,
> when Neil's simpler approach doesn't?
> 
For the first example:
My approach:
testpmd wants to manage the fail-safe port, so it allocates a unique ID (only once) and calls owner set (with its ID) to take ownership of this port.
If it succeeds in taking ownership, it can manage the port.
The failsafe PMD wants to manage its sub-device ports and follows the same process as testpmd.
Everything is OK.

Neil's approach:
testpmd wants to manage the fail-safe port, so it just needs to claim ownership (set), and its pid is taken as the owner identifier.
The failsafe PMD wants to manage its sub-device ports and follows the same process as testpmd.
But look: these 2 entities run in the same thread, so both set the same pid as the owner identifier. -> problem!

The second example just describes another scenario in which more than one DPDK entity runs in the same thread.
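
The difference between the two schemes can be sketched with a toy model in C. This is not the patch code: struct toy_port, toy_owner_new(), toy_owner_from_pid(), toy_owner_set() and toy_is_owned_by() are hypothetical names, and the model is single-threaded with no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_NO_OWNER 0u

/* Hypothetical port record; not the real struct rte_eth_dev. */
struct toy_port {
	uint64_t owner; /* TOY_NO_OWNER when free */
};

static uint64_t toy_next_id = 1;

/* Scheme 1: every "part of code" gets its own id, allocated once. */
static uint64_t toy_owner_new(void)
{
	return toy_next_id++;
}

/* Scheme 2: the owner identifier is simply the calling process's pid. */
static uint64_t toy_owner_from_pid(void)
{
	return (uint64_t)getpid();
}

/* Take ownership of a free port; single-threaded toy, no locking. */
static bool toy_owner_set(struct toy_port *port, uint64_t id)
{
	assert(id != TOY_NO_OWNER);
	if (port->owner != TOY_NO_OWNER)
		return false;
	port->owner = id;
	return true;
}

/* True only when the port is owned by exactly this id. */
static bool toy_is_owned_by(const struct toy_port *port, uint64_t id)
{
	return port->owner == id;
}
```

If failsafe takes a sub-device port with the id from toy_owner_from_pid(), a later toy_is_owned_by() check made by testpmd with its own pid-derived id also returns true, because both ids are the same pid. With two ids from toy_owner_new(), the same check returns false for testpmd.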

> >
> > So not every potential owner actually becomes an owner; it depends on
> > the specific implementation.
> >
> > So if some "part of code" wants to manage a port exclusively, and wants
> > to take ownership of it to prevent any other "part of code" from using
> > this port, it should:
> > 1. Take ownership.
> > 2. Ask itself: am I running in different threads/processes? If yes, it
> > should synchronize its own port management.
> > 3. Release ownership at the end.
> >
> > Remember that there may be different "parts of code" running in the
> > same thread(s) or process(es).
> >
> > Thanks, Matan.
> > >
> > > Regards
> > > Neil
> > >
> > > }

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-19 12:37       ` Ananyev, Konstantin
  2018-01-19 12:51         ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 12:37 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 4:35 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> Testpmd should not use ethdev ports which are managed by other DPDK
> entities.
> 
> Set Testpmd ownership to each port which is not used by other entity and
> prevent any usage of ethdev ports which are not owned by Testpmd.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
>  app/test-pmd/cmdline_flow.c |  2 +-
>  app/test-pmd/config.c       | 37 ++++++++++---------
>  app/test-pmd/parameters.c   |  4 +-
>  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
>  app/test-pmd/testpmd.h      |  3 ++
>  6 files changed, 103 insertions(+), 95 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 31919ba..6199c64 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
>  			&link_speed) < 0)
>  		return;
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {

Why do we need all these changes?
As I understand it, you changed the definition of RTE_ETH_FOREACH_DEV(),
so now testpmd should work OK by default (the no_owner case).
Am I missing something here?
Konstantin
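
To spell out the assumption behind the question, here is a toy model of how I read the macros. The TOY_* names and the flat owner table are hypothetical, and whether the real RTE_ETH_FOREACH_DEV() is defined this way is exactly what is being asked, so treat this as a sketch only.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_PORTS 8u
#define TOY_NO_OWNER  0u

/* Owner id per port; a toy replacement for the ethdev port table. */
static uint64_t toy_port_owner[TOY_MAX_PORTS];

/* Return the first port id >= start owned by owner_id, or TOY_MAX_PORTS. */
static unsigned int toy_find_next_owned_by(unsigned int start, uint64_t owner_id)
{
	assert(start <= TOY_MAX_PORTS);
	while (start < TOY_MAX_PORTS && toy_port_owner[start] != owner_id)
		start++;
	return start;
}

/* Visit only the ports owned by owner id o. */
#define TOY_FOREACH_DEV_OWNED_BY(p, o) \
	for ((p) = toy_find_next_owned_by(0, (o)); \
	     (p) < TOY_MAX_PORTS; \
	     (p) = toy_find_next_owned_by((p) + 1, (o)))

/* The plain iterator then visits only ports that nobody owns. */
#define TOY_FOREACH_DEV(p) TOY_FOREACH_DEV_OWNED_BY(p, TOY_NO_OWNER)

/* Count the ports visible to a given owner id. */
static unsigned int toy_count_owned_by(uint64_t owner_id)
{
	unsigned int p, n = 0;

	TOY_FOREACH_DEV_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

Under this reading, an application that never takes ownership and keeps using the plain iterator already skips every port claimed by another entity.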

>  		ports[pid].dev_conf.link_speeds = link_speed;
>  	}
> 
> @@ -1902,7 +1902,7 @@ struct cmd_config_rss {
>  	struct cmd_config_rss *res = parsed_result;
>  	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
>  	int diag;
> -	uint8_t i;
> +	uint16_t pid;
> 
>  	if (!strcmp(res->value, "all"))
>  		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
> @@ -1936,12 +1936,12 @@ struct cmd_config_rss {
>  		return;
>  	}
>  	rss_conf.rss_key = NULL;
> -	for (i = 0; i < rte_eth_dev_count(); i++) {
> -		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> +		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
>  		if (diag < 0)
>  			printf("Configuration of RSS hash at ethernet port %d "
>  				"failed with error (%d): %s.\n",
> -				i, -diag, strerror(-diag));
> +				pid, -diag, strerror(-diag));
>  	}
>  }
> 
> @@ -3686,10 +3686,9 @@ struct cmd_csum_result {
>  	uint64_t csum_offloads = 0;
>  	struct rte_eth_dev_info dev_info;
> 
> -	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
> -		printf("invalid port %d\n", res->port_id);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> +
>  	if (!port_is_stopped(res->port_id)) {
>  		printf("Please stop port %d first\n", res->port_id);
>  		return;
> @@ -4364,8 +4363,8 @@ struct cmd_gso_show_result {
>  {
>  	struct cmd_gso_show_result *res = parsed_result;
> 
> -	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
> -		printf("invalid port id %u\n", res->cmd_pid);
> +	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
> +		printf("invalid/not owned port id %u\n", res->cmd_pid);
>  		return;
>  	}
>  	if (!strcmp(res->cmd_keyword, "gso")) {
> @@ -5375,7 +5374,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
>  				port_id);
> 
>  		/* Update number of ports */
> -		nb_ports = rte_eth_dev_count();
> +		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
> +			printf("Error: cannot own new attached port %d\n",
> +			       port_id);
> +			return;
> +		}
> +		nb_ports++;
>  		reconfig(port_id, res->socket);
>  		rte_eth_promiscuous_enable(port_id);
>  	}
> @@ -5484,10 +5488,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
>  	struct cmd_set_bond_mon_period_result *res = parsed_result;
>  	int ret;
> 
> -	if (res->port_num >= nb_ports) {
> -		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
> +	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
>  		return;
> -	}
> 
>  	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
> 
> @@ -5545,11 +5547,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
>  	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
>  	uint8_t policy = AGG_BANDWIDTH;
> 
> -	if (res->port_num >= nb_ports) {
> -		printf("Port id %d must be less than %d\n",
> -				res->port_num, nb_ports);
> +	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
>  		return;
> -	}
> 
>  	if (!strcmp(res->policy, "bandwidth"))
>  		policy = AGG_BANDWIDTH;
> @@ -5808,7 +5807,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
> 
>  	/* all ports */
>  	if (allports) {
> -		RTE_ETH_FOREACH_DEV(i) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
>  			if (enable)
>  				rte_eth_promiscuous_enable(i);
>  			else
> @@ -5888,7 +5887,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
> 
>  	/* all ports */
>  	if (allports) {
> -		RTE_ETH_FOREACH_DEV(i) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
>  			if (enable)
>  				rte_eth_allmulticast_enable(i);
>  			else
> @@ -6622,31 +6621,31 @@ static void cmd_showportall_parsed(void *parsed_result,
>  	struct cmd_showportall_result *res = parsed_result;
>  	if (!strcmp(res->show, "clear")) {
>  		if (!strcmp(res->what, "stats"))
> -			RTE_ETH_FOREACH_DEV(i)
> +			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  				nic_stats_clear(i);
>  		else if (!strcmp(res->what, "xstats"))
> -			RTE_ETH_FOREACH_DEV(i)
> +			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  				nic_xstats_clear(i);
>  	} else if (!strcmp(res->what, "info"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			port_infos_display(i);
>  	else if (!strcmp(res->what, "stats"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			nic_stats_display(i);
>  	else if (!strcmp(res->what, "xstats"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			nic_xstats_display(i);
>  	else if (!strcmp(res->what, "fdir"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			fdir_get_infos(i);
>  	else if (!strcmp(res->what, "stat_qmap"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			nic_stats_mapping_display(i);
>  	else if (!strcmp(res->what, "dcb_tc"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			port_dcb_info_display(i);
>  	else if (!strcmp(res->what, "cap"))
> -		RTE_ETH_FOREACH_DEV(i)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
>  			port_offload_cap_display(i);
>  }
> 
> @@ -10698,10 +10697,8 @@ struct cmd_flow_director_mask_result {
>  	struct rte_eth_fdir_masks *mask;
>  	struct rte_port *port;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	port = &ports[res->port_id];
>  	/** Check if the port is not started **/
> @@ -10899,10 +10896,8 @@ struct cmd_flow_director_flex_mask_result {
>  	uint16_t i;
>  	int ret;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	port = &ports[res->port_id];
>  	/** Check if the port is not started **/
> @@ -11053,12 +11048,10 @@ struct cmd_flow_director_flexpayload_result {
>  	struct cmd_flow_director_flexpayload_result *res = parsed_result;
>  	struct rte_eth_flex_payload_cfg flex_cfg;
>  	struct rte_port *port;
> -	int ret = 0;
> +	int ret;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	port = &ports[res->port_id];
>  	/** Check if the port is not started **/
> @@ -11774,7 +11767,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
>  	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
>  	entry.ether_type = res->eth_type_val;
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
>  	}
>  }
> @@ -11890,7 +11883,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
>  	else
>  		en = 0;
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  		rte_eth_dev_l2_tunnel_offload_set(pid,
>  						  &entry,
>  						  ETH_L2_TUNNEL_ENABLE_MASK,
> @@ -14440,10 +14433,8 @@ struct cmd_ddp_add_result {
>  	int file_num;
>  	int ret = -ENOTSUP;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	if (!all_ports_stopped()) {
>  		printf("Please stop all ports first\n");
> @@ -14522,10 +14513,8 @@ struct cmd_ddp_del_result {
>  	uint32_t size;
>  	int ret = -ENOTSUP;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	if (!all_ports_stopped()) {
>  		printf("Please stop all ports first\n");
> @@ -14837,10 +14826,8 @@ struct cmd_ddp_get_list_result {
>  #endif
>  	int ret = -ENOTSUP;
> 
> -	if (res->port_id > nb_ports) {
> -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  #ifdef RTE_LIBRTE_I40E_PMD
>  	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
> @@ -16296,7 +16283,7 @@ struct cmd_cmdfile_result {
>  	if (id == (portid_t)RTE_PORT_ALL) {
>  		portid_t pid;
> 
> -		RTE_ETH_FOREACH_DEV(pid) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  			/* check if need_reconfig has been set to 1 */
>  			if (ports[pid].need_reconfig == 0)
>  				ports[pid].need_reconfig = dev;
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 561e057..e55490f 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
> 
>  	(void)ctx;
>  	(void)token;
> -	RTE_ETH_FOREACH_DEV(p) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
>  		if (buf && i == ent)
>  			return snprintf(buf, size, "%u", p);
>  		++i;
> diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> index 957b820..43b9a7d 100644
> --- a/app/test-pmd/config.c
> +++ b/app/test-pmd/config.c
> @@ -156,7 +156,7 @@ struct rss_type_info {
> 
>  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
>  		printf("Valid port range is [0");
> -		RTE_ETH_FOREACH_DEV(pid)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  			printf(", %d", pid);
>  		printf("]\n");
>  		return;
> @@ -236,7 +236,7 @@ struct rss_type_info {
> 
>  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
>  		printf("Valid port range is [0");
> -		RTE_ETH_FOREACH_DEV(pid)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  			printf(", %d", pid);
>  		printf("]\n");
>  		return;
> @@ -253,10 +253,9 @@ struct rss_type_info {
>  	struct rte_eth_xstat_name *xstats_names;
> 
>  	printf("###### NIC extended statistics for port %-2d\n", port_id);
> -	if (!rte_eth_dev_is_valid_port(port_id)) {
> -		printf("Error: Invalid port number %i\n", port_id);
> +
> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
>  		return;
> -	}
> 
>  	/* Get count */
>  	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
> @@ -321,7 +320,7 @@ struct rss_type_info {
> 
>  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
>  		printf("Valid port range is [0");
> -		RTE_ETH_FOREACH_DEV(pid)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  			printf(", %d", pid);
>  		printf("]\n");
>  		return;
> @@ -439,7 +438,7 @@ struct rss_type_info {
> 
>  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
>  		printf("Valid port range is [0");
> -		RTE_ETH_FOREACH_DEV(pid)
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  			printf(", %d", pid);
>  		printf("]\n");
>  		return;
> @@ -756,10 +755,15 @@ struct rss_type_info {
>  int
>  port_id_is_invalid(portid_t port_id, enum print_warning warning)
>  {
> +	struct rte_eth_dev_owner owner;
> +	int ret;
> +
>  	if (port_id == (portid_t)RTE_PORT_ALL)
>  		return 0;
> 
> -	if (rte_eth_dev_is_valid_port(port_id))
> +	ret = rte_eth_dev_owner_get(port_id, &owner);
> +
> +	if (ret == 0 && owner.id == my_owner.id)
>  		return 0;
> 
>  	if (warning == ENABLED_WARN)
> @@ -2373,7 +2377,7 @@ struct igb_ring_desc_16_bytes {
>  		return;
>  	}
>  	nb_pt = 0;
> -	RTE_ETH_FOREACH_DEV(i) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
>  		if (! ((uint64_t)(1ULL << i) & portmask))
>  			continue;
>  		portlist[nb_pt++] = i;
> @@ -2512,10 +2516,9 @@ struct igb_ring_desc_16_bytes {
>  void
>  setup_gro(const char *onoff, portid_t port_id)
>  {
> -	if (!rte_eth_dev_is_valid_port(port_id)) {
> -		printf("invalid port id %u\n", port_id);
> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
>  		return;
> -	}
> +
>  	if (test_done == 0) {
>  		printf("Before enable/disable GRO,"
>  				" please stop forwarding first\n");
> @@ -2574,10 +2577,9 @@ struct igb_ring_desc_16_bytes {
> 
>  	param = &gro_ports[port_id].param;
> 
> -	if (!rte_eth_dev_is_valid_port(port_id)) {
> -		printf("Invalid port id %u.\n", port_id);
> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
>  		return;
> -	}
> +
>  	if (gro_ports[port_id].enable) {
>  		printf("GRO type: TCP/IPv4\n");
>  		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
> @@ -2595,10 +2597,9 @@ struct igb_ring_desc_16_bytes {
>  void
>  setup_gso(const char *mode, portid_t port_id)
>  {
> -	if (!rte_eth_dev_is_valid_port(port_id)) {
> -		printf("invalid port id %u\n", port_id);
> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
>  		return;
> -	}
> +
>  	if (strcmp(mode, "on") == 0) {
>  		if (test_done == 0) {
>  			printf("before enabling GSO,"
> diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> index 878c112..0e57b46 100644
> --- a/app/test-pmd/parameters.c
> +++ b/app/test-pmd/parameters.c
> @@ -398,7 +398,7 @@
>  		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
>  			port_id == (portid_t)RTE_PORT_ALL) {
>  			printf("Valid port range is [0");
> -			RTE_ETH_FOREACH_DEV(pid)
> +			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  				printf(", %d", pid);
>  			printf("]\n");
>  			return -1;
> @@ -459,7 +459,7 @@
>  		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
>  			port_id == (portid_t)RTE_PORT_ALL) {
>  			printf("Valid port range is [0");
> -			RTE_ETH_FOREACH_DEV(pid)
> +			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
>  				printf(", %d", pid);
>  			printf("]\n");
>  			return -1;
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index c066cf9..83f5e84 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -108,6 +108,11 @@
>  lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
> 
>  /*
> + * My port owner structure used to own Ethernet ports.
> + */
> +struct rte_eth_dev_owner my_owner; /**< Unique owner. */
> +
> +/*
>   * Test Forwarding Configuration.
>   *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
>   *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
> @@ -449,7 +454,7 @@ static int eth_event_callback(portid_t port_id,
>  	portid_t pt_id;
>  	int i = 0;
> 
> -	RTE_ETH_FOREACH_DEV(pt_id)
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
>  		fwd_ports_ids[i++] = pt_id;
> 
>  	nb_cfg_ports = nb_ports;
> @@ -573,7 +578,7 @@ static int eth_event_callback(portid_t port_id,
>  		fwd_lcores[lc_id]->cpuid_idx = lc_id;
>  	}
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  		port = &ports[pid];
>  		/* Apply default Tx configuration for all ports */
>  		port->dev_conf.txmode = tx_mode;
> @@ -706,7 +711,7 @@ static int eth_event_callback(portid_t port_id,
>  	queueid_t q;
> 
>  	/* set socket id according to numa or not */
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  		port = &ports[pid];
>  		if (nb_rxq > port->dev_info.max_rx_queues) {
>  			printf("Fail: nb_rxq(%d) is greater than "
> @@ -1000,9 +1005,8 @@ static int eth_event_callback(portid_t port_id,
>  	uint64_t tics_per_1sec;
>  	uint64_t tics_datum;
>  	uint64_t tics_current;
> -	uint8_t idx_port, cnt_ports;
> +	uint16_t idx_port;
> 
> -	cnt_ports = rte_eth_dev_count();
>  	tics_datum = rte_rdtsc();
>  	tics_per_1sec = rte_get_timer_hz();
>  #endif
> @@ -1017,11 +1021,10 @@ static int eth_event_callback(portid_t port_id,
>  			tics_current = rte_rdtsc();
>  			if (tics_current - tics_datum >= tics_per_1sec) {
>  				/* Periodic bitrate calculation */
> -				for (idx_port = 0;
> -						idx_port < cnt_ports;
> -						idx_port++)
> +				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
> +							     my_owner.id)
>  					rte_stats_bitrate_calc(bitrate_data,
> -						idx_port);
> +							       idx_port);
>  				tics_datum = tics_current;
>  			}
>  		}
> @@ -1359,7 +1362,7 @@ static int eth_event_callback(portid_t port_id,
>  	portid_t pi;
>  	struct rte_port *port;
> 
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		port = &ports[pi];
>  		/* Check if there is a port which is not started */
>  		if ((port->port_status != RTE_PORT_STARTED) &&
> @@ -1387,7 +1390,7 @@ static int eth_event_callback(portid_t port_id,
>  {
>  	portid_t pi;
> 
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		if (!port_is_stopped(pi))
>  			return 0;
>  	}
> @@ -1434,7 +1437,7 @@ static int eth_event_callback(portid_t port_id,
> 
>  	if(dcb_config)
>  		dcb_test = 1;
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
>  			continue;
> 
> @@ -1620,7 +1623,7 @@ static int eth_event_callback(portid_t port_id,
> 
>  	printf("Stopping ports...\n");
> 
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
>  			continue;
> 
> @@ -1663,7 +1666,7 @@ static int eth_event_callback(portid_t port_id,
> 
>  	printf("Closing ports...\n");
> 
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
>  			continue;
> 
> @@ -1714,7 +1717,7 @@ static int eth_event_callback(portid_t port_id,
> 
>  	printf("Resetting ports...\n");
> 
> -	RTE_ETH_FOREACH_DEV(pi) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
>  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
>  			continue;
> 
> @@ -1759,6 +1762,12 @@ static int eth_event_callback(portid_t port_id,
>  	if (rte_eth_dev_attach(identifier, &pi))
>  		return;
> 
> +	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
> +		printf("Error: cannot own new attached port %d\n", pi);
> +		return;
> +	}
> +	nb_ports++;
> +
>  	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
>  	/* if socket_id is invalid, set to 0 */
>  	if (check_socket_id(socket_id) < 0)
> @@ -1766,8 +1775,6 @@ static int eth_event_callback(portid_t port_id,
>  	reconfig(pi, socket_id);
>  	rte_eth_promiscuous_enable(pi);
> 
> -	nb_ports = rte_eth_dev_count();
> -
>  	ports[pi].port_status = RTE_PORT_STOPPED;
> 
>  	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
> @@ -1781,6 +1788,9 @@ static int eth_event_callback(portid_t port_id,
> 
>  	printf("Detaching a port...\n");
> 
> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> +		return;
> +
>  	if (!port_is_closed(port_id)) {
>  		printf("Please close port first\n");
>  		return;
> @@ -1794,7 +1804,7 @@ static int eth_event_callback(portid_t port_id,
>  		return;
>  	}
> 
> -	nb_ports = rte_eth_dev_count();
> +	nb_ports--;
> 
>  	printf("Port '%s' is detached. Now total ports is %d\n",
>  			name, nb_ports);
> @@ -1812,7 +1822,7 @@ static int eth_event_callback(portid_t port_id,
> 
>  	if (ports != NULL) {
>  		no_link_check = 1;
> -		RTE_ETH_FOREACH_DEV(pt_id) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
>  			printf("\nShutting down port %d...\n", pt_id);
>  			fflush(stdout);
>  			stop_port(pt_id);
> @@ -1844,7 +1854,7 @@ struct pmd_test_command {
>  	fflush(stdout);
>  	for (count = 0; count <= MAX_CHECK_TIME; count++) {
>  		all_ports_up = 1;
> -		RTE_ETH_FOREACH_DEV(portid) {
> +		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
>  			if ((port_mask & (1 << portid)) == 0)
>  				continue;
>  			memset(&link, 0, sizeof(link));
> @@ -1936,6 +1946,8 @@ struct pmd_test_command {
> 
>  	switch (type) {
>  	case RTE_ETH_EVENT_INTR_RMV:
> +		if (port_id_is_invalid(port_id, ENABLED_WARN))
> +			break;
>  		if (rte_eal_alarm_set(100000,
>  				rmv_event_callback, (void *)(intptr_t)port_id))
>  			fprintf(stderr, "Could not set up deferred device removal\n");
> @@ -2068,7 +2080,7 @@ struct pmd_test_command {
>  	portid_t pid;
>  	struct rte_port *port;
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
>  		port = &ports[pid];
>  		port->dev_conf.fdir_conf = fdir_conf;
>  		if (nb_rxq > 1) {
> @@ -2383,7 +2395,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
>  	rte_pdump_init(NULL);
>  #endif
> 
> -	nb_ports = (portid_t) rte_eth_dev_count();
> +	if (rte_eth_dev_owner_new(&my_owner.id))
> +		rte_panic("Failed to get unique owner identifier\n");
> +	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
> +	RTE_ETH_FOREACH_DEV(port_id)
> +		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
> +			nb_ports++;
>  	if (nb_ports == 0)
>  		TESTPMD_LOG(WARNING, "No probed ethernet devices\n");
> 
> @@ -2431,7 +2448,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
>  		rte_exit(EXIT_FAILURE, "Start ports failed\n");
> 
>  	/* set all ports to promiscuous mode by default */
> -	RTE_ETH_FOREACH_DEV(port_id)
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
>  		rte_eth_promiscuous_enable(port_id);
> 
>  	/* Init metrics library */
> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> index 9c739e5..2d253b9 100644
> --- a/app/test-pmd/testpmd.h
> +++ b/app/test-pmd/testpmd.h
> @@ -50,6 +50,8 @@
>  #define NUMA_NO_CONFIG 0xFF
>  #define UMA_NO_CONFIG  0xFF
> 
> +#define TESTPMD_OWNER_NAME "TestPMD"
> +
>  typedef uint8_t  lcoreid_t;
>  typedef uint16_t portid_t;
>  typedef uint16_t queueid_t;
> @@ -361,6 +363,7 @@ struct queue_stats_mappings {
>   * nb_fwd_ports <= nb_cfg_ports <= nb_ports
>   */
>  extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
> +extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
>  extern portid_t nb_cfg_ports; /**< Number of configured ports. */
>  extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
>  extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
  2018-01-18 17:00       ` Thomas Monjalon
@ 2018-01-19 12:38       ` Ananyev, Konstantin
  2018-03-05 11:24       ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
  2 siblings, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 12:38 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, stable



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 4:35 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; stable@dpdk.org
> Subject: [PATCH v3 1/7] ethdev: fix port data reset timing
> 
> The rte_eth_dev_data structure is allocated per ethdev port and is
> used internally to access the port's data.
> 
> rte_eth_dev_attach_secondary finds the port identifier by comparing
> the rte_eth_dev_data name field, so it may return the identifier of an
> invalid port if that port was released by the primary process, because
> the port release API does not reset the port data.
> 
> So it is better to reset the port data at release time instead of at
> allocation time.
> 
> Move the port data reset to the port release API.
> 
> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  lib/librte_ether/rte_ethdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7044159..156231c 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>  		return NULL;
>  	}
> 
> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
>  	eth_dev = eth_dev_get(port_id);
>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
>  	eth_dev->data->port_id = port_id;
> @@ -252,6 +251,7 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
> 
> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
>  	eth_dev->state = RTE_ETH_DEV_UNUSED;
> 
>  	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
> --
> 1.8.3.1

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
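
For readers following the thread, here is a minimal sketch of why the reset belongs in the release path. It is a simplified model, not the ethdev code: `shared_data`, `port_allocate`, `port_release` and `port_find_by_name` are hypothetical stand-ins for the shared rte_eth_dev_data array and the allocate/release/attach-secondary paths.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN  64

/* Simplified stand-in for struct rte_eth_dev_data (shared memory). */
struct port_data {
	char name[NAME_LEN];
	unsigned int port_id;
};

static struct port_data shared_data[MAX_PORTS];

/* Allocation fills the slot; it relies on release having cleared it. */
static void port_allocate(unsigned int id, const char *name)
{
	snprintf(shared_data[id].name, NAME_LEN, "%s", name);
	shared_data[id].port_id = id;
}

/* Release clears the slot so a stale name cannot be matched later. */
static void port_release(unsigned int id)
{
	memset(&shared_data[id], 0, sizeof(shared_data[id]));
}

/* Name lookup, as a secondary process would do it; -1 when not found. */
static int port_find_by_name(const char *name)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; i++)
		if (strcmp(shared_data[i].name, name) == 0)
			return (int)i;
	return -1;
}
```

If the memset were done only at allocation time, the name of a released port would stay in the shared slot and the lookup would still match it.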

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation Matan Azrad
  2018-01-18 17:00       ` Thomas Monjalon
@ 2018-01-19 12:40       ` Ananyev, Konstantin
  2018-01-20 16:48         ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 12:40 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, stable



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 4:35 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; stable@dpdk.org
> Subject: [PATCH v3 2/7] ethdev: fix used portid allocation
> 
> rte_eth_dev_find_free_port() found a free port by state checking.
> The state field is in local process memory, so other DPDK processes
> may get the same port ID because their local states may differ.
> 
> Replace the state check with a check of the ethdev port name:
> if the name is an empty string, the port ID is detected as unused.
> 
> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
> Cc: stable@dpdk.org
> 
> Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  lib/librte_ether/rte_ethdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 156231c..5d87f72 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -164,7 +164,7 @@ struct rte_eth_dev *
>  	unsigned i;
> 
>  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> -		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
> +		if (rte_eth_dev_share_data->data[i].name[0] == '\0')

I know it is not really necessary, but I'd keep both (just in case):
if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED &&
    rte_eth_dev_share_data->data[i].name[0] == '\0')

Apart from that: Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

>  			return i;
>  	}
>  	return RTE_MAX_ETHPORTS;
> --
> 1.8.3.1
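
The combined check suggested in this reply can be sketched as follows. This is a simplified model under stated assumptions: `local_devs` stands in for the per-process rte_eth_devices array, `shared_ports` for the shared port data, and all names and sizes are hypothetical.

```c
#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */

enum dev_state { DEV_UNUSED = 0, DEV_ATTACHED };

struct local_dev { enum dev_state state; };   /* per-process memory */
struct shared_port { char name[64]; };        /* shared memory */

static struct local_dev local_devs[MAX_PORTS];
static struct shared_port shared_ports[MAX_PORTS];

/*
 * Return the first port ID that is unused both locally (state) and
 * globally (empty name), or MAX_PORTS when every slot is taken.
 */
static unsigned int find_free_port(void)
{
	unsigned int i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (local_devs[i].state == DEV_UNUSED &&
		    shared_ports[i].name[0] == '\0')
			return i;
	}
	return MAX_PORTS;
}
```

The name test is what makes the result consistent across processes; the state test is the extra local safeguard proposed above.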

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership Matan Azrad
  2018-01-18 21:11       ` Thomas Monjalon
@ 2018-01-19 12:41       ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 12:41 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 4:35 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v3 3/7] ethdev: add port ownership
> 
> The ownership of a port is implicit in DPDK.
> Making it explicit is better for the following reasons:
> 1. It will define well who is in charge of the port usage synchronization.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also, in the fail-safe case, an issue was encountered in testpmd.
> We need to check that the application is not trying to use a port which
> is already managed by fail-safe.
> 
> A port owner is built from an owner ID (number) and an owner name
> (string). The owner ID must be unique, to distinguish between two
> identical entity instances; the owner name can be any name.
> The name helps different DPDK entities recognize the owner logically
> and makes debugging easier.
> Each DPDK entity can allocate a unique owner identifier and use it,
> with its preferred name, to own valid ethdev ports.
> Each DPDK entity can get the owner status of any port to decide whether
> it can manage the port or not.
> 
> The mechanism is synchronized for both the primary process threads and
> the secondary process threads, to allow a secondary process entity to
> be a port owner.
> 
> Add a synchronized ownership mechanism to DPDK Ethernet devices to
> avoid multiple management of a device by different DPDK entities.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
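
The owner model described in the commit message (unique ID plus free-form name) can be sketched as below. This is an illustrative model only: `owner_new`, `owner_set` and `owner_get` are hypothetical helpers that mirror the roles of rte_eth_dev_owner_new/set/get as used in this series, and the locking the patch adds is left out.

```c
#include <stdint.h>

#define MAX_PORTS 4          /* hypothetical table size */
#define NO_OWNER  0          /* ID meaning "nobody owns this port" */
#define OWNER_NAME_LEN 64

struct owner {
	uint64_t id;                 /* unique per owner instance */
	char name[OWNER_NAME_LEN];   /* any name, for logs and debug */
};

static struct owner port_owner[MAX_PORTS];
static uint64_t next_owner_id = NO_OWNER + 1;

/* Hand out a fresh, never reused owner ID. */
static int owner_new(uint64_t *id)
{
	*id = next_owner_id++;
	return 0;
}

/* Take ownership of a port; fails if the port already has an owner. */
static int owner_set(unsigned int port, const struct owner *o)
{
	if (port >= MAX_PORTS || o->id == NO_OWNER)
		return -1;
	if (port_owner[port].id != NO_OWNER)
		return -1;
	port_owner[port] = *o;
	return 0;
}

/* Report the current owner so an entity can decide whether to use it. */
static int owner_get(unsigned int port, struct owner *out)
{
	if (port >= MAX_PORTS)
		return -1;
	*out = port_owner[port];
	return 0;
}
```

Two instances of the same entity may share a name, but they get different IDs, so a second owner_set() on an owned port is rejected.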

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation Matan Azrad
  2018-01-18 20:43       ` Thomas Monjalon
@ 2018-01-19 12:47       ` Ananyev, Konstantin
  1 sibling, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 12:47 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 18, 2018 4:35 PM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [PATCH v3 4/7] ethdev: synchronize port allocation
> 
> Ethernet port allocation was not thread safe: two threads that tried
> to allocate a new port at the same time could get an identical port
> identifier and cause a memory overwrite.
> In fact, none of the port configuration was thread safe from the ethdev
> point of view.
> 
> The port ownership mechanism added to ethdev is a good opportunity to
> redefine the synchronization rules in ethdev:
> 
> 1. The port allocation and port release synchronization will be
>    managed by ethdev.
> 2. The port usage synchronization will be managed by the port owner.
> 3. The port ownership synchronization will be managed by ethdev.
> 
> Add port allocation synchronization to complete the new rules.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
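
Rule 1 above (allocation and release synchronized by ethdev) can be sketched with a single lock around the find-and-claim step. This is a simplified model: the C11 `atomic_flag` spinlock is an assumption standing in for whatever lock the patch places in shared memory, and `port_allocate`/`port_release` are hypothetical helpers.

```c
#include <stdatomic.h>
#include <stdio.h>

#define MAX_PORTS 4   /* hypothetical table size */
#define NAME_LEN  64

static char port_names[MAX_PORTS][NAME_LEN];
static atomic_flag alloc_lock = ATOMIC_FLAG_INIT;

static void alloc_lock_take(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&alloc_lock,
						 memory_order_acquire))
		;
}

static void alloc_lock_give(void)
{
	atomic_flag_clear_explicit(&alloc_lock, memory_order_release);
}

/*
 * Find a free slot and claim it under the lock, so two callers can
 * never be handed the same port ID. Returns -1 when the table is full.
 */
static int port_allocate(const char *name)
{
	int id = -1;
	unsigned int i;

	alloc_lock_take();
	for (i = 0; i < MAX_PORTS; i++) {
		if (port_names[i][0] == '\0') {
			snprintf(port_names[i], NAME_LEN, "%s", name);
			id = (int)i;
			break;
		}
	}
	alloc_lock_give();
	return id;
}

/* Release under the same lock so it cannot race with allocation. */
static void port_release(unsigned int id)
{
	alloc_lock_take();
	port_names[id][0] = '\0';
	alloc_lock_give();
}
```

The point of the design is that the search and the claim happen in one critical section; usage of the port afterwards is left to its owner, as rule 2 states.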

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 12:37       ` Ananyev, Konstantin
@ 2018-01-19 12:51         ` Matan Azrad
  2018-01-19 13:08           ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-19 12:51 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Thursday, January 18, 2018 4:35 PM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> >
> > Testpmd should not use ethdev ports which are managed by other DPDK
> > entities.
> >
> > Set Testpmd ownership on each port that is not used by another entity
> > and prevent any use of ethdev ports that are not owned by Testpmd.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
> >  app/test-pmd/cmdline_flow.c |  2 +-
> >  app/test-pmd/config.c       | 37 ++++++++++---------
> >  app/test-pmd/parameters.c   |  4 +-
> >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> >  app/test-pmd/testpmd.h      |  3 ++
> >  6 files changed, 103 insertions(+), 95 deletions(-)
> >
> > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> > index 31919ba..6199c64 100644
> > --- a/app/test-pmd/cmdline.c
> > +++ b/app/test-pmd/cmdline.c
> > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> >  			&link_speed) < 0)
> >  		return;
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> 
> Why do we need all these changes?
> As I understand it, you changed the definition of RTE_ETH_FOREACH_DEV(),
> so now testpmd should work OK by default (the no_owner case).
> Am I missing something here?

Now, following Gaetan's suggestion, RTE_ETH_FOREACH_DEV(pid) iterates over all valid, ownerless ports.
Here Testpmd wants to iterate over the ports it owns.

I added to Testpmd the ability to take ownership of ports, as the new ownership and synchronization rules suggest.
Since Testpmd is a DPDK entity that wants no one else to touch its ports,
it must allocate a unique owner ID, set itself as the owner of its ports (see the main function), and recognize them by that owner ID.
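
To illustrate the difference between the two iterators, here is a minimal sketch of an owner-filtered loop. It is a model only: `port_owner_id`, `next_owned` and `FOREACH_OWNED_BY` are hypothetical stand-ins, and the real macro also has to skip ports that are not valid.

```c
#include <stdint.h>

#define MAX_PORTS 4   /* hypothetical table size */
#define NO_OWNER  0   /* ID of an ownerless port */

static uint64_t port_owner_id[MAX_PORTS];

/* Next port at or after 'from' owned by 'owner'; MAX_PORTS if none. */
static unsigned int next_owned(unsigned int from, uint64_t owner)
{
	while (from < MAX_PORTS && port_owner_id[from] != owner)
		from++;
	return from;
}

/* Visit only the ports whose owner ID matches 'o'. */
#define FOREACH_OWNED_BY(p, o) \
	for ((p) = next_owned(0, (o)); (p) < MAX_PORTS; \
	     (p) = next_owned((p) + 1, (o)))

/* Example consumer: count the ports belonging to one owner. */
static unsigned int count_owned(uint64_t owner)
{
	unsigned int p, n = 0;

	FOREACH_OWNED_BY(p, owner)
		n++;
	return n;
}
```

In this model, iterating with NO_OWNER gives the ownerless ports, which corresponds to what the plain iterator does after the change, while an application passes its own ID to see only its ports.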

> Konstantin
> 
> >  		ports[pid].dev_conf.link_speeds = link_speed;
> >  	}
> >
> > @@ -1902,7 +1902,7 @@ struct cmd_config_rss {
> >  	struct cmd_config_rss *res = parsed_result;
> >  	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
> >  	int diag;
> > -	uint8_t i;
> > +	uint16_t pid;
> >
> >  	if (!strcmp(res->value, "all"))
> >  		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
> > @@ -1936,12 +1936,12 @@ struct cmd_config_rss {
> >  		return;
> >  	}
> >  	rss_conf.rss_key = NULL;
> > -	for (i = 0; i < rte_eth_dev_count(); i++) {
> > -		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > +		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
> >  		if (diag < 0)
> >  			printf("Configuration of RSS hash at ethernet port %d "
> >  				"failed with error (%d): %s.\n",
> > -				i, -diag, strerror(-diag));
> > +				pid, -diag, strerror(-diag));
> >  	}
> >  }
> >
> > @@ -3686,10 +3686,9 @@ struct cmd_csum_result {
> >  	uint64_t csum_offloads = 0;
> >  	struct rte_eth_dev_info dev_info;
> >
> > -	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
> > -		printf("invalid port %d\n", res->port_id);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> > +
> >  	if (!port_is_stopped(res->port_id)) {
> >  		printf("Please stop port %d first\n", res->port_id);
> >  		return;
> > @@ -4364,8 +4363,8 @@ struct cmd_gso_show_result {
> >  {
> >  	struct cmd_gso_show_result *res = parsed_result;
> >
> > -	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
> > -		printf("invalid port id %u\n", res->cmd_pid);
> > +	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
> > +		printf("invalid/not owned port id %u\n", res->cmd_pid);
> >  		return;
> >  	}
> >  	if (!strcmp(res->cmd_keyword, "gso")) {
> > @@ -5375,7 +5374,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
> >  				port_id);
> >
> >  		/* Update number of ports */
> > -		nb_ports = rte_eth_dev_count();
> > +		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
> > +			printf("Error: cannot own new attached port %d\n",
> > +			       port_id);
> > +			return;
> > +		}
> > +		nb_ports++;
> >  		reconfig(port_id, res->socket);
> >  		rte_eth_promiscuous_enable(port_id);
> >  	}
> > @@ -5484,10 +5488,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
> >  	struct cmd_set_bond_mon_period_result *res = parsed_result;
> >  	int ret;
> >
> > -	if (res->port_num >= nb_ports) {
> > -		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
> > +	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
> >
> > @@ -5545,11 +5547,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
> >  	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
> >  	uint8_t policy = AGG_BANDWIDTH;
> >
> > -	if (res->port_num >= nb_ports) {
> > -		printf("Port id %d must be less than %d\n",
> > -				res->port_num, nb_ports);
> > +	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	if (!strcmp(res->policy, "bandwidth"))
> >  		policy = AGG_BANDWIDTH;
> > @@ -5808,7 +5807,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
> >
> >  	/* all ports */
> >  	if (allports) {
> > -		RTE_ETH_FOREACH_DEV(i) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
> >  			if (enable)
> >  				rte_eth_promiscuous_enable(i);
> >  			else
> > @@ -5888,7 +5887,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
> >
> >  	/* all ports */
> >  	if (allports) {
> > -		RTE_ETH_FOREACH_DEV(i) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
> >  			if (enable)
> >  				rte_eth_allmulticast_enable(i);
> >  			else
> > @@ -6622,31 +6621,31 @@ static void cmd_showportall_parsed(void *parsed_result,
> >  	struct cmd_showportall_result *res = parsed_result;
> >  	if (!strcmp(res->show, "clear")) {
> >  		if (!strcmp(res->what, "stats"))
> > -			RTE_ETH_FOREACH_DEV(i)
> > +			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  				nic_stats_clear(i);
> >  		else if (!strcmp(res->what, "xstats"))
> > -			RTE_ETH_FOREACH_DEV(i)
> > +			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  				nic_xstats_clear(i);
> >  	} else if (!strcmp(res->what, "info"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			port_infos_display(i);
> >  	else if (!strcmp(res->what, "stats"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			nic_stats_display(i);
> >  	else if (!strcmp(res->what, "xstats"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			nic_xstats_display(i);
> >  	else if (!strcmp(res->what, "fdir"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			fdir_get_infos(i);
> >  	else if (!strcmp(res->what, "stat_qmap"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			nic_stats_mapping_display(i);
> >  	else if (!strcmp(res->what, "dcb_tc"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			port_dcb_info_display(i);
> >  	else if (!strcmp(res->what, "cap"))
> > -		RTE_ETH_FOREACH_DEV(i)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
> >  			port_offload_cap_display(i);
> >  }
> >
> > @@ -10698,10 +10697,8 @@ struct cmd_flow_director_mask_result {
> >  	struct rte_eth_fdir_masks *mask;
> >  	struct rte_port *port;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	port = &ports[res->port_id];
> >  	/** Check if the port is not started **/
> > @@ -10899,10 +10896,8 @@ struct cmd_flow_director_flex_mask_result {
> >  	uint16_t i;
> >  	int ret;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	port = &ports[res->port_id];
> >  	/** Check if the port is not started **/
> > @@ -11053,12 +11048,10 @@ struct cmd_flow_director_flexpayload_result {
> >  	struct cmd_flow_director_flexpayload_result *res = parsed_result;
> >  	struct rte_eth_flex_payload_cfg flex_cfg;
> >  	struct rte_port *port;
> > -	int ret = 0;
> > +	int ret;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	port = &ports[res->port_id];
> >  	/** Check if the port is not started **/
> > @@ -11774,7 +11767,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
> >  	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
> >  	entry.ether_type = res->eth_type_val;
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
> >  	}
> >  }
> > @@ -11890,7 +11883,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
> >  	else
> >  		en = 0;
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  		rte_eth_dev_l2_tunnel_offload_set(pid,
> >  						  &entry,
> >  						  ETH_L2_TUNNEL_ENABLE_MASK,
> > @@ -14440,10 +14433,8 @@ struct cmd_ddp_add_result {
> >  	int file_num;
> >  	int ret = -ENOTSUP;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	if (!all_ports_stopped()) {
> >  		printf("Please stop all ports first\n");
> > @@ -14522,10 +14513,8 @@ struct cmd_ddp_del_result {
> >  	uint32_t size;
> >  	int ret = -ENOTSUP;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	if (!all_ports_stopped()) {
> >  		printf("Please stop all ports first\n");
> > @@ -14837,10 +14826,8 @@ struct cmd_ddp_get_list_result {
> >  #endif
> >  	int ret = -ENOTSUP;
> >
> > -	if (res->port_id > nb_ports) {
> > -		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
> > +	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  #ifdef RTE_LIBRTE_I40E_PMD
> >  	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
> > @@ -16296,7 +16283,7 @@ struct cmd_cmdfile_result {
> >  	if (id == (portid_t)RTE_PORT_ALL) {
> >  		portid_t pid;
> >
> > -		RTE_ETH_FOREACH_DEV(pid) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  			/* check if need_reconfig has been set to 1 */
> >  			if (ports[pid].need_reconfig == 0)
> >  				ports[pid].need_reconfig = dev;
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 561e057..e55490f 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
> >
> >  	(void)ctx;
> >  	(void)token;
> > -	RTE_ETH_FOREACH_DEV(p) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
> >  		if (buf && i == ent)
> >  			return snprintf(buf, size, "%u", p);
> >  		++i;
> > diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
> > index 957b820..43b9a7d 100644
> > --- a/app/test-pmd/config.c
> > +++ b/app/test-pmd/config.c
> > @@ -156,7 +156,7 @@ struct rss_type_info {
> >
> >  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
> >  		printf("Valid port range is [0");
> > -		RTE_ETH_FOREACH_DEV(pid)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  			printf(", %d", pid);
> >  		printf("]\n");
> >  		return;
> > @@ -236,7 +236,7 @@ struct rss_type_info {
> >
> >  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
> >  		printf("Valid port range is [0");
> > -		RTE_ETH_FOREACH_DEV(pid)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  			printf(", %d", pid);
> >  		printf("]\n");
> >  		return;
> > @@ -253,10 +253,9 @@ struct rss_type_info {
> >  	struct rte_eth_xstat_name *xstats_names;
> >
> >  	printf("###### NIC extended statistics for port %-2d\n", port_id);
> > -	if (!rte_eth_dev_is_valid_port(port_id)) {
> > -		printf("Error: Invalid port number %i\n", port_id);
> > +
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> >  		return;
> > -	}
> >
> >  	/* Get count */
> >  	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
> > @@ -321,7 +320,7 @@ struct rss_type_info {
> >
> >  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
> >  		printf("Valid port range is [0");
> > -		RTE_ETH_FOREACH_DEV(pid)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  			printf(", %d", pid);
> >  		printf("]\n");
> >  		return;
> > @@ -439,7 +438,7 @@ struct rss_type_info {
> >
> >  	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
> >  		printf("Valid port range is [0");
> > -		RTE_ETH_FOREACH_DEV(pid)
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  			printf(", %d", pid);
> >  		printf("]\n");
> >  		return;
> > @@ -756,10 +755,15 @@ struct rss_type_info {
> >  int
> >  port_id_is_invalid(portid_t port_id, enum print_warning warning)
> >  {
> > +	struct rte_eth_dev_owner owner;
> > +	int ret;
> > +
> >  	if (port_id == (portid_t)RTE_PORT_ALL)
> >  		return 0;
> >
> > -	if (rte_eth_dev_is_valid_port(port_id))
> > +	ret = rte_eth_dev_owner_get(port_id, &owner);
> > +
> > +	if (ret == 0 && owner.id == my_owner.id)
> >  		return 0;
> >
> >  	if (warning == ENABLED_WARN)
> > @@ -2373,7 +2377,7 @@ struct igb_ring_desc_16_bytes {
> >  		return;
> >  	}
> >  	nb_pt = 0;
> > -	RTE_ETH_FOREACH_DEV(i) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
> >  		if (! ((uint64_t)(1ULL << i) & portmask))
> >  			continue;
> >  		portlist[nb_pt++] = i;
> > @@ -2512,10 +2516,9 @@ struct igb_ring_desc_16_bytes {
> >  void
> >  setup_gro(const char *onoff, portid_t port_id)
> >  {
> > -	if (!rte_eth_dev_is_valid_port(port_id)) {
> > -		printf("invalid port id %u\n", port_id);
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> >  		return;
> > -	}
> > +
> >  	if (test_done == 0) {
> >  		printf("Before enable/disable GRO,"
> >  				" please stop forwarding first\n");
> > @@ -2574,10 +2577,9 @@ struct igb_ring_desc_16_bytes {
> >
> >  	param = &gro_ports[port_id].param;
> >
> > -	if (!rte_eth_dev_is_valid_port(port_id)) {
> > -		printf("Invalid port id %u.\n", port_id);
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> >  		return;
> > -	}
> > +
> >  	if (gro_ports[port_id].enable) {
> >  		printf("GRO type: TCP/IPv4\n");
> >  		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
> > @@ -2595,10 +2597,9 @@ struct igb_ring_desc_16_bytes {
> >  void
> >  setup_gso(const char *mode, portid_t port_id)
> >  {
> > -	if (!rte_eth_dev_is_valid_port(port_id)) {
> > -		printf("invalid port id %u\n", port_id);
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> >  		return;
> > -	}
> > +
> >  	if (strcmp(mode, "on") == 0) {
> >  		if (test_done == 0) {
> >  			printf("before enabling GSO,"
> > diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
> > index 878c112..0e57b46 100644
> > --- a/app/test-pmd/parameters.c
> > +++ b/app/test-pmd/parameters.c
> > @@ -398,7 +398,7 @@
> >  		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
> >  			port_id == (portid_t)RTE_PORT_ALL) {
> >  			printf("Valid port range is [0");
> > -			RTE_ETH_FOREACH_DEV(pid)
> > +			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  				printf(", %d", pid);
> >  			printf("]\n");
> >  			return -1;
> > @@ -459,7 +459,7 @@
> >  		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
> >  			port_id == (portid_t)RTE_PORT_ALL) {
> >  			printf("Valid port range is [0");
> > -			RTE_ETH_FOREACH_DEV(pid)
> > +			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
> >  				printf(", %d", pid);
> >  			printf("]\n");
> >  			return -1;
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> > index c066cf9..83f5e84 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -108,6 +108,11 @@
> >  lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
> >
> >  /*
> > + * My port owner structure used to own Ethernet ports.
> > + */
> > +struct rte_eth_dev_owner my_owner; /**< Unique owner. */
> > +
> > +/*
> >   * Test Forwarding Configuration.
> >   *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
> >   *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
> > @@ -449,7 +454,7 @@ static int eth_event_callback(portid_t port_id,
> >  	portid_t pt_id;
> >  	int i = 0;
> >
> > -	RTE_ETH_FOREACH_DEV(pt_id)
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
> >  		fwd_ports_ids[i++] = pt_id;
> >
> >  	nb_cfg_ports = nb_ports;
> > @@ -573,7 +578,7 @@ static int eth_event_callback(portid_t port_id,
> >  		fwd_lcores[lc_id]->cpuid_idx = lc_id;
> >  	}
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  		port = &ports[pid];
> >  		/* Apply default Tx configuration for all ports */
> >  		port->dev_conf.txmode = tx_mode;
> > @@ -706,7 +711,7 @@ static int eth_event_callback(portid_t port_id,
> >  	queueid_t q;
> >
> >  	/* set socket id according to numa or not */
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  		port = &ports[pid];
> >  		if (nb_rxq > port->dev_info.max_rx_queues) {
> >  			printf("Fail: nb_rxq(%d) is greater than "
> > @@ -1000,9 +1005,8 @@ static int eth_event_callback(portid_t port_id,
> >  	uint64_t tics_per_1sec;
> >  	uint64_t tics_datum;
> >  	uint64_t tics_current;
> > -	uint8_t idx_port, cnt_ports;
> > +	uint16_t idx_port;
> >
> > -	cnt_ports = rte_eth_dev_count();
> >  	tics_datum = rte_rdtsc();
> >  	tics_per_1sec = rte_get_timer_hz();
> >  #endif
> > @@ -1017,11 +1021,10 @@ static int eth_event_callback(portid_t port_id,
> >  			tics_current = rte_rdtsc();
> >  			if (tics_current - tics_datum >= tics_per_1sec) {
> >  				/* Periodic bitrate calculation */
> > -				for (idx_port = 0;
> > -						idx_port < cnt_ports;
> > -						idx_port++)
> > +				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
> > +							     my_owner.id)
> >  					rte_stats_bitrate_calc(bitrate_data,
> > -						idx_port);
> > +							       idx_port);
> >  				tics_datum = tics_current;
> >  			}
> >  		}
> > @@ -1359,7 +1362,7 @@ static int eth_event_callback(portid_t port_id,
> >  	portid_t pi;
> >  	struct rte_port *port;
> >
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		port = &ports[pi];
> >  		/* Check if there is a port which is not started */
> >  		if ((port->port_status != RTE_PORT_STARTED) &&
> > @@ -1387,7 +1390,7 @@ static int eth_event_callback(portid_t port_id,
> >  {
> >  	portid_t pi;
> >
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		if (!port_is_stopped(pi))
> >  			return 0;
> >  	}
> > @@ -1434,7 +1437,7 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	if(dcb_config)
> >  		dcb_test = 1;
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
> >  			continue;
> >
> > @@ -1620,7 +1623,7 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	printf("Stopping ports...\n");
> >
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
> >  			continue;
> >
> > @@ -1663,7 +1666,7 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	printf("Closing ports...\n");
> >
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
> >  			continue;
> >
> > @@ -1714,7 +1717,7 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	printf("Resetting ports...\n");
> >
> > -	RTE_ETH_FOREACH_DEV(pi) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
> >  		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
> >  			continue;
> >
> > @@ -1759,6 +1762,12 @@ static int eth_event_callback(portid_t port_id,
> >  	if (rte_eth_dev_attach(identifier, &pi))
> >  		return;
> >
> > +	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
> > +		printf("Error: cannot own new attached port %d\n", pi);
> > +		return;
> > +	}
> > +	nb_ports++;
> > +
> >  	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
> >  	/* if socket_id is invalid, set to 0 */
> >  	if (check_socket_id(socket_id) < 0)
> > @@ -1766,8 +1775,6 @@ static int eth_event_callback(portid_t port_id,
> >  	reconfig(pi, socket_id);
> >  	rte_eth_promiscuous_enable(pi);
> >
> > -	nb_ports = rte_eth_dev_count();
> > -
> >  	ports[pi].port_status = RTE_PORT_STOPPED;
> >
> >  	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
> > @@ -1781,6 +1788,9 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	printf("Detaching a port...\n");
> >
> > +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> > +		return;
> > +
> >  	if (!port_is_closed(port_id)) {
> >  		printf("Please close port first\n");
> >  		return;
> > @@ -1794,7 +1804,7 @@ static int eth_event_callback(portid_t port_id,
> >  		return;
> >  	}
> >
> > -	nb_ports = rte_eth_dev_count();
> > +	nb_ports--;
> >
> >  	printf("Port '%s' is detached. Now total ports is %d\n",
> >  			name, nb_ports);
> > @@ -1812,7 +1822,7 @@ static int eth_event_callback(portid_t port_id,
> >
> >  	if (ports != NULL) {
> >  		no_link_check = 1;
> > -		RTE_ETH_FOREACH_DEV(pt_id) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
> >  			printf("\nShutting down port %d...\n", pt_id);
> >  			fflush(stdout);
> >  			stop_port(pt_id);
> > @@ -1844,7 +1854,7 @@ struct pmd_test_command {
> >  	fflush(stdout);
> >  	for (count = 0; count <= MAX_CHECK_TIME; count++) {
> >  		all_ports_up = 1;
> > -		RTE_ETH_FOREACH_DEV(portid) {
> > +		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
> >  			if ((port_mask & (1 << portid)) == 0)
> >  				continue;
> >  			memset(&link, 0, sizeof(link));
> > @@ -1936,6 +1946,8 @@ struct pmd_test_command {
> >
> >  	switch (type) {
> >  	case RTE_ETH_EVENT_INTR_RMV:
> > +		if (port_id_is_invalid(port_id, ENABLED_WARN))
> > +			break;
> >  		if (rte_eal_alarm_set(100000,
> >  				rmv_event_callback, (void *)(intptr_t)port_id))
> >  			fprintf(stderr, "Could not set up deferred device removal\n");
> > @@ -2068,7 +2080,7 @@ struct pmd_test_command {
> >  	portid_t pid;
> >  	struct rte_port *port;
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >  		port = &ports[pid];
> >  		port->dev_conf.fdir_conf = fdir_conf;
> >  		if (nb_rxq > 1) {
> > @@ -2383,7 +2395,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
> >  	rte_pdump_init(NULL);
> >  #endif
> >
> > -	nb_ports = (portid_t) rte_eth_dev_count();
> > +	if (rte_eth_dev_owner_new(&my_owner.id))
> > +		rte_panic("Failed to get unique owner identifier\n");
> > +	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
> > +	RTE_ETH_FOREACH_DEV(port_id)
> > +		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
> > +			nb_ports++;
> >  	if (nb_ports == 0)
> >  		TESTPMD_LOG(WARNING, "No probed ethernet devices\n");
> >
> > @@ -2431,7 +2448,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
> >  		rte_exit(EXIT_FAILURE, "Start ports failed\n");
> >
> >  	/* set all ports to promiscuous mode by default */
> > -	RTE_ETH_FOREACH_DEV(port_id)
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
> >  		rte_eth_promiscuous_enable(port_id);
> >
> >  	/* Init metrics library */
> > diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> > index 9c739e5..2d253b9 100644
> > --- a/app/test-pmd/testpmd.h
> > +++ b/app/test-pmd/testpmd.h
> > @@ -50,6 +50,8 @@
> >  #define NUMA_NO_CONFIG 0xFF
> >  #define UMA_NO_CONFIG  0xFF
> >
> > +#define TESTPMD_OWNER_NAME "TestPMD"
> > +
> >  typedef uint8_t  lcoreid_t;
> >  typedef uint16_t portid_t;
> >  typedef uint16_t queueid_t;
> > @@ -361,6 +363,7 @@ struct queue_stats_mappings {
> >   * nb_fwd_ports <= nb_cfg_ports <= nb_ports
> >   */
> >  extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
> > +extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
> >  extern portid_t nb_cfg_ports; /**< Number of configured ports. */
> >  extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
> >  extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19  9:30                                                     ` Bruce Richardson
  2018-01-19 10:44                                                       ` Matan Azrad
@ 2018-01-19 12:55                                                       ` Neil Horman
  1 sibling, 0 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-19 12:55 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Matan Azrad, Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet,
	Wu, Jingjing, dev

On Fri, Jan 19, 2018 at 09:30:17AM +0000, Bruce Richardson wrote:
> On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> > 
> > Hi Neil
> > From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > > Hi Neil.
> > > >
> > > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> > 
> > <snip>
> > > > 1. What exactly do you want to improve?(in details) 2. Which API
> > > > specifically do you want to change(\ part of code)?
> > > > 3. What is the missing in current code(you can answer it in V3 I sent if you
> > > want) which should be fixed?
> > > >
> > > >
> > > > <snip> sorry for that, I think it is not relevant continue discussion if we are
> > > not fully understand each other. So let's start from the beginning "with good
> > > order :)" by answering the above questions.
> > > 
> > > 
> > > Sure, this seems like a reasonable way to level set.
> > > 
> > > I mentioned in another thread that perhaps some of my issue here is
> > > perception regarding what is meant by ownership.  When I think of an
> > > ownership api I think primarily of mutual exclusion (that is to say,
> > > enforcement of a single execution context having access to a resource at any
> > > given time.  In my mind the simplest form of ownership is a spinlock or a
> > > mutex.  A single execution context either does or does not hold the resource
> > > > at any one time.  Those contexts that attempt to gain exclusive access to the
> > > resource call an api that (depending on
> > > implementation) either block continued execution of that thread until
> > > exclusive access to the resource can be granted, or returns immediately with
> > > a success or error indicator to let the caller know if access is granted.
> > > 
> > > If I were to codify this port ownership api in pseudo code it would look
> > > something like this:
> > > 
> > > struct rte_eth_dev {
> > > 
> > > 	< eth dev bits >
> > > 	rte_spinlock_t owner_lock;
> > > 	bool locked;
> > > 	pid_t owner_pid;
> > > }
> > > 
> As an aside, if you ensure that both locked (or "owned", I think in this
> context) and owner_pid are integer values, you can do away with the lock
> and use a compare-and-set to take ownership, by setting both atomically
> if unmodified from the originally read values.
> 
This is true, since the lock is released at the end of each API function
(effectively making each API function atomic).  Though a DPDK spinlock is itself
just a compare-and-set operation with a built-in yield().
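
To make the compare-and-set idea concrete, here is a minimal sketch using C11
atomics.  It assumes the "locked" flag and the owner id are folded into a single
64-bit word, with 0 meaning "unowned"; that is only one way to set both
atomically.  The names (port_owner_state, port_claim, port_release,
PORT_NO_OWNER) are made up for illustration and are not part of the ethdev API.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical convention: 0 means "no owner", any other value is an owner id. */
#define PORT_NO_OWNER 0u

struct port_owner_state {
	/* One word carries both "is it owned?" and "by whom?". */
	_Atomic uint64_t owner;
};

/* Claim the port only if it is currently unowned. */
static bool
port_claim(struct port_owner_state *p, uint64_t owner_id)
{
	uint64_t expected = PORT_NO_OWNER;

	if (owner_id == PORT_NO_OWNER)
		return false;
	/* Succeeds only if owner is still PORT_NO_OWNER at the time of the swap. */
	return atomic_compare_exchange_strong(&p->owner, &expected, owner_id);
}

/* Release the port only if the caller is the current owner. */
static bool
port_release(struct port_owner_state *p, uint64_t owner_id)
{
	uint64_t expected = owner_id;

	if (owner_id == PORT_NO_OWNER)
		return false;
	return atomic_compare_exchange_strong(&p->owner, &expected,
					      PORT_NO_OWNER);
}
```

A second claim on an owned port simply fails instead of spinning, which matches
the "return immediately with a success or error indicator" flavour of the API
discussed earlier in the thread.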

Neil
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 12:51         ` Matan Azrad
@ 2018-01-19 13:08           ` Ananyev, Konstantin
  2018-01-19 13:35             ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-19 13:08 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce



> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Friday, January 19, 2018 12:52 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> > Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> >
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Thursday, January 18, 2018 4:35 PM
> > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > >
> > > Testpmd should not use ethdev ports which are managed by other DPDK
> > > entities.
> > >
> > > Set Testpmd ownership to each port which is not used by other entity
> > > and prevent any usage of ethdev ports which are not owned by Testpmd.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++---------------------
> > -----
> > >  app/test-pmd/cmdline_flow.c |  2 +-
> > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > >  app/test-pmd/parameters.c   |  4 +-
> > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> > >  app/test-pmd/testpmd.h      |  3 ++
> > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > >
> > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > 31919ba..6199c64 100644
> > > --- a/app/test-pmd/cmdline.c
> > > +++ b/app/test-pmd/cmdline.c
> > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > >  			&link_speed) < 0)
> > >  		return;
> > >
> > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> >
> > Why do we need all these changes?
> > As I understand you changed definition of RTE_ETH_FOREACH_DEV(), so no
> > testpmd should work ok default (no_owner case).
> > Am I missing something here?
> 
> Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will iterate over all valid and ownerless ports.

Yes.

> Here Testpmd wants to iterate over its owned ports.

Why? Why can't it just iterate over all valid and ownerless ports?
As I understand it, that would be enough to fix the current problems and would
allow us to avoid any changes in testpmd (which I think is a good thing).
Konstantin

> 
> I added to Testpmd ability to take an ownership of ports as the new ownership and synchronization rules suggested,
> Since Tespmd is a DPDK entity which wants that no one will touch its owned ports,
> It must allocate an unique ID, set owner for its ports (see in main function) and recognizes them by its owner ID.
> 
> > Konstantin
> >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 10:44                                                       ` Matan Azrad
@ 2018-01-19 13:30                                                         ` Neil Horman
  2018-01-19 13:57                                                           ` Matan Azrad
  2018-01-19 14:13                                                           ` Thomas Monjalon
  0 siblings, 2 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-19 13:30 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Bruce Richardson, Ananyev, Konstantin, Thomas Monjalon,
	Gaetan Rivet, Wu, Jingjing, dev

On Fri, Jan 19, 2018 at 10:44:32AM +0000, Matan Azrad wrote:
> Hi Bruce
> From: Bruce Richardson, Friday, January 19, 2018 11:30 AM
> > On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> > >
> > > Hi Neil
> > > From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > > > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > > > Hi Neil.
> > > > >
> > > > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> > >
> > > <snip>
> > > > > 1. What exactly do you want to improve?(in details) 2. Which API
> > > > > specifically do you want to change(\ part of code)?
> > > > > 3. What is the missing in current code(you can answer it in V3 I
> > > > > sent if you
> > > > want) which should be fixed?
> > > > >
> > > > >
> > > > > <snip> sorry for that, I think it is not relevant continue
> > > > > discussion if we are
> > > > not fully understand each other. So let's start from the beginning
> > > > "with good order :)" by answering the above questions.
> > > >
> > > >
> > > > Sure, this seems like a reasonable way to level set.
> > > >
> > > > I mentioned in another thread that perhaps some of my issue here is
> > > > perception regarding what is meant by ownership.  When I think of an
> > > > ownership api I think primarily of mutual exclusion (that is to say,
> > > > enforcement of a single execution context having access to a
> > > > resource at any given time.  In my mind the simplest form of
> > > > ownership is a spinlock or a mutex.  A single execution context
> > > > either does or does not hold the resource at any one time.  Those
> > > > > contexts that attempt to gain exclusive access to the resource call
> > > > an api that (depending on
> > > > implementation) either block continued execution of that thread
> > > > until exclusive access to the resource can be granted, or returns
> > > > immediately with a success or error indicator to let the caller know if
> > access is granted.
> > > >
> > > > If I were to codify this port ownership api in pseudo code it would
> > > > look something like this:
> > > >
> > > > struct rte_eth_dev {
> > > >
> > > > 	< eth dev bits >
> > > > 	rte_spinlock_t owner_lock;
> > > > 	bool locked;
> > > > 	pid_t owner_pid;
> > > > }
> > > >
> > As an aside, if you ensure that both locked (or "owned", I think in this
> > context) and owner_pid are integer values, you can do away with the lock
> > and use a compare-and-set to take ownership, by setting both atomically if
> > unmodified from the originally read values.
> > 
> > > >
> > > > bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> > > > 	bool ret = false;
> > > >
> > > > 	spin_lock(dev->owner_lock);
> > > > 	if (dev->locked)
> > > > 		goto out;
> > > > 	dev->locked = true;
> > > > 	dev->owner_pid = getpid();
> > > > 	ret = true;
> > > > out:
> > > > 	spin_unlock(dev->lock)
> > > > 	return ret;
> > > > }
> > > >
> > > >
> > > > > bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> > > > >
> > > > > 	bool ret = false;
> > > > > 	spin_lock(dev->owner_lock);
> > > > > 	if (!dev->locked)
> > > > > 		goto out;
> > > > > 	if (dev->owner_pid != getpid())
> > > > > 		goto out;
> > > > > 	dev->locked = false;
> > > > > 	dev->owner_pid = 0;
> > > > > 	ret = true;
> > > > > out:
> > > > > 	spin_unlock(dev->owner_lock);
> > > > 	return ret;
> > > > }
> > > >
> > > > bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> > > > 	bool ret = false;
> > > >
> > > > 	spin_lock(dev->owner_lock);
> > > > 	if (pid)
> > > > 		ret = (dev->locked && (pid == dev->owner_pid));
> > > > 	else
> > > > 		ret = dev->locked;
> > > > 	spin_unlock(dev->owner_lock);
> > > > 	return ret;
> > > > }
> > > >
> > > > The idea here is that lock state is isolated from ownership
> > > > information.  Any context has the opportunity to lock the resource
> > > > (in this case the eth port) despite its ownership object.
> > > >
> > > > > In comparison, your api, which is in many ways similar, separates
> > > > the creation of ownership objects to a separate api call, and that
> > > > ownership information embodies state that is integral to the ability
> > > > to get exclusive access to the resource.  I.E. if thread A calls
> > > > your owner_new call, and then thread B calls owner_new, thread A
> > > > will never be able to get access to any port unless it calls owner_new
> > again.
> > > >
> > > > Does that help clarify my position?
> > This would have been my understanding of what was being looked for too,
> > from my minimal understanding of the problem. Thanks for putting that
> > forward on behalf of many of us!
> > 
> > >
> > > Now I fully understand you, thanks for your patience.
> > >
> > > So, you are missing here one of the main ideas of my port ownership
> > intention.
> > > There are options for X>1 different uncoordinated owners running in the
> > same thread.
> > 
> > Thanks Matan for taking time to try and explain how your idea differs, but I
> > for one am still a little confused. Sorry for the late questions.
> > 
> > Sure, Neil's example above takes the pid or thread id as the owner id
> > parameter, but there is no reason we can't use the same scheme with
> > arbitrarily assigned owner ids, so long as they are unique. We can even have
> > a simple mapping table mapping ids to names of components.
> > >
> Sorry, don't understand your point here.
> My approach asked to allocate unique ID for "any part of code want to manage\use a port".
> What is the problem here and how do you suggest to fix it?
> 
> Neil approach (with process iD\ thread id ) is wrong because 2 different owners can run in same thread (as I explained a lot below).
> 
So, I may be wrong here, but in my opinion the ownership record
should codify something about the owning context.  The fact that you want two
different owners to run in the context of the same thread is not a problem per
se, but rather an artifact of your adherence to the statement "any part of code
that wants to manage/use a port".  I would assert that this was perhaps a
statement made in error early during the design phase.  Perhaps it would be
better to state that any execution context may take ownership of a port.

> > > For example:
> > > 1. Think about Testpmd control commands that call to failsafe port devop
> > which call to its sub-devices devops, while tespmd is different
> > owner(controlling failsafe-port) and failsafe is a different owner(controlling
> > all its sub-devices ports), There are both run control commands in the same
> > thread and there are uncoordinated!
Answered below.

> > >  2. Interrupt callbacks that anyone can register to them and all will run by
> > the DPDK host thread.
> > 
I'm sorry, I'm not clear on how your solution succeeds here where my alternate
model fails.  Both models require coordination such that ownership of a port is
released and re-acquired by another thread, if I'm understanding this correctly.

> > Can you provide a little more details here: what is the specific issue or conflict
> > in each of these examples and how does your ownership proposal fix it,
> > when Neil's simpler approach doesn't?
> > 
> For the first example:
> My approach:
> Testpmd wants to manage the fail-safe port, so it should allocate a unique ID (only once) and use owner set (with its ID) to take ownership of this port.
> If it succeeds in taking ownership, it can manage the port.
> Failsafe PMD wants to manage its sub-device ports and goes through the same process as Testpmd.
> Everything is OK.
> 
> Neil's approach:
> Testpmd wants to manage the fail-safe port, so it just needs to claim ownership (set), and its pid is taken as the owner identifier.
> Failsafe PMD wants to manage its sub-device ports and goes through the same process as Testpmd.
> But look, these 2 entities run in the same thread and both can set the same pid. -> problem!
> 
I would argue that's not an error at all.  As above, the only thing wrong with
using the same ID to claim ownership of both ports is that it violates the
statement you referred to above, which I think is somewhat erroneous.  I would
further argue that using the same ID in both scenarios is preferable because it
accurately indicates the ownership relation between the top level failsafe
device and its slaves (i.e. that the application thread owns the failsafe
device, and transitively, the slaves).  There is no real need to codify the fact
that the failsafe port actually owns the slaves, above and beyond that.

There is a convenience to having ownership be differentiated in the
master/slave model when it comes to iterating over top level vs subordinate
ports, but I would argue that's a problem that should be solved independently;
adding it here is somewhat confusing.  I would suggest adding a parent
rte_eth_dev pointer and a list of child rte_eth_devs to the rte_eth_dev
structure so that iterations can be performed over top level devices, children,
children of children, etc.  You can do this with your ownership model as well of
course, but there are other ways to skin that cat.
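
As a rough illustration of the parent/children suggestion, the sketch below
uses a stand-in structure rather than the real rte_eth_dev.  Everything in it
(eth_dev_node, MAX_CHILDREN, eth_dev_add_child, eth_dev_is_top_level,
eth_dev_walk) is a hypothetical name chosen for the example, and the fixed-size
child array is just the simplest thing that works.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHILDREN 8 /* arbitrary bound for this sketch */

/* Hypothetical stand-in for the hierarchy fields; not the real rte_eth_dev. */
struct eth_dev_node {
	uint16_t port_id;
	struct eth_dev_node *parent;                 /* NULL for a top level device */
	struct eth_dev_node *children[MAX_CHILDREN]; /* e.g. failsafe sub-devices */
	size_t nb_children;
};

/* Attach child under parent; fails if the list is full or child is attached. */
static int
eth_dev_add_child(struct eth_dev_node *parent, struct eth_dev_node *child)
{
	if (parent->nb_children == MAX_CHILDREN || child->parent != NULL)
		return -1;
	child->parent = parent;
	parent->children[parent->nb_children++] = child;
	return 0;
}

/* An application-level iterator would only visit devices without a parent. */
static int
eth_dev_is_top_level(const struct eth_dev_node *dev)
{
	return dev->parent == NULL;
}

/* Visit dev and all of its descendants depth-first; returns the visit count. */
static size_t
eth_dev_walk(const struct eth_dev_node *dev,
	     void (*visit)(const struct eth_dev_node *, void *), void *arg)
{
	size_t i, n = 1;

	if (visit != NULL)
		visit(dev, arg);
	for (i = 0; i < dev->nb_children; i++)
		n += eth_dev_walk(dev->children[i], visit, arg);
	return n;
}
```

With something like this, "top level only" and "everything below this device"
become iteration questions, independent of who holds ownership.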


> The second one just describe more scenario about more than one DPDK entities which run from the same thread.
> 
> > >
> > > So, no any optional  owner becomes an owner, it depends in the specific
> > implementation.
> > >
> > > So if some "part of code" wants to manage a port exclusively and wants to
> > take ownership of it to prevent other "part of code" to use this port :
> > > 1. Take ownership.
> > > 2. It should ask itself: Am I run in different threads\processes? If yes, it
> > should synchronize its port management.
> > > 3. Release ownership in the end.
> > >
> > > Remember that may be different "part of code"s running in the same
> > thread\threads\process\processes.
> > >
So it seems like the real point of contention that we need to settle here is
what codifies an 'owner'.  Must it be a specific execution context, or can we
define any arbitrary section of code as being an owner?  I would argue against
the latter.  While in your master/slave model I can see how it seems tempting, I
would suggest alternate use cases that make that ownership model ambiguous.  If,
for example, we use your interrupt example above, and an interrupt callback is
run for a given port, how, using your example, does it know which area of
code/object/thread to coordinate releasing of that port with so that it can
operate exclusively?


Thanks
Neil
> > > Thanks, Matan.
> > > >
> > > > Regards
> > > > Neil
> > > >
> > > > }
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 13:08           ` Ananyev, Konstantin
@ 2018-01-19 13:35             ` Matan Azrad
  2018-01-19 15:00               ` Gaëtan Rivet
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-19 13:35 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce

Hi Konstantin

From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Friday, January 19, 2018 12:52 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>;
> > Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>
> > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> >
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> Wu,
> > > Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>
> > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > ownership
> > >
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> Richardson,
> > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > > >
> > > > Testpmd should not use ethdev ports which are managed by other
> > > > DPDK entities.
> > > >
> > > > Set Testpmd ownership to each port which is not used by other
> > > > entity and prevent any usage of ethdev ports which are not owned by
> Testpmd.
> > > >
> > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > ---
> > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++-----------------
> ----
> > > -----
> > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > >  app/test-pmd/parameters.c   |  4 +-
> > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> > > >  app/test-pmd/testpmd.h      |  3 ++
> > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > >
> > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > > 31919ba..6199c64 100644
> > > > --- a/app/test-pmd/cmdline.c
> > > > +++ b/app/test-pmd/cmdline.c
> > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > >  			&link_speed) < 0)
> > > >  		return;
> > > >
> > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > >
> > > Why do we need all these changes?
> > > As I understand you changed definition of RTE_ETH_FOREACH_DEV(), so
> > > no testpmd should work ok default (no_owner case).
> > > Am I missing something here?
> >
> > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will iterate
> over all valid and ownerless ports.
> 
> Yes.
> 
> > Here Testpmd wants to iterate over its owned ports.
> 
> Why? Why it can't just iterate over all valid and ownerless ports?
> As I understand it would be enough to fix current problems and would allow
> us to avoid any changes in testpmd (which I think is a good thing).

Yes, I understand that such a big change is daunting, but I think the many current bugs in testpmd (regarding port ownership) are even more daunting.

Look,
Testpmd initializes some of its internal databases based on a specific port iteration.
At some point someone may take ownership of testpmd ports, and testpmd will continue to touch them.

In addition,
using the old iterator in some places in testpmd will cause a race for new run-time ports (which can be created by failsafe or any hotplug code):
- testpmd finds an ownerless port (just created) with the old iterator and starts traffic there,
- failsafe takes ownership of this new port and starts traffic there.
Problem!

In addition,
as a good example of a well-behaved application (free from ownership bugs), I tried here to adjust testpmd to the new rules and, BTW, to fix a lot of bugs.

So applications which are not aware of port ownership are still exposed to races, but if they use the old iterator (with the new change) the number of races decreases.
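
To make the difference between the two iterators concrete, here is a minimal self-contained sketch. It is not the testpmd or ethdev code: the model_* names, MODEL_NO_OWNER and the port table are made up for illustration, assuming only that each port records a single owner ID and that the ownerless iteration is the owned-by iteration with the "no owner" ID.

```c
#include <stdint.h>

#define MODEL_MAX_PORTS 4
#define MODEL_NO_OWNER  0   /* hypothetical stand-in for the "no owner" ID */

/* Hypothetical port record: only what the iteration needs. */
struct model_port {
	int valid;          /* port is attached */
	uint64_t owner_id;  /* MODEL_NO_OWNER when nobody owns it */
};

/* Next valid port at or after 'start' whose owner equals 'owner_id'. */
static unsigned int
model_find_next_owned_by(const struct model_port *ports, unsigned int start,
			 uint64_t owner_id)
{
	while (start < MODEL_MAX_PORTS &&
	       (!ports[start].valid || ports[start].owner_id != owner_id))
		start++;
	return start;
}

/* Iterate over the ports owned by 'id'; passing MODEL_NO_OWNER gives the
 * "valid and ownerless" iteration discussed above. */
#define MODEL_FOREACH_OWNED_BY(ports, p, id) \
	for ((p) = model_find_next_owned_by((ports), 0, (id)); \
	     (p) < MODEL_MAX_PORTS; \
	     (p) = model_find_next_owned_by((ports), (p) + 1, (id)))

/* Count the ports an iteration for 'owner_id' would visit. */
static unsigned int
model_count_owned_by(const struct model_port *ports, uint64_t owner_id)
{
	unsigned int p, n = 0;

	MODEL_FOREACH_OWNED_BY(ports, p, owner_id)
		n++;
	return n;
}
```

In this model, a port that another entity takes after creation drops out of the ownerless iteration, while the set of ports seen by an owner iterating with its own ID does not change.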

Thanks, Matan.
> Konstantin
> 
> >
> > I added to Testpmd the ability to take ownership of ports as the new
> > ownership and synchronization rules suggest. Since Testpmd is a DPDK
> > entity which wants no one else to touch its owned ports, it must allocate
> a unique ID, set itself as owner of its ports (see the main function) and recognize
> them by its owner ID.
> >
> > > Konstantin
> > >

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19  7:14                                                   ` Matan Azrad
  2018-01-19  9:30                                                     ` Bruce Richardson
@ 2018-01-19 13:52                                                     ` Neil Horman
  1 sibling, 0 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-19 13:52 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> 
> Hi Neil
> From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > Hi Neil.
> > >
> > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> 
> <snip>
> > > 1. What exactly do you want to improve?(in details) 2. Which API
> > > specifically do you want to change(\ part of code)?
> > > 3. What is the missing in current code(you can answer it in V3 I sent if you
> > want) which should be fixed?
> > >
> > >
> > > <snip> sorry for that, I think it is not relevant continue discussion if we are
> > not fully understand each other. So let's start from the beginning "with good
> > order :)" by answering the above questions.
> > 
> > 
> > Sure, this seems like a reasonable way to level set.
> > 
> > I mentioned in another thread that perhaps some of my issue here is
> > perception regarding what is meant by ownership.  When I think of an
> > ownership api I think primarily of mutual exclusion (that is to say,
> > enforcement of a single execution context having access to a resource at any
> > > given time).  In my mind the simplest form of ownership is a spinlock or a
> > > mutex.  A single execution context either does or does not hold the resource
> > > at any one time.  Those contexts that attempt to gain exclusive access to the
> > resource call an api that (depending on
> > implementation) either block continued execution of that thread until
> > exclusive access to the resource can be granted, or returns immediately with
> > a success or error indicator to let the caller know if access is granted.
> > 
> > If I were to codify this port ownership api in pseudo code it would look
> > something like this:
> > 
> > > struct rte_eth_dev {
> > > 
> > > 	< eth dev bits >
> > > 	rte_spinlock_t owner_lock;
> > > 	bool locked;
> > > 	pid_t owner_pid;
> > > };
> > > 
> > > 
> > > /* Try to take ownership; returns false if the port is already owned. */
> > > bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> > > 	bool ret = false;
> > > 
> > > 	rte_spinlock_lock(&dev->owner_lock);
> > > 	if (dev->locked)
> > > 		goto out;
> > > 	dev->locked = true;
> > > 	dev->owner_pid = getpid();
> > > 	ret = true;
> > > out:
> > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > 	return ret;
> > > }
> > > 
> > > 
> > > /* Release ownership; only the owning pid may do so. */
> > > bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> > > 
> > > 	bool ret = false;
> > > 	rte_spinlock_lock(&dev->owner_lock);
> > > 	if (!dev->locked)
> > > 		goto out;
> > > 	if (dev->owner_pid != getpid())
> > > 		goto out;
> > > 	dev->locked = false;
> > > 	dev->owner_pid = 0;
> > > 	ret = true;
> > > out:
> > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > 	return ret;
> > > }
> > > 
> > > /* pid == 0 asks "is it owned at all?"; otherwise "is it owned by pid?". */
> > > bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> > > 	bool ret = false;
> > > 
> > > 	rte_spinlock_lock(&dev->owner_lock);
> > > 	if (pid)
> > > 		ret = (dev->locked && (pid == dev->owner_pid));
> > > 	else
> > > 		ret = dev->locked;
> > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > 	return ret;
> > > }
> > 
> > The idea here is that lock state is isolated from ownership information.  Any
> > context has the opportunity to lock the resource (in this case the eth port)
> > despite its ownership object.
> > 
> > > In comparison, your api, which is in many ways similar, separates the
> > creation of ownership objects to a separate api call, and that ownership
> > information embodies state that is integral to the ability to get exclusive
> > access to the resource.  I.E. if thread A calls your owner_new call, and then
> > thread B calls owner_new, thread A will never be able to get access to any
> > port unless it calls owner_new again.
> > 
> > Does that help clarify my position?
> 
> Now I fully understand you, thanks for your patience.
> 
> So, you are missing here one of the main ideas of my port ownership intention.
> There are options for X>1 different uncoordinated owners running in the same thread.
> 
> For example:
> 1. Think about Testpmd control commands that call a failsafe port devop, which calls its sub-devices' devops, while testpmd is one owner (controlling the failsafe port) and failsafe is a different owner (controlling all its sub-device ports). They both run control commands in the same thread and they are uncoordinated!
>  2. Interrupt callbacks that anyone can register, all of which will be run by the DPDK host thread.
> 
> So, not every potential owner becomes an owner; it depends on the specific implementation.
> 
> So if some "part of code" wants to manage a port exclusively and wants to take ownership of it to prevent another "part of code" from using this port:
> 1. Take ownership.
> 2. It should ask itself: am I running in different threads\processes? If yes, it should synchronize its port management.
> 3. Release ownership in the end.
> 
> Remember that there may be different "part of code"s running in the same thread\threads\process\processes.
> 
Apologies for not responding in full here, but in the interests of
de-duplication, I'm providing a larger response to the above higher up in the
thread (the branch in which Bruce commented).

Best
Neil

> Thanks, Matan.
> > 
> > Regards
> > Neil
> > 
> > }
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-18 14:52                                       ` Matan Azrad
@ 2018-01-19 13:57                                         ` Neil Horman
  2018-01-19 14:07                                           ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19 13:57 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Thu, Jan 18, 2018 at 02:52:20PM +0000, Matan Azrad wrote:
> Hi Neil
> 
> From: Neil Horman, Thursday, January 18, 2018 3:21 PM
> > On Wed, Jan 17, 2018 at 05:58:07PM +0000, Matan Azrad wrote:
> > >
> > > Hi Neil
> > >
> > >  From: Neil Horman, Wednesday, January 17, 2018 4:00 PM
> > > > On Wed, Jan 17, 2018 at 12:05:42PM +0000, Matan Azrad wrote:
> <snip>
> > > > Matan is correct here, there is no way to perform parallel set
> > > > operations using just an atomic variable here, because multiple
> > > > reads of next_owner_id need to be performed while it is stable.
> > > > That is to say rte_eth_next_owner_id must be compared to
> > > > RTE_ETH_DEV_NO_OWNER and owner_id in rte_eth_is_valid_owner_id.
> > If
> > > > you were to only use an atomic_read on such a variable, it could be
> > > > incremented by the owner_new function between the checks and an
> > > > invalid owner value could become valid because  a third thread
> > > > incremented the next value.  The state of next_owner_id must be kept
> > > > stable during any validity checks
> > > >
> > > > That said, I really have to wonder why ownership ids are really
> > > > needed here at all.  It seems this design could be much simpler with
> > > > the addition of a per- port lock (and optional ownership record).
> > > > The API could consist of three
> > > > operations:
> > > >
> > > > ownership_set
> > > > ownership_tryset
> > > > ownership_release
> > > > ownership_get
> > > >
> > > >
> > > > The first call simply tries to take the per-port lock (blocking if
> > > > its already
> > > > locked)
> > > >
> > >
> > > Per port lock is not good because the ownership mechanism must be
> > synchronized with the port creation\release.
> > > So the port creation and port ownership should use the same lock.
> > >
> > In what way do you need to synchronize with port creation?
> 
> The port release zeroes the data field of the port owner, so it should be synchronized with the ownership APIs.
> The port creation should be synchronized with the port release.
> 
Ok, that's fair, but you can do that, as long as you don't get hung up on the
necessity to zero all the port data.  Keep the state of the spinlock, and
mandate that the port be in an unowned state during release.
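
As a rough illustration of that suggestion (a hedged sketch only, not ethdev code; the model_* names and the struct layout are invented here, and a C11 atomic_flag stands in for the spinlock), release can refuse to run on an owned port and clear every field except the lock:

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NO_OWNER 0   /* hypothetical "unowned" marker */

/* Hypothetical port record; the lock is the only field release never resets. */
struct model_port {
	atomic_flag lock;   /* stand-in for a per-port spinlock */
	uint64_t owner_id;  /* MODEL_NO_OWNER when unowned */
	int in_use;         /* port slot is allocated */
	char name[32];
};

static void
model_lock(struct model_port *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set(&p->lock))
		;
}

static void
model_unlock(struct model_port *p)
{
	atomic_flag_clear(&p->lock);
}

/* Release the port slot. Returns -1 if the port is still owned, 0 on
 * success. Only the non-lock fields are cleared, so the lock stays valid
 * across release and a later re-creation of the port. */
static int
model_port_release(struct model_port *p)
{
	int ret = -1;

	model_lock(p);
	if (p->owner_id == MODEL_NO_OWNER) {
		p->in_use = 0;
		memset(p->name, 0, sizeof(p->name));
		ret = 0;
	}
	model_unlock(p);
	return ret;
}
```

The point of the sketch is only the ordering: the ownership check and the field reset happen under the same lock, and the lock itself is never part of what gets zeroed.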

> 
> >  If a port has not
> > yet been created, then by definition the owner must be the thread calling
> > the create function.
> 
> No, the owner can be any dpdk entity. (an application - multi\single threads\processes, a PMD, a library).
> So the port allocation (usually done from the port PMD by one thread from one process) should just allocate a port.
> 
Again, in the interests of de-duplication, I made an argument on this point in
the part of the thread where Bruce commented.  I don't think we need to adhere
to the notion that any block of code can be declared the owner of a port.

> 
> >  If you are concerned about the mechanics of the port
> > data structure (i.e. the fact that rte_eth_devices is statically allocated, you
> > can add a lock structure to the rte_eth_dev struct and initialize it statically
> > with
> > > RTE_SPINLOCK_INITIALIZER
> > 
> 
> The lock should be in shared memory to allow entities in secondary processes to take ownership safely.
>  
Ok, that's entirely doable.

> > > I didn't find precedence for blocking function in ethdev.
> > >
> > Then perhaps we don't need that api call.  Perhaps ownership_tryset is
> > enough.
> >
> 
> As I already did :)
>  
> > > > The second call is a non-blocking version of the first
> > > >
> > > > The third unlocks the port, allowing others to take ownership
> > > >
> > > > The fourth returns whatever ownership record you want to encode with
> > > > the lock.
> > > >
> > > > The addition of all this id checking seems a bit overcomplicated
> > >
> > > You are missing the identification of the owner - we want to allow info of the
> > owner for printing and easy debug.
> > > And it makes sense to manage the owner uniqueness by a unique ID.
> > >
> > I specifically pointed that out above.  There is no reason an ownership record
> > couldn't be added to the rte_eth_dev structure.
> > 
> 
> Sorry, don't understand why.
> 
Because that's the resource you're trying to protect, and the object you want to
identify ownership of, no?
 

> > > The API already discussed a lot in the previous version, Do you really want,
> > now, to open it again?
> > >
> > What I want is the most useful and elegant ownership API available.  If you
> > think what you have is that, so be it.  I only bring this up because the amount
> > of debate you and Konstantin have had over lock safety causes me to
> > wonder if this isn't an overly complex design.
> 
> I think the complex design is in secondary\primary processes, not in the current port ownership.
> I think there is some work to do there regardless of port ownership.
> I think there is also some work in progress on it.
> 
> Thanks, a lot.
> 
> > 
> > Neil
> > 
> > 
> > > > Neil
> > >
> > >
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 13:30                                                         ` Neil Horman
@ 2018-01-19 13:57                                                           ` Matan Azrad
  2018-01-19 14:13                                                           ` Thomas Monjalon
  1 sibling, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-19 13:57 UTC (permalink / raw)
  To: Neil Horman
  Cc: Bruce Richardson, Ananyev, Konstantin, Thomas Monjalon,
	Gaetan Rivet, Wu, Jingjing, dev

Hi Neil
From: Neil Horman, Friday, January 19, 2018 3:30 PM
> On Fri, Jan 19, 2018 at 10:44:32AM +0000, Matan Azrad wrote:
> > Hi Bruce
> > From: Bruce Richardson, Friday, January 19, 2018 11:30 AM
> > > On Fri, Jan 19, 2018 at 07:14:17AM +0000, Matan Azrad wrote:
> > > >
> > > > Hi Neil
> > > > From: Neil Horman, Friday, January 19, 2018 3:41 AM
> > > > > On Thu, Jan 18, 2018 at 08:21:34PM +0000, Matan Azrad wrote:
> > > > > > Hi Neil.
> > > > > >
> > > > > > From: Neil Horman, Thursday, January 18, 2018 8:42 PM
> > > >
> > > > <snip>
> > > > > > 1. What exactly do you want to improve?(in details) 2. Which
> > > > > > API specifically do you want to change(\ part of code)?
> > > > > > 3. What is the missing in current code(you can answer it in V3
> > > > > > I sent if you
> > > > > want) which should be fixed?
> > > > > >
> > > > > >
> > > > > > <snip> sorry for that, I think it is not relevant continue
> > > > > > discussion if we are
> > > > > not fully understand each other. So let's start from the
> > > > > beginning "with good order :)" by answering the above questions.
> > > > >
> > > > >
> > > > > Sure, this seems like a reasonable way to level set.
> > > > >
> > > > > I mentioned in another thread that perhaps some of my issue here
> > > > > is perception regarding what is meant by ownership.  When I
> > > > > think of an ownership api I think primarily of mutual exclusion
> > > > > (that is to say, enforcement of a single execution context
> > > > > > having access to a resource at any given time).  In my mind the
> > > > > > simplest form of ownership is a spinlock or a mutex.  A single
> > > > > > execution context either does or does not hold the resource at
> > > > > > any one time.  Those contexts that attempt to gain exclusive
> > > > > access to the resource call an api that (depending on
> > > > > implementation) either block continued execution of that thread
> > > > > until exclusive access to the resource can be granted, or
> > > > > returns immediately with a success or error indicator to let the
> > > > > caller know if
> > > access is granted.
> > > > >
> > > > > If I were to codify this port ownership api in pseudo code it
> > > > > would look something like this:
> > > > >
> > > > > struct rte_eth_dev {
> > > > >
> > > > > 	< eth dev bits >
> > > > > 	rte_spinlock_t owner_lock;
> > > > > 	bool locked;
> > > > > 	pid_t owner_pid;
> > > > > }
> > > > >
> > > As an aside, if you ensure that both locked (or "owned", I think in
> > > this
> > > context) and owner_pid are integer values, you can do away with the
> > > lock and use a compare-and-set to take ownership, by setting both
> > > atomically if unmodified from the originally read values.
> > >
> > > > >
> > > > > > /* Try to take ownership; returns false if the port is already owned. */
> > > > > > bool rte_port_claim_ownership(struct rte_eth_dev *dev) {
> > > > > > 	bool ret = false;
> > > > > >
> > > > > > 	rte_spinlock_lock(&dev->owner_lock);
> > > > > > 	if (dev->locked)
> > > > > > 		goto out;
> > > > > > 	dev->locked = true;
> > > > > > 	dev->owner_pid = getpid();
> > > > > > 	ret = true;
> > > > > > out:
> > > > > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > > > > 	return ret;
> > > > > > }
> > > > > >
> > > > > >
> > > > > > /* Release ownership; only the owning pid may do so. */
> > > > > > bool rte_port_release_ownership(struct rte_eth_dev *dev) {
> > > > > >
> > > > > > 	bool ret = false;
> > > > > > 	rte_spinlock_lock(&dev->owner_lock);
> > > > > > 	if (!dev->locked)
> > > > > > 		goto out;
> > > > > > 	if (dev->owner_pid != getpid())
> > > > > > 		goto out;
> > > > > > 	dev->locked = false;
> > > > > > 	dev->owner_pid = 0;
> > > > > > 	ret = true;
> > > > > > out:
> > > > > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > > > > 	return ret;
> > > > > > }
> > > > > >
> > > > > > /* pid == 0 asks "is it owned at all?"; otherwise "is it owned by pid?". */
> > > > > > bool rte_port_is_owned_by(struct rte_eth_dev *dev, pid_t pid) {
> > > > > > 	bool ret = false;
> > > > > >
> > > > > > 	rte_spinlock_lock(&dev->owner_lock);
> > > > > > 	if (pid)
> > > > > > 		ret = (dev->locked && (pid == dev->owner_pid));
> > > > > > 	else
> > > > > > 		ret = dev->locked;
> > > > > > 	rte_spinlock_unlock(&dev->owner_lock);
> > > > > > 	return ret;
> > > > > > }
> > > > >
> > > > > The idea here is that lock state is isolated from ownership
> > > > > information.  Any context has the opportunity to lock the
> > > > > resource (in this case the eth port) despite its ownership object.
> > > > >
> > > > > > In comparison, your api, which is in many ways similar,
> > > > > separates the creation of ownership objects to a separate api
> > > > > call, and that ownership information embodies state that is
> > > > > integral to the ability to get exclusive access to the resource.
> > > > > I.E. if thread A calls your owner_new call, and then thread B
> > > > > calls owner_new, thread A will never be able to get access to
> > > > > any port unless it calls owner_new
> > > again.
> > > > >
> > > > > Does that help clarify my position?
> > > This would have been my understanding of what was being looked for
> > > too, from my minimal understanding of the problem. Thanks for
> > > putting that forward on behalf of many of us!
> > >
> > > >
> > > > Now I fully understand you, thanks for your patience.
> > > >
> > > > So, you are missing here one of the main ideas of my port
> > > > ownership
> > > intention.
> > > > There are options for X>1 different uncoordinated owners running
> > > > in the
> > > same thread.
> > >
> > > Thanks Matan for taking time to try and explain how your idea
> > > differs, but I for one am still a little confused. Sorry for the late questions.
> > >
> > > Sure, Neil's example above takes the pid or thread id as the owner
> > > id parameter, but there is no reason we can't use the same scheme
> > > with arbitrarily assigned owner ids, so long as they are unique. We
> > > can even have a simple mapping table mapping ids to names of
> components.
> > > >
> > Sorry, don't understand your point here.
> > My approach asked to allocate unique ID for "any part of code want to
> manage\use a port".
> > What is the problem here and how do you suggest to fix it?
> >
> > Neil approach (with process iD\ thread id ) is wrong because 2 different
> owners can run in same thread (as I explained a lot below).
> >
> So, I may be wrong here, but it would be my opinion that the ownership
> record should codify something about the owning context.

So, the context is the allocated ID and the name.
I think pid is not necessary.

>  The fact that you
> want two different owners to run in the context of the same thread is not a
> problem per se, but rather an artifact of your adherence to the statement
> "any part of code to manage/use a port".  I would assert that was perhaps a
> statement made in error early during the design phase.  Perhaps it would be
> better to state that any execution context may take ownership of a port.
>

It is just semantics.
 
> > > > For example:
> > > > 1. Think about Testpmd control commands that call to failsafe port
> > > > devop
> > > which call to its sub-devices devops, while tespmd is different
> > > owner(controlling failsafe-port) and failsafe is a different
> > > owner(controlling all its sub-devices ports), There are both run
> > > control commands in the same thread and there are uncoordinated!
> Answered below.
> 
> > > >  2. Interrupt callbacks that anyone can register to them and all
> > > > will run by
> > > the DPDK host thread.
> > >
> I'm sorry, I'm not clear on how your solution succeeds here where my
> alternate model fails.  Both models require co-ordination such that
> ownership of a port is released and re-acquired by another thread, if I'm
> understanding this correctly.
>

I think you don't understand, please see below.
 
> > > Can you provide a little more details here: what is the specific
> > > issue or conflict in each of these examples and how does your
> > > ownership proposal fix it, when Neil's simpler approach doesn't?
> > >
> > For the first example:
> > My approach:
> > Testpmd want to manage the fail-safe port, therefore it should allocate
> unique ID(only one time) and use owner set(by its ID) to take ownership of
> this port.
> > If it succeed to take ownership it can manage the port.
> > Failsafe PMD wants to manage its sub-devices ports and does the same
> process as Testpmd.
> > Everything is ok.
> >
> > Neil  approach:
> > Testpmd want to manage the fail-safe port, therefore it just need to claim
> ownership(set) and its pid will take as the owner identifier.
> > Failsafe PMD wants to manage its sub-devices ports and does the same
> process as Testpmd.
> > But look these 2 entities run in same threads and there both can set the
> same pid. -> problem!
> >
> I would argue that's not an error at all.  As above, the only thing wrong with
> using the same ID to claim ownership of both ports is that it violates the
> statement you referred to above, which I think is somewhat erroneous.  I
> would further argue that using the same Id in both scenarios is preferable
> because it accurately indicates the ownership relation between the top level
> failsafe device and its slaves (i.e. that the application thread owns the failsafe
> device, and transitively, the slaves).  There is no real need to codify the fact
> that the failsafe port actually owns the slaves, above and beyond that
> statement above.

Look, the two different entities run in the same thread.
They don't even know about each other.
One sets the MTU to 1500,
the second sets the MTU to 3000.
The first one runs rx burst on port 5 while the second does exactly the same.
A crash is not far off.
How can you say that this is OK and that there is no error here?
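
A small single-threaded sketch of the difference being argued here. It is a toy model only: model_owner_new(), model_owner_set() and the struct are invented for illustration, there is no locking, and the assumption is that ownership is a single ID stored per port.

```c
#include <stdint.h>

#define MODEL_NO_OWNER 0   /* hypothetical "unowned" marker */

/* Hypothetical per-port ownership record. */
struct model_port_data {
	uint64_t owner_id;
};

static uint64_t model_next_owner_id = 1;

/* Hand out a fresh, unique ID; each entity calls this once. */
static uint64_t
model_owner_new(void)
{
	return model_next_owner_id++;
}

/* Take ownership of a port. Fails with -1 when a different owner already
 * holds it; an owner re-setting its own port is accepted. */
static int
model_owner_set(struct model_port_data *port, uint64_t owner_id)
{
	if (owner_id == MODEL_NO_OWNER)
		return -1;
	if (port->owner_id != MODEL_NO_OWNER && port->owner_id != owner_id)
		return -1;
	port->owner_id = owner_id;
	return 0;
}
```

With allocated IDs, two entities running in the same thread hold different values, so the second model_owner_set() on port 5 fails and the conflict is visible. If both entities pass the same value (as they would with a pid or thread ID), the second call succeeds and neither notices the other.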

> 
> There is a convenience to having ownership be differentiated in the
> master/slave model when it comes to iterating over top level vs subordinate
> ports, but I would argue that's a problem that should be solved
> independently, adding it here is somewhat confusing.  I would suggest
> adding a parent rte_eth_dev and children rte_eth_dev list to the
> rte_eth_dev structure so that iterations can be performed over top level
> devices, children, children of children, etc.  You can do this with your
> ownership model as well of course, but there are other ways to skin that cat.
> 

Suggest a full design, I will be happy to review it if you want :)

> 
> > The second one just describe more scenario about more than one DPDK
> entities which run from the same thread.
> >
> > > >
> > > > So, no any optional  owner becomes an owner, it depends in the
> > > > specific
> > > implementation.
> > > >
> > > > So if some "part of code" wants to manage a port exclusively and
> > > > wants to
> > > take ownership of it to prevent other "part of code" to use this port :
> > > > 1. Take ownership.
> > > > 2. It should ask itself: Am I run in different threads\processes?
> > > > If yes, it
> > > should synchronize its port management.
> > > > 3. Release ownership in the end.
> > > >
> > > > Remember that may be different "part of code"s running in the same
> > > thread\threads\process\processes.
> > > >
> So it seems like the real point of contention that we need to settle here is,
> what codifies an 'owner'.  Must it be a specific execution context, or can we
> define any arbitrary section of code as being an owner?  I would argue
> against the latter.  While in your master/slave model I can see how it seems
> tempting, I would suggest alternate use cases that make that ownership
> model ambiguous.  If, for example, we use your interrupt example above,
> and an interrupt call back is run for a given port, how, using your example,
> does it know which area of code/object/thread to co-ordinate releasing of
> that port with so that it can operate exclusively?

Example:
Some DPDK entity succeeds in taking ownership of port X.
Then it wants to register for the LINK event and configure something in the callback.
There is other code in this DPDK entity which may configure the same area from another thread.
Since the DPDK entity knows about all of its code (including the cb code), it can synchronize these 2 configurations by itself.
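
A minimal sketch of that self-synchronization, assuming the entity protects its own state with its own lock. Everything here (entity_state, entity_link_event_cb(), entity_control_cmd()) is hypothetical; the callback is a plain function, not a registered ethdev callback.

```c
#include <pthread.h>
#include <stdint.h>

/* Private state of one hypothetical owner entity. */
struct entity_state {
	pthread_mutex_t lock;  /* the entity's own lock, not an ethdev lock */
	int link_up;           /* written by the callback path */
	uint16_t mtu;          /* written by the control path */
	unsigned int updates;  /* touched by both paths */
};

static void
entity_init(struct entity_state *e)
{
	pthread_mutex_init(&e->lock, NULL);
	e->link_up = 0;
	e->mtu = 0;
	e->updates = 0;
}

/* Callback path, e.g. what the entity would run on a link event. */
static void
entity_link_event_cb(struct entity_state *e, int link_up)
{
	pthread_mutex_lock(&e->lock);
	e->link_up = link_up;
	e->updates++;
	pthread_mutex_unlock(&e->lock);
}

/* Control path, possibly running in another thread of the same entity. */
static void
entity_control_cmd(struct entity_state *e, uint16_t mtu)
{
	pthread_mutex_lock(&e->lock);
	e->mtu = mtu;
	e->updates++;
	pthread_mutex_unlock(&e->lock);
}
```

Both paths belong to the same owner, so the owner can pick whatever lock it likes; nothing in the ownership API has to know about it.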
 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 13:57                                         ` Neil Horman
@ 2018-01-19 14:07                                           ` Thomas Monjalon
  2018-01-19 14:32                                             ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 14:07 UTC (permalink / raw)
  To: Neil Horman
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

19/01/2018 14:57, Neil Horman:
> > > I specifically pointed that out above.  There is no reason an ownership record
> > > couldn't be added to the rte_eth_dev structure.
> >
> > Sorry, don't understand why.
> >
> Because that's the resource you're trying to protect, and the object you want to
> identify ownership of, no?

No
The rte_eth_dev structure is the port representation in the process.
The rte_eth_dev_data structure is the port representation across multi-process.
The ownership must be in rte_eth_dev_data to cover multi-process protection.
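
A hedged sketch of why the record has to live in the shared part. The model_* structs below are simplified stand-ins, not the real rte_eth_dev / rte_eth_dev_data layouts; two per-process views are modelled as two structs pointing at the same data, and locking is left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NO_OWNER       0
#define MODEL_OWNER_NAME_LEN 64

/* Hypothetical ownership record: unique ID plus a name for debug output. */
struct model_owner {
	uint64_t id;
	char name[MODEL_OWNER_NAME_LEN];
};

/* Stand-in for the data shared across processes. */
struct model_dev_data {
	uint16_t port_id;
	struct model_owner owner;
};

/* Stand-in for the per-process port representation. */
struct model_dev {
	struct model_dev_data *data;  /* would point into shared memory */
};

/* Record an owner through any process's view; -1 if someone else owns it. */
static int
model_owner_set(struct model_dev *dev, uint64_t id, const char *name)
{
	struct model_owner *o = &dev->data->owner;

	if (id == MODEL_NO_OWNER)
		return -1;
	if (o->id != MODEL_NO_OWNER && o->id != id)
		return -1;
	o->id = id;
	snprintf(o->name, sizeof(o->name), "%s", name);
	return 0;
}

/* Read the owner through any process's view. */
static const struct model_owner *
model_owner_get(const struct model_dev *dev)
{
	return &dev->data->owner;
}
```

Because both views dereference the same data, an owner set through one view is what the other view reads back; a record kept in the per-process struct would be invisible to the other process.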

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 13:30                                                         ` Neil Horman
  2018-01-19 13:57                                                           ` Matan Azrad
@ 2018-01-19 14:13                                                           ` Thomas Monjalon
  2018-01-19 15:27                                                             ` Neil Horman
  1 sibling, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 14:13 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

19/01/2018 14:30, Neil Horman:
> So it seems like the real point of contention that we need to settle here is,
> what codifies an 'owner'.  Must it be a specific execution context, or can we
> define any arbitrary section of code as being an owner?  I would argue against
> the latter.

This is the first thing explained in the cover letter:
"2. The port usage synchronization will be managed by the port owner."
There is no intent to manage the threads synchronization for a given port.
It is the responsibility of the owner (a code object) to configure its
port via only one thread.
It is consistent with not trying to manage threads synchronization
for Rx/Tx on a given queue.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 14:07                                           ` Thomas Monjalon
@ 2018-01-19 14:32                                             ` Neil Horman
  2018-01-19 17:09                                               ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19 14:32 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Fri, Jan 19, 2018 at 03:07:28PM +0100, Thomas Monjalon wrote:
> 19/01/2018 14:57, Neil Horman:
> > > > I specifically pointed that out above.  There is no reason an ownership record
> > > > couldn't be added to the rte_eth_dev structure.
> > >
> > > Sorry, don't understand why.
> > >
> > Because that's the resource you're trying to protect, and the object you want to
> > identify ownership of, no?
> 
> No
> The rte_eth_dev structure is the port representation in the process.
> The rte_eth_dev_data structure is the port representation across multi-process.
> The ownership must be in rte_eth_dev_data to cover multi-process protection.
> 
Ok.   You get the idea though right?  That the port representation,
for some definition thereof, should embody the ownership state.
Neil

> 
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 13:35             ` Matan Azrad
@ 2018-01-19 15:00               ` Gaëtan Rivet
  2018-01-20 18:14                 ` Matan Azrad
  2018-01-22 12:28                 ` Ananyev, Konstantin
  0 siblings, 2 replies; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-19 15:00 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi Matan,

On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Friday, January 19, 2018 12:52 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>;
> > > Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>
> > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > >
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > Wu,
> > > > Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > > Bruce <bruce.richardson@intel.com>
> > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > ownership
> > > >
> > > > Hi Matan,
> > > >
> > > > > -----Original Message-----
> > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > Richardson,
> > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > <konstantin.ananyev@intel.com>
> > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > > > >
> > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > DPDK entities.
> > > > >
> > > > > Set Testpmd ownership to each port which is not used by other
> > > > > entity and prevent any usage of ethdev ports which are not owned by
> > Testpmd.
> > > > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > ---
> > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++-----------------
> > ----
> > > > -----
> > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > >
> > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > > > 31919ba..6199c64 100644
> > > > > --- a/app/test-pmd/cmdline.c
> > > > > +++ b/app/test-pmd/cmdline.c
> > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > >  			&link_speed) < 0)
> > > > >  		return;
> > > > >
> > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > >
> > > > Why do we need all these changes?
> > > > As I understand it, you changed the definition of RTE_ETH_FOREACH_DEV(), so
> > > > now testpmd should work OK by default (no_owner case).
> > > > Am I missing something here?
> > >
> > > Now, after Gaetan's suggestion, RTE_ETH_FOREACH_DEV(pid) will iterate
> > > over all valid and ownerless ports.
> > 
> > Yes.
> > 
> > > Here Testpmd wants to iterate over its owned ports.
> > 
> > Why? Why can't it just iterate over all valid and ownerless ports?
> > As I understand it, that would be enough to fix the current problems and would allow
> > us to avoid any changes in testpmd (which I think is a good thing).
> 
> Yes, I understand that this big change is very daunting, but I think the many current bugs in testpmd (regarding port ownership) are even more daunting.
> 
> Look,
> Testpmd initializes some of its internal databases based on a specific port iteration.
> At some point someone may take ownership of testpmd's ports, and testpmd will continue to touch them.
> 

If I look back on the fail-safe, its sole purpose is to have seamless
hotplug with existing applications.

Port ownership is a genericization of some functions introduced by the
fail-safe, that could structure DPDK further. It should allow
applications to have a seamless integration with subsystems using port
ownership. Without this, port ownership cannot be used.

Testpmd should be fixed, but follow the most common design patterns of
DPDK applications. Going with port ownership seems like a paradigm
shift.

> In addition
> Using the old iterator in some places in testpmd will cause a race for new run-time ports (which can be created by failsafe or any hotplug code):
> - testpmd finds an ownerless port (just created) with the old iterator and starts traffic there,
> - failsafe takes ownership of this new port and starts traffic there.
> Problem!

Testpmd does not handle detection of new ports. If it did, testing
fail-safe with it would be wrong.

At startup, RTE_ETH_FOREACH_DEV already fixed the issue of registering
DEFERRED ports. There are still remaining issues regarding this, but I
think they should be fixed. The architecture does not need to be
completely moved to port ownership.

If anything, this should serve as a test for your API with common
applications. I think you'd prefer to know and debug with testpmd
instead of firing up VPP or something like that to determine what went
wrong with using the fail-safe.

>  
> In addition
> As a good example of a well-done application (free from ownership bugs), I tried here to adjust testpmd to the new rules and, by the way, to fix a lot of bugs.

Testpmd has too much cruft, it won't ever be a good example of a
well-done application.

If you want to demonstrate ownership, I think you should start an
example application demonstrating race conditions and their mitigation.

I think that would be interesting for many DPDK users.

> 
> 
> So actually, applications which are not aware of port ownership are still exposed to races, but if they use the old iterator (with the new change) the number of races decreases.
> 
> Thanks, Matan.
> > Konstantin
> > 
> > >
> > > I added to testpmd the ability to take ownership of ports, as the new
> > > ownership and synchronization rules suggest. Since testpmd is a DPDK
> > > entity which wants no one else to touch its owned ports, it must allocate
> > > a unique ID, set itself as owner of its ports (see the main function), and
> > > recognize them by its owner ID.
> > >
> > > > Konstantin
> > > >

Regards,
-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 14:13                                                           ` Thomas Monjalon
@ 2018-01-19 15:27                                                             ` Neil Horman
  2018-01-19 17:17                                                               ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19 15:27 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> 19/01/2018 14:30, Neil Horman:
> > So it seems like the real point of contention that we need to settle here is,
> > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > define any arbitrary section of code as being an owner?  I would agrue against
> > the latter.
> 
> This is the first thing explained in the cover letter:
> "2. The port usage synchronization will be managed by the port owner."
> There is no intent to manage the threads synchronization for a given port.
> It is the responsibility of the owner (a code object) to configure its
> port via only one thread.
> It is consistent with not trying to manage threads synchronization
> for Rx/Tx on a given queue.
> 
> 
Yes, in his cover letter, and I contend that notion is an invalid design point.
By codifying an area of code as an 'owner', rather than an execution context,
you're defining the notion of hierarchy, not ownership. That is to say,
you want to codify the notion that there are top level ports that the
application might see, and some of those top level ports are parents to
subordinate ports, which only the parent port should access directly.  If that's
all you want to encode, there are far easier ways to do it:

struct rte_eth_shared_data {
	/* existing bits */
	struct rte_eth_port_list {
		struct rte_eth_port_list *children;
		struct rte_eth_port_list *parent;
	} port_list; /* member name is illustrative */
};


Build an api around a structure like that, so that the parent/child relationship
is globally clear, and this would be much easier, especially if you want to
continue asserting that the notion of synchronization/exclusion is an exercise
left to the application.

Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 14:32                                             ` Neil Horman
@ 2018-01-19 17:09                                               ` Thomas Monjalon
  2018-01-19 17:37                                                 ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 17:09 UTC (permalink / raw)
  To: Neil Horman
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

19/01/2018 15:32, Neil Horman:
> On Fri, Jan 19, 2018 at 03:07:28PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 14:57, Neil Horman:
> > > > > I specifically pointed that out above.  There is no reason an owernship record
> > > > > couldn't be added to the rte_eth_dev structure.
> > > >
> > > > Sorry, don't understand why.
> > > >
> > > Because, thats the resource your trying to protect, and the object you want to
> > > identify ownership of, no?
> > 
> > No
> > The rte_eth_dev structure is the port representation in the process.
> > The rte_eth_dev_data structure is the port represenation across multi-process.
> > The ownership must be in rte_eth_dev_data to cover multi-process protection.
> > 
> Ok.   You get the idea though right?  That the port representation,
> for some definition thereof, should embody the ownership state.
> Neil

I am not sure I understand your question.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 15:27                                                             ` Neil Horman
@ 2018-01-19 17:17                                                               ` Thomas Monjalon
  2018-01-19 17:43                                                                 ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 17:17 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

19/01/2018 16:27, Neil Horman:
> On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 14:30, Neil Horman:
> > > So it seems like the real point of contention that we need to settle here is,
> > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > define any arbitrary section of code as being an owner?  I would agrue against
> > > the latter.
> > 
> > This is the first thing explained in the cover letter:
> > "2. The port usage synchronization will be managed by the port owner."
> > There is no intent to manage the threads synchronization for a given port.
> > It is the responsibility of the owner (a code object) to configure its
> > port via only one thread.
> > It is consistent with not trying to manage threads synchronization
> > for Rx/Tx on a given queue.
> > 
> > 
> Yes, in his cover letter, and I contend that notion is an invalid design point.
> By codifying an area of code as an 'owner', rather than an execution context,
> you're defining the notion of heirarchy, not ownership. That is to say,
> you want to codify the notion that there are top level ports that the
> application might see, and some of those top level ports are parents to
> subordinate ports, which only the parent port should access directly.  If thats
> all you want to encode, there are far easier ways to do it:
> 
> struct rte_eth_shared_data {
> 	< existing bits >
> 	struct rte_eth_port_list {
> 		struct rte_eth_port_list *children;
> 		struct rte_eth_port_list *parent;
> 	};
> };
> 
> 
> Build an api around a structure like that, so that the parent/child relationship
> is globally clear, and this would be much easier, especially if you want to
> continue asserting that the notion of synchronization/exclusion is an exercise
> left to the application.

Not only, Neil.
An owner can be something other than a port.
An owner can be an app process (multi-processes).
An owner can be a library.
The intent is really to solve the generic problem of which code
is managing a port.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 17:09                                               ` Thomas Monjalon
@ 2018-01-19 17:37                                                 ` Neil Horman
  2018-01-19 18:10                                                   ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19 17:37 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On Fri, Jan 19, 2018 at 06:09:47PM +0100, Thomas Monjalon wrote:
> 19/01/2018 15:32, Neil Horman:
> > On Fri, Jan 19, 2018 at 03:07:28PM +0100, Thomas Monjalon wrote:
> > > 19/01/2018 14:57, Neil Horman:
> > > > > > I specifically pointed that out above.  There is no reason an owernship record
> > > > > > couldn't be added to the rte_eth_dev structure.
> > > > >
> > > > > Sorry, don't understand why.
> > > > >
> > > > Because, thats the resource your trying to protect, and the object you want to
> > > > identify ownership of, no?
> > > 
> > > No
> > > The rte_eth_dev structure is the port representation in the process.
> > > The rte_eth_dev_data structure is the port represenation across multi-process.
> > > The ownership must be in rte_eth_dev_data to cover multi-process protection.
> > > 
> > Ok.   You get the idea though right?  That the port representation,
> > for some definition thereof, should embody the ownership state.
> > Neil
> 
> Not sure to understand your question.
> 
There is no real question here, only confirming that we are saying the same
thing.  I misspoke when I indicated ownership information should be embodied in
rte_eth_dev rather than its shared data.  But regardless, the concept is the
same.

Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 17:17                                                               ` Thomas Monjalon
@ 2018-01-19 17:43                                                                 ` Neil Horman
  2018-01-19 18:12                                                                   ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Neil Horman @ 2018-01-19 17:43 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> 19/01/2018 16:27, Neil Horman:
> > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > 19/01/2018 14:30, Neil Horman:
> > > > So it seems like the real point of contention that we need to settle here is,
> > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > the latter.
> > > 
> > > This is the first thing explained in the cover letter:
> > > "2. The port usage synchronization will be managed by the port owner."
> > > There is no intent to manage the threads synchronization for a given port.
> > > It is the responsibility of the owner (a code object) to configure its
> > > port via only one thread.
> > > It is consistent with not trying to manage threads synchronization
> > > for Rx/Tx on a given queue.
> > > 
> > > 
> > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > By codifying an area of code as an 'owner', rather than an execution context,
> > you're defining the notion of heirarchy, not ownership. That is to say,
> > you want to codify the notion that there are top level ports that the
> > application might see, and some of those top level ports are parents to
> > subordinate ports, which only the parent port should access directly.  If thats
> > all you want to encode, there are far easier ways to do it:
> > 
> > struct rte_eth_shared_data {
> > 	< existing bits >
> > 	struct rte_eth_port_list {
> > 		struct rte_eth_port_list *children;
> > 		struct rte_eth_port_list *parent;
> > 	};
> > };
> > 
> > 
> > Build an api around a structure like that, so that the parent/child relationship
> > is globally clear, and this would be much easier, especially if you want to
> > continue asserting that the notion of synchronization/exclusion is an exercise
> > left to the application.
> 
> Not only Neil.
> An owner can be something else than a port.
> An owner can be an app process (multi-processes).
> An owner can be a library.
> The intent is really to solve the generic problem of which code
> is managing a port.
> 
I don't see how this precludes any part of what you just said.  Define the
rte_eth_port_list externally to the shared_data struct and allow any object you
want to allocate it, then anything you want to control a hierarchy of ports can
do so without issue, and the structure is far more clear than an opaque id that
carries subtle semantic ordering with it.

Neil

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 17:37                                                 ` Neil Horman
@ 2018-01-19 18:10                                                   ` Thomas Monjalon
  2018-01-21 22:12                                                     ` Ferruh Yigit
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 18:10 UTC (permalink / raw)
  To: Neil Horman
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

19/01/2018 18:37, Neil Horman:
> On Fri, Jan 19, 2018 at 06:09:47PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 15:32, Neil Horman:
> > > On Fri, Jan 19, 2018 at 03:07:28PM +0100, Thomas Monjalon wrote:
> > > > 19/01/2018 14:57, Neil Horman:
> > > > > > > I specifically pointed that out above.  There is no reason an owernship record
> > > > > > > couldn't be added to the rte_eth_dev structure.
> > > > > >
> > > > > > Sorry, don't understand why.
> > > > > >
> > > > > Because, thats the resource your trying to protect, and the object you want to
> > > > > identify ownership of, no?
> > > > 
> > > > No
> > > > The rte_eth_dev structure is the port representation in the process.
> > > > The rte_eth_dev_data structure is the port represenation across multi-process.
> > > > The ownership must be in rte_eth_dev_data to cover multi-process protection.
> > > > 
> > > Ok.   You get the idea though right?  That the port representation,
> > > for some definition thereof, should embody the ownership state.
> > > Neil
> > 
> > Not sure to understand your question.
> > 
> There is no real question here, only confirming that we are saying the same
> thing.  I misspoke when I indicated ownership information should be embodied in
> rte_eth_dev rather than its shared data.  But regardless, the concept is the
> same

Yes we agree.
And I think it is what Matan did.
The owner is in struct rte_eth_dev_data:

@@ -1789,6 +1798,7 @@ struct rte_eth_dev_data {
        int numa_node;  /**< NUMA node connection */
        struct rte_vlan_filter_conf vlan_filter_conf;
        /**< VLAN filter configuration. */
+       struct rte_eth_dev_owner owner; /**< The port owner. */
 };

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 17:43                                                                 ` Neil Horman
@ 2018-01-19 18:12                                                                   ` Thomas Monjalon
  2018-01-19 19:47                                                                     ` Neil Horman
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 18:12 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

19/01/2018 18:43, Neil Horman:
> On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 16:27, Neil Horman:
> > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > 19/01/2018 14:30, Neil Horman:
> > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > the latter.
> > > > 
> > > > This is the first thing explained in the cover letter:
> > > > "2. The port usage synchronization will be managed by the port owner."
> > > > There is no intent to manage the threads synchronization for a given port.
> > > > It is the responsibility of the owner (a code object) to configure its
> > > > port via only one thread.
> > > > It is consistent with not trying to manage threads synchronization
> > > > for Rx/Tx on a given queue.
> > > > 
> > > > 
> > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > By codifying an area of code as an 'owner', rather than an execution context,
> > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > you want to codify the notion that there are top level ports that the
> > > application might see, and some of those top level ports are parents to
> > > subordinate ports, which only the parent port should access directly.  If thats
> > > all you want to encode, there are far easier ways to do it:
> > > 
> > > struct rte_eth_shared_data {
> > > 	< existing bits >
> > > 	struct rte_eth_port_list {
> > > 		struct rte_eth_port_list *children;
> > > 		struct rte_eth_port_list *parent;
> > > 	};
> > > };
> > > 
> > > 
> > > Build an api around a structure like that, so that the parent/child relationship
> > > is globally clear, and this would be much easier, especially if you want to
> > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > left to the application.
> > 
> > Not only Neil.
> > An owner can be something else than a port.
> > An owner can be an app process (multi-processes).
> > An owner can be a library.
> > The intent is really to solve the generic problem of which code
> > is managing a port.
> > 
> I don't see how this precludes any part of what you just said.  Define the
> rte_eth_port_list externally to the shared_data struct and allow any object you
> want to allocate it, then anything you want to control a heirarchy of ports can
> do so without issue, and the structure is far more clear than an opaque id that
> carries subtle semantic ordering with it.

Sorry, I don't understand. Could you please rephrase?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 18:12                                                                   ` Thomas Monjalon
@ 2018-01-19 19:47                                                                     ` Neil Horman
  2018-01-19 20:19                                                                       ` Thomas Monjalon
  2018-01-20 12:54                                                                       ` Ananyev, Konstantin
  0 siblings, 2 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-19 19:47 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
> 19/01/2018 18:43, Neil Horman:
> > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > > 19/01/2018 16:27, Neil Horman:
> > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > > 19/01/2018 14:30, Neil Horman:
> > > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > > the latter.
> > > > > 
> > > > > This is the first thing explained in the cover letter:
> > > > > "2. The port usage synchronization will be managed by the port owner."
> > > > > There is no intent to manage the threads synchronization for a given port.
> > > > > It is the responsibility of the owner (a code object) to configure its
> > > > > port via only one thread.
> > > > > It is consistent with not trying to manage threads synchronization
> > > > > for Rx/Tx on a given queue.
> > > > > 
> > > > > 
> > > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > > By codifying an area of code as an 'owner', rather than an execution context,
> > > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > > you want to codify the notion that there are top level ports that the
> > > > application might see, and some of those top level ports are parents to
> > > > subordinate ports, which only the parent port should access directly.  If thats
> > > > all you want to encode, there are far easier ways to do it:
> > > > 
> > > > struct rte_eth_shared_data {
> > > > 	< existing bits >
> > > > 	struct rte_eth_port_list {
> > > > 		struct rte_eth_port_list *children;
> > > > 		struct rte_eth_port_list *parent;
> > > > 	};
> > > > };
> > > > 
> > > > 
> > > > Build an api around a structure like that, so that the parent/child relationship
> > > > is globally clear, and this would be much easier, especially if you want to
> > > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > > left to the application.
> > > 
> > > Not only Neil.
> > > An owner can be something else than a port.
> > > An owner can be an app process (multi-processes).
> > > An owner can be a library.
> > > The intent is really to solve the generic problem of which code
> > > is managing a port.
> > > 
> > I don't see how this precludes any part of what you just said.  Define the
> > rte_eth_port_list externally to the shared_data struct and allow any object you
> > want to allocate it, then anything you want to control a heirarchy of ports can
> > do so without issue, and the structure is far more clear than an opaque id that
> > carries subtle semantic ordering with it.
> 
> Sorry, I don't understand. Please could you rephrase?
> 

Sure, I'm saying the fact that you want an owner to be an object
(library/port/process) rather than strictly an execution context
(process/thread) doesn't preclude what I'm proposing above.  You can create a
generic version of the structure I propose above like so:

struct rte_obj_heirarchy {
	struct rte_obj_heirarchy *children;
	struct rte_obj_heirarchy *parent;
	void *owner_data; /* optional */
};

And embed that structure in any object you would like to give a representative
hierarchy to; you then have a fairly simple API:

struct rte_obj_heirarchy *heirarchy_alloc(void);
bool heirarchy_set(struct rte_obj_heirarchy *parent, struct rte_obj_heirarchy *child);
void heirarchy_release(struct rte_obj_heirarchy *obj);

That gives you the privately held list relationship I think you are in part
looking for (i.e. the ability for a failsafe device to iterate over the ports it
is in control of), without the awkwardness of the ordinal priority that the
current implementation imposes.
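
As a rough illustration only, the three calls above could be fleshed out as
follows. This is a sketch under assumptions, not proposed ethdev code: the
'next' sibling link and the heirarchy_child_count() helper are hypothetical
additions so that the child list can actually be walked, and plain
calloc()/free() stand in for whatever shared-memory allocator would really be
needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node. 'next' is an assumption added for this sketch so that
 * 'children' can be treated as a singly linked list of siblings. */
struct rte_obj_heirarchy {
	struct rte_obj_heirarchy *children; /* head of the child list */
	struct rte_obj_heirarchy *next;     /* next sibling under the same parent */
	struct rte_obj_heirarchy *parent;   /* NULL for a top level object */
	void *owner_data;                   /* optional, opaque to this API */
};

struct rte_obj_heirarchy *heirarchy_alloc(void)
{
	return calloc(1, sizeof(struct rte_obj_heirarchy));
}

/* Attach child under parent. Fails if the child already has a parent or if
 * the link would create a cycle (child is parent or one of its ancestors). */
bool heirarchy_set(struct rte_obj_heirarchy *parent,
		   struct rte_obj_heirarchy *child)
{
	const struct rte_obj_heirarchy *a;

	if (parent == NULL || child == NULL || child->parent != NULL)
		return false;
	for (a = parent; a != NULL; a = a->parent)
		if (a == child)
			return false;
	child->parent = parent;
	child->next = parent->children;
	parent->children = child;
	return true;
}

/* Unlink obj from its parent, turn its children into top level objects,
 * then free it. */
void heirarchy_release(struct rte_obj_heirarchy *obj)
{
	struct rte_obj_heirarchy **link;
	struct rte_obj_heirarchy *c, *n;

	if (obj == NULL)
		return;
	if (obj->parent != NULL) {
		for (link = &obj->parent->children; *link != NULL;
		     link = &(*link)->next) {
			if (*link == obj) {
				*link = obj->next;
				break;
			}
		}
	}
	for (c = obj->children; c != NULL; c = n) {
		n = c->next;
		c->parent = NULL;
		c->next = NULL;
	}
	free(obj);
}

/* Hypothetical helper: number of direct children of parent. */
size_t heirarchy_child_count(const struct rte_obj_heirarchy *parent)
{
	const struct rte_obj_heirarchy *c;
	size_t n = 0;

	for (c = parent->children; c != NULL; c = c->next)
		n++;
	return n;
}
```

With something like this, a failsafe device would hold one node, attach each
sub-device node with heirarchy_set(), and walk 'children' to reach only the
ports it controls.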

In summary, if what you want is ownership in the strictest sense of the word
(i.e. mutually exclusive access, which I think makes sense), then using a lock
and flag is really the simplest way to go.  If instead what you want is a
hierarchical relationship where you can iterate over a limited set of objects
(the failsafe child port example), then the above is what you want.
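
For the strict-ownership case, a minimal sketch of the lock-and-flag idea
might look like the following. It is an illustration under assumptions: a C11
atomic_flag stands in for the spinlock (this is not a DPDK lock), and the
port_claim structure and function names are hypothetical, not part of ethdev.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port claim record: a spinlock plus an "owned" flag. */
struct port_claim {
	atomic_flag lock; /* spinlock guarding the two fields below */
	bool owned;       /* true while some owner holds the port */
	uint32_t owner;   /* meaningful only when owned is true */
};

void port_claim_init(struct port_claim *p)
{
	atomic_flag_clear(&p->lock);
	p->owned = false;
	p->owner = 0;
}

static void port_claim_lock(struct port_claim *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
}

static void port_claim_unlock(struct port_claim *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Take the port for 'who'. Succeeds only if nobody holds it. */
bool port_claim_take(struct port_claim *p, uint32_t who)
{
	bool ok = false;

	port_claim_lock(p);
	if (!p->owned) {
		p->owned = true;
		p->owner = who;
		ok = true;
	}
	port_claim_unlock(p);
	return ok;
}

/* Give the port back. Succeeds only for the current holder. */
bool port_claim_drop(struct port_claim *p, uint32_t who)
{
	bool ok = false;

	port_claim_lock(p);
	if (p->owned && p->owner == who) {
		p->owned = false;
		ok = true;
	}
	port_claim_unlock(p);
	return ok;
}
```

Note that there is no ordering between claimants here: whoever finds the flag
clear first gets the port, regardless of when it was created.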


The solution Matan is providing does some of each of these things, but comes with
very odd side effects.

It offers a level of mutual exclusion, in that only one
object can own another at a time, but does so in a way that introduces this very
atypical ordinality (once an ownership object is created with owner_new, any
previously created ownership object will be denied the ability to take ownership
of a port).

It also offers a level of filtering (in that if you can set the ownership id of
a given set of objects to the value X, you can then iterate over them by
iterating over all objects of that type, and filtering on their id), but it
offers no clear in-memory relationship between parent and children (i.e. if you
were to look at an object in a debugger and see that it was owned by owner id
X, it would provide you with no indicator of what object held the allocated
ownership object assigned id X).  My proposal trades a few bytes of data in
exchange for a globally clear, definitive hierarchy relationship.  And if you add an
API call and a spinlock, you can easily graft on mutual exclusion here, by
blocking access by objects that aren't the immediate parent of a given object.

Neil
 



^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 19:47                                                                     ` Neil Horman
@ 2018-01-19 20:19                                                                       ` Thomas Monjalon
  2018-01-19 22:52                                                                         ` Neil Horman
  2018-01-20  3:38                                                                         ` Tuxdriver
  2018-01-20 12:54                                                                       ` Ananyev, Konstantin
  1 sibling, 2 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-19 20:19 UTC (permalink / raw)
  To: Neil Horman
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

19/01/2018 20:47, Neil Horman:
> On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 18:43, Neil Horman:
> > > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > > > 19/01/2018 16:27, Neil Horman:
> > > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > > > 19/01/2018 14:30, Neil Horman:
> > > > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > > > the latter.
> > > > > > 
> > > > > > This is the first thing explained in the cover letter:
> > > > > > "2. The port usage synchronization will be managed by the port owner."
> > > > > > There is no intent to manage the threads synchronization for a given port.
> > > > > > It is the responsibility of the owner (a code object) to configure its
> > > > > > port via only one thread.
> > > > > > It is consistent with not trying to manage threads synchronization
> > > > > > for Rx/Tx on a given queue.
> > > > > > 
> > > > > > 
> > > > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > > > By codifying an area of code as an 'owner', rather than an execution context,
> > > > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > > > you want to codify the notion that there are top level ports that the
> > > > > application might see, and some of those top level ports are parents to
> > > > > subordinate ports, which only the parent port should access directly.  If thats
> > > > > all you want to encode, there are far easier ways to do it:
> > > > > 
> > > > > struct rte_eth_shared_data {
> > > > > 	< existing bits >
> > > > > 	struct rte_eth_port_list {
> > > > > 		struct rte_eth_port_list *children;
> > > > > 		struct rte_eth_port_list *parent;
> > > > > 	};
> > > > > };
> > > > > 
> > > > > 
> > > > > Build an api around a structure like that, so that the parent/child relationship
> > > > > is globally clear, and this would be much easier, especially if you want to
> > > > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > > > left to the application.
> > > > 
> > > > Not only Neil.
> > > > An owner can be something else than a port.
> > > > An owner can be an app process (multi-processes).
> > > > An owner can be a library.
> > > > The intent is really to solve the generic problem of which code
> > > > is managing a port.
> > > > 
> > > I don't see how this precludes any part of what you just said.  Define the
> > > rte_eth_port_list externally to the shared_data struct and allow any object you
> > > want to allocate it, then anything you want to control a heirarchy of ports can
> > > do so without issue, and the structure is far more clear than an opaque id that
> > > carries subtle semantic ordering with it.
> > 
> > Sorry, I don't understand. Please could you rephrase?
> > 
> 
> Sure, I'm saying the fact that you want an owner to be an object
> (library/port/process) rather than strictly an execution context
> (process/thread) doesn't preclude what I'm proposing above.  You can create a
> generic version of the strcture I propose above like so:
> 
> struct rte_obj_heirarchy {
> 	struct rte_obj_heirarchy *children;
> 	struct rte_obj_heirarchy *parent;
> 	void *owner_data; /* optional */
> };
> 
> And embed that structure in any object you would like to give a representative
> heirarchy to, you then have a fairly simple api
> 
> struct rte_obj_heirarchy *heirarchy_alloc();
> bool heirarchy_set(struct rte_obj_heirarchy *parent, struct rte_obj_heirarcy *child)
> void heirarchy_release(struct rte_obj_heirarchy *obj)
> 
> That gives you the privately held list relationship I think you are in part
> looking for (i.e. the ability for a failsafe device to iterate over the ports it
> is in control of), without the awkwardness of the ordinal priority that the
> current implementation imposes.

What is the awkward ordinal priority?
I see you discuss it below. So let's discuss it below.

> In summary, if what you want is ownership in the strictest sense of the word
> (i.e. mutually exclusive access, which I think makes sense), then using a lock
> and flag is really the simplest way to go.  If instead what you want is a
> heirarchical relationship where you can iterate over a limited set of objects
> (the failsafe child port example), then the above is what you want.

We want only ownership. That's why it's called ownership :)
The hierarchical relationship is private to the owner.
For instance, failsafe implements its own list of sub-devices.
So we just need to expose that the ports are already owned.
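
To make "expose that the ports are already owned" concrete, here is a minimal
sketch, not the code from the patch. Every name in it (sketch_owner_new,
sketch_owner_set, sketch_owner_unset, sketch_port_is_owned, the table size)
is hypothetical, and it assumes a plain static table with no locking and no
shared memory:

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4   /* hypothetical table size */
#define SKETCH_NO_OWNER  0   /* id 0 means "not owned" in this sketch */
#define SKETCH_NAME_LEN  32

struct sketch_owner {
	uint64_t id;
	char name[SKETCH_NAME_LEN]; /* kept only for debug/printing */
};

static struct sketch_owner sketch_port_owner[SKETCH_MAX_PORTS];
static uint64_t sketch_next_owner_id = 1;

/* Hand out a fresh owner id; ids are never reused in this sketch. */
static uint64_t sketch_owner_new(void)
{
	return sketch_next_owner_id++;
}

/* Report whether a port currently has an owner. */
static int sketch_port_is_owned(unsigned int port)
{
	return port < SKETCH_MAX_PORTS &&
	       sketch_port_owner[port].id != SKETCH_NO_OWNER;
}

/* Claim a port; fails if the id was never allocated or the port is taken. */
static int sketch_owner_set(unsigned int port, uint64_t id, const char *name)
{
	if (port >= SKETCH_MAX_PORTS)
		return -1;
	if (id == SKETCH_NO_OWNER || id >= sketch_next_owner_id)
		return -1;
	if (sketch_port_is_owned(port))
		return -1;
	sketch_port_owner[port].id = id;
	strncpy(sketch_port_owner[port].name, name, SKETCH_NAME_LEN - 1);
	sketch_port_owner[port].name[SKETCH_NAME_LEN - 1] = '\0';
	return 0;
}

/* Release a port; only the current owner id may do so. */
static int sketch_owner_unset(unsigned int port, uint64_t id)
{
	if (port >= SKETCH_MAX_PORTS || id == SKETCH_NO_OWNER)
		return -1;
	if (sketch_port_owner[port].id != id)
		return -1;
	memset(&sketch_port_owner[port], 0, sizeof(sketch_port_owner[port]));
	return 0;
}
```

In this sketch the only thing visible from outside is whether a port has an
owner and which id holds it. Any list of sub-devices stays private to the
owner, as failsafe does today.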

> The soution Matan is providing does some of each of these things, but comes with
> very odd side effects
> 
> It offers a level of mutual exclusion, in that only one
> object can own another at a time, but does so in a way that introduces this very
> atypical ordinality (once an ownership object is created with owner_new, any
> previously created ownership object will be denied the ability to take ownership
> of a port)

You mean only the last owner id can take an ownership?
If yes, it looks like a bug.
Could you please show what in the patch is responsible for this effect?

> It also offers a level of filtering (in that if you can set the ownership id of
> a given set of object to the value X, you can then iterate over them by
> iterating over all objects of that type, and filtering on their id), but it
> offers no clear in-memory relationship between parent and children (i.e. if you
> were to look at at an object in a debugger and see that it was owned by owner id
> X, it would provide you with no indicator of what object held the allocated
> ownership object assigned id X.

I think that is wrong. There is an owner name for debug/printing purposes.

> My proposal trades a few bytes of data in
> exchage for a global clear, definitive heirarcy relationship.  And if you add an
> api call and a spinlock, you can easily graft on mutual exclusion here, by
> blocking access to objects that arent the immediate parent of a given object.

For the hierarchical relationship, I think it is over-engineered.
For blocking access, it means you need a caller id parameter in every
function in order to identify whether the caller is the owner.
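
A rough sketch of what that would imply, under the assumption that access
checks are done per call. The names (acc_check_caller, acc_port_start, the
acc_port_owner table) are hypothetical and exist only for illustration:

```c
#include <errno.h>
#include <stdint.h>

#define ACC_MAX_PORTS 4   /* hypothetical table size */
#define ACC_NO_OWNER  0   /* id 0 means "not owned" in this sketch */

/* Owner id per port; assumed to be filled by some ownership API (not shown). */
static uint64_t acc_port_owner[ACC_MAX_PORTS];

/* Guard that every entry point would have to call first. */
static int acc_check_caller(unsigned int port, uint64_t caller_id)
{
	if (port >= ACC_MAX_PORTS)
		return -ENODEV;
	if (acc_port_owner[port] != ACC_NO_OWNER &&
	    acc_port_owner[port] != caller_id)
		return -EPERM; /* owned by somebody else */
	return 0;
}

/* Example entry point: note the extra caller_id argument. */
static int acc_port_start(unsigned int port, uint64_t caller_id)
{
	int ret = acc_check_caller(port, caller_id);

	if (ret != 0)
		return ret;
	/* ... the real work would go here ... */
	return 0;
}
```

Every existing function taking a port id would need the same extra argument
and the same guard, which is why I call it a huge change.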

My summary:
- you think there is a bug: it needs to be shown
- you think about relationship needs that I don't see
- you think about access permissions, which would be a huge change

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 20:19                                                                       ` Thomas Monjalon
@ 2018-01-19 22:52                                                                         ` Neil Horman
  2018-01-20  3:38                                                                         ` Tuxdriver
  1 sibling, 0 replies; 214+ messages in thread
From: Neil Horman @ 2018-01-19 22:52 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

On Fri, Jan 19, 2018 at 09:19:18PM +0100, Thomas Monjalon wrote:
Apologies for the top post, but I'm preparing for a trip out of the country, and
so may not have time to fully answer these questions until I get back (or at
least until I get someplace with power and internet).  If the conversation is
still going at that time, I'll chime back in.
Neil

> 19/01/2018 20:47, Neil Horman:
> > On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
> > > 19/01/2018 18:43, Neil Horman:
> > > > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > > > > 19/01/2018 16:27, Neil Horman:
> > > > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > > > > 19/01/2018 14:30, Neil Horman:
> > > > > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > > > > the latter.
> > > > > > > 
> > > > > > > This is the first thing explained in the cover letter:
> > > > > > > "2. The port usage synchronization will be managed by the port owner."
> > > > > > > There is no intent to manage the threads synchronization for a given port.
> > > > > > > It is the responsibility of the owner (a code object) to configure its
> > > > > > > port via only one thread.
> > > > > > > It is consistent with not trying to manage threads synchronization
> > > > > > > for Rx/Tx on a given queue.
> > > > > > > 
> > > > > > > 
> > > > > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > > > > By codifying an area of code as an 'owner', rather than an execution context,
> > > > > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > > > > you want to codify the notion that there are top level ports that the
> > > > > > application might see, and some of those top level ports are parents to
> > > > > > subordinate ports, which only the parent port should access directly.  If thats
> > > > > > all you want to encode, there are far easier ways to do it:
> > > > > > 
> > > > > > struct rte_eth_shared_data {
> > > > > > 	< existing bits >
> > > > > > 	struct rte_eth_port_list {
> > > > > > 		struct rte_eth_port_list *children;
> > > > > > 		struct rte_eth_port_list *parent;
> > > > > > 	};
> > > > > > };
> > > > > > 
> > > > > > 
> > > > > > Build an api around a structure like that, so that the parent/child relationship
> > > > > > is globally clear, and this would be much easier, especially if you want to
> > > > > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > > > > left to the application.
> > > > > 
> > > > > Not only Neil.
> > > > > An owner can be something else than a port.
> > > > > An owner can be an app process (multi-processes).
> > > > > An owner can be a library.
> > > > > The intent is really to solve the generic problem of which code
> > > > > is managing a port.
> > > > > 
> > > > I don't see how this precludes any part of what you just said.  Define the
> > > > rte_eth_port_list externally to the shared_data struct and allow any object you
> > > > want to allocate it, then anything you want to control a heirarchy of ports can
> > > > do so without issue, and the structure is far more clear than an opaque id that
> > > > carries subtle semantic ordering with it.
> > > 
> > > Sorry, I don't understand. Please could you rephrase?
> > > 
> > 
> > Sure, I'm saying the fact that you want an owner to be an object
> > (library/port/process) rather than strictly an execution context
> > (process/thread) doesn't preclude what I'm proposing above.  You can create a
> > generic version of the strcture I propose above like so:
> > 
> > struct rte_obj_heirarchy {
> > 	struct rte_obj_heirarchy *children;
> > 	struct rte_obj_heirarchy *parent;
> > 	void *owner_data; /* optional */
> > };
> > 
> > And embed that structure in any object you would like to give a representative
> > heirarchy to, you then have a fairly simple api
> > 
> > struct rte_obj_heirarchy *heirarchy_alloc();
> > bool heirarchy_set(struct rte_obj_heirarchy *parent, struct rte_obj_heirarcy *child)
> > void heirarchy_release(struct rte_obj_heirarchy *obj)
> > 
> > That gives you the privately held list relationship I think you are in part
> > looking for (i.e. the ability for a failsafe device to iterate over the ports it
> > is in control of), without the awkwardness of the ordinal priority that the
> > current implementation imposes.
> 
> What is the awkward ordinal priority?
> I see you discuss it below. So let's discuss it below.
> 
> > In summary, if what you want is ownership in the strictest sense of the word
> > (i.e. mutually exclusive access, which I think makes sense), then using a lock
> > and flag is really the simplest way to go.  If instead what you want is a
> > heirarchical relationship where you can iterate over a limited set of objects
> > (the failsafe child port example), then the above is what you want.
> 
> We want only ownership. That's why it's called ownership :)
> The hierarchical relationship is private to the owner.
> For instance, failsafe implements its own list of sub-devices.
> So we just need to expose that the ports are already owned.
> 
> > The soution Matan is providing does some of each of these things, but comes with
> > very odd side effects
> > 
> > It offers a level of mutual exclusion, in that only one
> > object can own another at a time, but does so in a way that introduces this very
> > atypical ordinality (once an ownership object is created with owner_new, any
> > previously created ownership object will be denied the ability to take ownership
> > of a port)
> 
> You mean only the last owner id can take an ownership?
> If yes, it looks like a bug.
> Please could you show what is responsible of this effect in the patch?
> 
> > It also offers a level of filtering (in that if you can set the ownership id of
> > a given set of object to the value X, you can then iterate over them by
> > iterating over all objects of that type, and filtering on their id), but it
> > offers no clear in-memory relationship between parent and children (i.e. if you
> > were to look at at an object in a debugger and see that it was owned by owner id
> > X, it would provide you with no indicator of what object held the allocated
> > ownership object assigned id X.
> 
> I think it is wrong. There is an owner name for debug/printing purpose.
> 
> > My proposal trades a few bytes of data in
> > exchage for a global clear, definitive heirarcy relationship.  And if you add an
> > api call and a spinlock, you can easily graft on mutual exclusion here, by
> > blocking access to objects that arent the immediate parent of a given object.
> 
> For the hierarchical relationship, I think it is over-engineered.
> For blocking access, it means you need a caller id parameter in every
> functions in order to identify if the caller is the owner.
> 
> My summary:
> - you think there is a bug - needs to show
> - you think about relationship needs that I don't see
> - you think about access permission which would be a huge change
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 20:19                                                                       ` Thomas Monjalon
  2018-01-19 22:52                                                                         ` Neil Horman
@ 2018-01-20  3:38                                                                         ` Tuxdriver
  1 sibling, 0 replies; 214+ messages in thread
From: Tuxdriver @ 2018-01-20  3:38 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Matan Azrad, Bruce Richardson, Ananyev, Konstantin,
	Gaetan Rivet, Wu, Jingjing

Writing from my phone, so sorry for typos and top posting.

Need to apologise for a misunderstanding on my part.  I had a dyslexic
moment and reversed the validity check on the port owner comparison.  What
I thought was >= was actually <=, and so my concern that only the last
allocated owner could take ownership is false, and erroneous on my part.

More comments as I'm able while afk
Neil



On January 19, 2018 3:20:49 PM Thomas Monjalon <thomas@monjalon.net> wrote:

> 19/01/2018 20:47, Neil Horman:
>> On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
>> > 19/01/2018 18:43, Neil Horman:
>> > > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
>> > > > 19/01/2018 16:27, Neil Horman:
>> > > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
>> > > > > > 19/01/2018 14:30, Neil Horman:
>> > > > > > > So it seems like the real point of contention that we need to 
>> settle here is,
>> > > > > > > what codifies an 'owner'.  Must it be a specific execution 
>> context, or can we
>> > > > > > > define any arbitrary section of code as being an owner?  I 
>> would agrue against
>> > > > > > > the latter.
>> > > > > >
>> > > > > > This is the first thing explained in the cover letter:
>> > > > > > "2. The port usage synchronization will be managed by the port 
>> owner."
>> > > > > > There is no intent to manage the threads synchronization for a 
>> given port.
>> > > > > > It is the responsibility of the owner (a code object) to 
>> configure its
>> > > > > > port via only one thread.
>> > > > > > It is consistent with not trying to manage threads synchronization
>> > > > > > for Rx/Tx on a given queue.
>> > > > > >
>> > > > > >
>> > > > > Yes, in his cover letter, and I contend that notion is an invalid 
>> design point.
>> > > > > By codifying an area of code as an 'owner', rather than an 
>> execution context,
>> > > > > you're defining the notion of heirarchy, not ownership. That is to say,
>> > > > > you want to codify the notion that there are top level ports that the
>> > > > > application might see, and some of those top level ports are parents to
>> > > > > subordinate ports, which only the parent port should access 
>> directly.  If thats
>> > > > > all you want to encode, there are far easier ways to do it:
>> > > > >
>> > > > > struct rte_eth_shared_data {
>> > > > > 	< existing bits >
>> > > > > 	struct rte_eth_port_list {
>> > > > > 		struct rte_eth_port_list *children;
>> > > > > 		struct rte_eth_port_list *parent;
>> > > > > 	};
>> > > > > };
>> > > > >
>> > > > >
>> > > > > Build an api around a structure like that, so that the parent/child 
>> relationship
>> > > > > is globally clear, and this would be much easier, especially if you 
>> want to
>> > > > > continue asserting that the notion of synchronization/exclusion is 
>> an exercise
>> > > > > left to the application.
>> > > >
>> > > > Not only Neil.
>> > > > An owner can be something else than a port.
>> > > > An owner can be an app process (multi-processes).
>> > > > An owner can be a library.
>> > > > The intent is really to solve the generic problem of which code
>> > > > is managing a port.
>> > > >
>> > > I don't see how this precludes any part of what you just said.  Define the
>> > > rte_eth_port_list externally to the shared_data struct and allow any 
>> object you
>> > > want to allocate it, then anything you want to control a heirarchy of 
>> ports can
>> > > do so without issue, and the structure is far more clear than an opaque 
>> id that
>> > > carries subtle semantic ordering with it.
>> >
>> > Sorry, I don't understand. Please could you rephrase?
>> >
>>
>> Sure, I'm saying the fact that you want an owner to be an object
>> (library/port/process) rather than strictly an execution context
>> (process/thread) doesn't preclude what I'm proposing above.  You can create a
>> generic version of the strcture I propose above like so:
>>
>> struct rte_obj_heirarchy {
>> 	struct rte_obj_heirarchy *children;
>> 	struct rte_obj_heirarchy *parent;
>> 	void *owner_data; /* optional */
>> };
>>
>> And embed that structure in any object you would like to give a representative
>> heirarchy to, you then have a fairly simple api
>>
>> struct rte_obj_heirarchy *heirarchy_alloc();
>> bool heirarchy_set(struct rte_obj_heirarchy *parent, struct 
>> rte_obj_heirarcy *child)
>> void heirarchy_release(struct rte_obj_heirarchy *obj)
>>
>> That gives you the privately held list relationship I think you are in part
>> looking for (i.e. the ability for a failsafe device to iterate over the 
>> ports it
>> is in control of), without the awkwardness of the ordinal priority that the
>> current implementation imposes.
>
> What is the awkward ordinal priority?
> I see you discuss it below. So let's discuss it below.
>
>> In summary, if what you want is ownership in the strictest sense of the word
>> (i.e. mutually exclusive access, which I think makes sense), then using a lock
>> and flag is really the simplest way to go.  If instead what you want is a
>> heirarchical relationship where you can iterate over a limited set of objects
>> (the failsafe child port example), then the above is what you want.
>
> We want only ownership. That's why it's called ownership :)
> The hierarchical relationship is private to the owner.
> For instance, failsafe implements its own list of sub-devices.
> So we just need to expose that the ports are already owned.
>
>> The soution Matan is providing does some of each of these things, but comes 
>> with
>> very odd side effects
>>
>> It offers a level of mutual exclusion, in that only one
>> object can own another at a time, but does so in a way that introduces this 
>> very
>> atypical ordinality (once an ownership object is created with owner_new, any
>> previously created ownership object will be denied the ability to take 
>> ownership
>> of a port)
>
> You mean only the last owner id can take an ownership?
> If yes, it looks like a bug.
> Please could you show what is responsible of this effect in the patch?
>
>> It also offers a level of filtering (in that if you can set the ownership id of
>> a given set of object to the value X, you can then iterate over them by
>> iterating over all objects of that type, and filtering on their id), but it
>> offers no clear in-memory relationship between parent and children (i.e. if you
>> were to look at at an object in a debugger and see that it was owned by 
>> owner id
>> X, it would provide you with no indicator of what object held the allocated
>> ownership object assigned id X.
>
> I think it is wrong. There is an owner name for debug/printing purpose.
>
>> My proposal trades a few bytes of data in
>> exchage for a global clear, definitive heirarcy relationship.  And if you 
>> add an
>> api call and a spinlock, you can easily graft on mutual exclusion here, by
>> blocking access to objects that arent the immediate parent of a given object.
>
> For the hierarchical relationship, I think it is over-engineered.
> For blocking access, it means you need a caller id parameter in every
> functions in order to identify if the caller is the owner.
>
> My summary:
> - you think there is a bug - needs to show
> - you think about relationship needs that I don't see
> - you think about access permission which would be a huge change
>

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 19:47                                                                     ` Neil Horman
  2018-01-19 20:19                                                                       ` Thomas Monjalon
@ 2018-01-20 12:54                                                                       ` Ananyev, Konstantin
  2018-01-20 14:02                                                                         ` Thomas Monjalon
  1 sibling, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-20 12:54 UTC (permalink / raw)
  To: Neil Horman, Thomas Monjalon
  Cc: dev, Matan Azrad, Richardson, Bruce, Gaetan Rivet, Wu, Jingjing

Hi Neil,

> -----Original Message-----
> From: Neil Horman [mailto:nhorman@tuxdriver.com]
> Sent: Friday, January 19, 2018 7:48 PM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
> 
> On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
> > 19/01/2018 18:43, Neil Horman:
> > > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > > > 19/01/2018 16:27, Neil Horman:
> > > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > > > 19/01/2018 14:30, Neil Horman:
> > > > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > > > the latter.
> > > > > >
> > > > > > This is the first thing explained in the cover letter:
> > > > > > "2. The port usage synchronization will be managed by the port owner."
> > > > > > There is no intent to manage the threads synchronization for a given port.
> > > > > > It is the responsibility of the owner (a code object) to configure its
> > > > > > port via only one thread.
> > > > > > It is consistent with not trying to manage threads synchronization
> > > > > > for Rx/Tx on a given queue.
> > > > > >
> > > > > >
> > > > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > > > By codifying an area of code as an 'owner', rather than an execution context,
> > > > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > > > you want to codify the notion that there are top level ports that the
> > > > > application might see, and some of those top level ports are parents to
> > > > > subordinate ports, which only the parent port should access directly.  If thats
> > > > > all you want to encode, there are far easier ways to do it:
> > > > >
> > > > > struct rte_eth_shared_data {
> > > > > 	< existing bits >
> > > > > 	struct rte_eth_port_list {
> > > > > 		struct rte_eth_port_list *children;
> > > > > 		struct rte_eth_port_list *parent;
> > > > > 	};
> > > > > };
> > > > >
> > > > >
> > > > > Build an api around a structure like that, so that the parent/child relationship
> > > > > is globally clear, and this would be much easier, especially if you want to
> > > > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > > > left to the application.
> > > >
> > > > Not only Neil.
> > > > An owner can be something else than a port.
> > > > An owner can be an app process (multi-processes).
> > > > An owner can be a library.
> > > > The intent is really to solve the generic problem of which code
> > > > is managing a port.
> > > >
> > > I don't see how this precludes any part of what you just said.  Define the
> > > rte_eth_port_list externally to the shared_data struct and allow any object you
> > > want to allocate it, then anything you want to control a heirarchy of ports can
> > > do so without issue, and the structure is far more clear than an opaque id that
> > > carries subtle semantic ordering with it.
> >
> > Sorry, I don't understand. Please could you rephrase?
> >
> 
> Sure, I'm saying the fact that you want an owner to be an object
> (library/port/process) rather than strictly an execution context
> (process/thread) doesn't preclude what I'm proposing above.  You can create a
> generic version of the strcture I propose above like so:
> 
> struct rte_obj_heirarchy {
> 	struct rte_obj_heirarchy *children;
> 	struct rte_obj_heirarchy *parent;
> 	void *owner_data; /* optional */
> };
> 
> And embed that structure in any object you would like to give a representative
> heirarchy to, you then have a fairly simple api
> 
> struct rte_obj_heirarchy *heirarchy_alloc();
> bool heirarchy_set(struct rte_obj_heirarchy *parent, struct rte_obj_heirarcy *child)
> void heirarchy_release(struct rte_obj_heirarchy *obj)
> 
> That gives you the privately held list relationship I think you are in part
> looking for (i.e. the ability for a failsafe device to iterate over the ports it
> is in control of), without the awkwardness of the ordinal priority that the
> current implementation imposes.
> 
> In summary, if what you want is ownership in the strictest sense of the word
> (i.e. mutually exclusive access, which I think makes sense), then using a lock
> and flag is really the simplest way to go.  If instead what you want is a
> heirarchical relationship where you can iterate over a limited set of objects
> (the failsafe child port example), then the above is what you want.
> 
> 
> The soution Matan is providing does some of each of these things, but comes with
> very odd side effects
> 
> It offers a level of mutual exclusion, in that only one
> object can own another at a time, but does so in a way that introduces this very
> atypical ordinality (once an ownership object is created with owner_new, any
> previously created ownership object will be denied the ability to take ownership
> of a port)

Why is that?
As I understand the current code, any owner id between 1 and next_owner_id
is considered valid.
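
A minimal sketch of the check as I read it, with hypothetical names. I am
assuming the upper bound is exclusive, i.e. next_owner_id is the next id to
be handed out and has not been allocated yet:

```c
#include <stdint.h>

#define SKETCH_NO_OWNER 0 /* id 0 means "no owner" in this sketch */

/*
 * An id is valid when it is not the "no owner" value and lies strictly
 * below the next id to be allocated. Reversing the comparison would
 * reject every id that has already been handed out.
 */
static int sketch_owner_id_is_valid(uint64_t owner_id, uint64_t next_owner_id)
{
	return owner_id != SKETCH_NO_OWNER && owner_id < next_owner_id;
}
```

With next_owner_id equal to 3, ids 1 and 2 both pass, so an owner created
earlier is not rejected because a later one exists.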
Konstantin


> 
> It also offers a level of filtering (in that if you can set the ownership id of
> a given set of object to the value X, you can then iterate over them by
> iterating over all objects of that type, and filtering on their id), but it
> offers no clear in-memory relationship between parent and children (i.e. if you
> were to look at at an object in a debugger and see that it was owned by owner id
> X, it would provide you with no indicator of what object held the allocated
> ownership object assigned id X.  My proposal trades a few bytes of data in
> exchage for a global clear, definitive heirarcy relationship.  And if you add an
> api call and a spinlock, you can easily graft on mutual exclusion here, by
> blocking access to objects that arent the immediate parent of a given object.
> 
> Neil
> 
> 
> 
> subsequently created object

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-20 12:54                                                                       ` Ananyev, Konstantin
@ 2018-01-20 14:02                                                                         ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-20 14:02 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Neil Horman, dev, Matan Azrad, Richardson, Bruce, Gaetan Rivet,
	Wu, Jingjing

20/01/2018 13:54, Ananyev, Konstantin:
> Hi Neil,
> 
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman@tuxdriver.com]
> > Sent: Friday, January 19, 2018 7:48 PM
> > To: Thomas Monjalon <thomas@monjalon.net>
> > Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Richardson, Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
> > 
> > On Fri, Jan 19, 2018 at 07:12:36PM +0100, Thomas Monjalon wrote:
> > > 19/01/2018 18:43, Neil Horman:
> > > > On Fri, Jan 19, 2018 at 06:17:51PM +0100, Thomas Monjalon wrote:
> > > > > 19/01/2018 16:27, Neil Horman:
> > > > > > On Fri, Jan 19, 2018 at 03:13:47PM +0100, Thomas Monjalon wrote:
> > > > > > > 19/01/2018 14:30, Neil Horman:
> > > > > > > > So it seems like the real point of contention that we need to settle here is,
> > > > > > > > what codifies an 'owner'.  Must it be a specific execution context, or can we
> > > > > > > > define any arbitrary section of code as being an owner?  I would agrue against
> > > > > > > > the latter.
> > > > > > >
> > > > > > > This is the first thing explained in the cover letter:
> > > > > > > "2. The port usage synchronization will be managed by the port owner."
> > > > > > > There is no intent to manage the threads synchronization for a given port.
> > > > > > > It is the responsibility of the owner (a code object) to configure its
> > > > > > > port via only one thread.
> > > > > > > It is consistent with not trying to manage threads synchronization
> > > > > > > for Rx/Tx on a given queue.
> > > > > > >
> > > > > > >
> > > > > > Yes, in his cover letter, and I contend that notion is an invalid design point.
> > > > > > By codifying an area of code as an 'owner', rather than an execution context,
> > > > > > you're defining the notion of heirarchy, not ownership. That is to say,
> > > > > > you want to codify the notion that there are top level ports that the
> > > > > > application might see, and some of those top level ports are parents to
> > > > > > subordinate ports, which only the parent port should access directly.  If thats
> > > > > > all you want to encode, there are far easier ways to do it:
> > > > > >
> > > > > > struct rte_eth_shared_data {
> > > > > > 	< existing bits >
> > > > > > 	struct rte_eth_port_list {
> > > > > > 		struct rte_eth_port_list *children;
> > > > > > 		struct rte_eth_port_list *parent;
> > > > > > 	};
> > > > > > };
> > > > > >
> > > > > >
> > > > > > Build an api around a structure like that, so that the parent/child relationship
> > > > > > is globally clear, and this would be much easier, especially if you want to
> > > > > > continue asserting that the notion of synchronization/exclusion is an exercise
> > > > > > left to the application.
> > > > >
> > > > > Not only Neil.
> > > > > An owner can be something else than a port.
> > > > > An owner can be an app process (multi-processes).
> > > > > An owner can be a library.
> > > > > The intent is really to solve the generic problem of which code
> > > > > is managing a port.
> > > > >
> > > > I don't see how this precludes any part of what you just said.  Define the
> > > > rte_eth_port_list externally to the shared_data struct and allow any object you
> > > > want to allocate it, then anything you want to control a heirarchy of ports can
> > > > do so without issue, and the structure is far more clear than an opaque id that
> > > > carries subtle semantic ordering with it.
> > >
> > > Sorry, I don't understand. Please could you rephrase?
> > >
> > 
> > Sure, I'm saying the fact that you want an owner to be an object
> > (library/port/process) rather than strictly an execution context
> > (process/thread) doesn't preclude what I'm proposing above.  You can create a
> > generic version of the strcture I propose above like so:
> > 
> > struct rte_obj_heirarchy {
> > 	struct rte_obj_heirarchy *children;
> > 	struct rte_obj_heirarchy *parent;
> > 	void *owner_data; /* optional */
> > };
> > 
> > And embed that structure in any object you would like to give a representative
> > heirarchy to, you then have a fairly simple api
> > 
> > struct rte_obj_heirarchy *heirarchy_alloc();
> > bool heirarchy_set(struct rte_obj_heirarchy *parent, struct rte_obj_heirarcy *child)
> > void heirarchy_release(struct rte_obj_heirarchy *obj)
> > 
> > That gives you the privately held list relationship I think you are in part
> > looking for (i.e. the ability for a failsafe device to iterate over the ports it
> > is in control of), without the awkwardness of the ordinal priority that the
> > current implementation imposes.
> > 
> > In summary, if what you want is ownership in the strictest sense of the word
> > (i.e. mutually exclusive access, which I think makes sense), then using a lock
> > and flag is really the simplest way to go.  If instead what you want is a
> > heirarchical relationship where you can iterate over a limited set of objects
> > (the failsafe child port example), then the above is what you want.
> > 
> > 
> > The soution Matan is providing does some of each of these things, but comes with
> > very odd side effects
> > 
> > It offers a level of mutual exclusion, in that only one
> > object can own another at a time, but does so in a way that introduces this very
> > atypical ordinality (once an ownership object is created with owner_new, any
> > previously created ownership object will be denied the ability to take ownership
> > of a port)
> 
> Why is that?
> As I understand current code: any owner id between 1 and next_owner_id 
> is considered as valid.

Yes, Neil sent another email to explain it was a review mistake.
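
For readers following the thread, the validity rule Konstantin describes can
be sketched as below. This is only a standalone model, not the ethdev code:
NO_OWNER, next_owner_id, owner_new() and is_valid_owner_id() are hypothetical
stand-ins, NO_OWNER is assumed to be 0, and the locking used by the patch is
omitted.

```c
#include <stdint.h>

/* Assumed to be 0, consistent with "between 1 and next_owner_id" above. */
#define NO_OWNER 0

/* Hypothetical stand-in for the shared next_owner_id counter. */
static uint64_t next_owner_id = NO_OWNER + 1;

/* Hand out the next unused owner ID (locking omitted in this sketch). */
static uint64_t
owner_new(void)
{
	return next_owner_id++;
}

/* An ID is valid once allocated: NO_OWNER < id < next_owner_id. */
static int
is_valid_owner_id(uint64_t id)
{
	return id != NO_OWNER && id < next_owner_id;
}
```

Under this rule, allocating a newer owner ID never invalidates an older one,
which is why the ordering concern raised earlier does not apply.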

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation
  2018-01-19 12:40       ` Ananyev, Konstantin
@ 2018-01-20 16:48         ` Matan Azrad
  2018-01-20 17:26           ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 16:48 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, stable

Hi Konstantin

From: Ananyev, Konstantin, Friday, January 19, 2018 2:40 PM
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Thursday, January 18, 2018 4:35 PM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>; stable@dpdk.org
> > Subject: [PATCH v3 2/7] ethdev: fix used portid allocation
> >
> > rte_eth_dev_find_free_port() found a free port by state checking.
> > The state field is in local process memory, so other DPDK processes
> > may get the same port ID because their local states may be different.
> >
> > Replace the state checking by the ethdev port name checking, so, if
> > the name is an empty string the port ID will be detected as unused.
> >
> > Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
> > process model")
> > Cc: stable@dpdk.org
> >
> > Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 156231c..5d87f72 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -164,7 +164,7 @@ struct rte_eth_dev *
> >  	unsigned i;
> >
> >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > -		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
> > +		if (rte_eth_dev_share_data->data[i].name[0] == '\0')
> 
> I know it is not really necessary, but I'd keep both (just in case):
> if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED &&
> rte_eth_dev_share_data->data[i].name[0] == '\0')
> 
Like you, I don't think it is necessary; I searched again and found no reason for it.
What about
RTE_ASSERT(rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED);
instead?
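
A rough standalone model of that idea, for illustration only: the shared name
decides whether a port ID is free, and the process-local state is checked only
by a debug assertion. All names here (MAX_PORTS, shared_name, local_state,
find_free_port) are hypothetical stand-ins, and the standard assert() is used
in place of RTE_ASSERT.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 4		/* stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN 64

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED };

/* Hypothetical model: names are shared by all processes, state is local. */
static char shared_name[MAX_PORTS][NAME_LEN];
static enum port_state local_state[MAX_PORTS];

/* Return the first port ID whose shared name is empty, else MAX_PORTS. */
static uint16_t
find_free_port(void)
{
	uint16_t i;

	for (i = 0; i < MAX_PORTS; i++) {
		if (shared_name[i][0] == '\0') {
			/* Debug-only sanity check, not part of the decision. */
			assert(local_state[i] == PORT_UNUSED);
			return i;
		}
	}
	return MAX_PORTS;
}
```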

> Apart from that: Acked-by: Konstantin Ananyev
> <konstantin.ananyev@intel.com>
> 
> >  			return i;
> >  	}
> >  	return RTE_MAX_ETHPORTS;
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation
  2018-01-20 16:48         ` Matan Azrad
@ 2018-01-20 17:26           ` Ananyev, Konstantin
  0 siblings, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-20 17:26 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, stable

Hi Matan,

> 
> Hi Konstantin
> 
> From: Ananyev, Konstantin, Friday, January 19, 2018 2:40 PM
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Thursday, January 18, 2018 4:35 PM
> > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>; stable@dpdk.org
> > > Subject: [PATCH v3 2/7] ethdev: fix used portid allocation
> > >
> > > rte_eth_dev_find_free_port() found a free port by state checking.
> > > The state field is in local process memory, so other DPDK processes
> > > may get the same port ID because their local states may be different.
> > >
> > > Replace the state checking by the ethdev port name checking, so, if
> > > the name is an empty string the port ID will be detected as unused.
> > >
> > > Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
> > > process model")
> > > Cc: stable@dpdk.org
> > >
> > > Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > b/lib/librte_ether/rte_ethdev.c index 156231c..5d87f72 100644
> > > --- a/lib/librte_ether/rte_ethdev.c
> > > +++ b/lib/librte_ether/rte_ethdev.c
> > > @@ -164,7 +164,7 @@ struct rte_eth_dev *
> > >  	unsigned i;
> > >
> > >  	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
> > > -		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
> > > +		if (rte_eth_dev_share_data->data[i].name[0] == '\0')
> >
> > I know it is not really necessary, but I'd keep both (just in case):
> > if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED &&
> > rte_eth_dev_share_data->data[i].name[0] == '\0')
> >
> Like you, I don't think it is necessary; I searched again and found no reason for it.
> What about
> RTE_ASSERT(rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED);
> instead?

Sounds ok to me.
Konstantin

> 
> > Apart from that: Acked-by: Konstantin Ananyev
> > <konstantin.ananyev@intel.com>
> >
> > >  			return i;
> > >  	}
> > >  	return RTE_MAX_ETHPORTS;
> > > --
> > > 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 15:00               ` Gaëtan Rivet
@ 2018-01-20 18:14                 ` Matan Azrad
  2018-01-22 10:17                   ` Gaëtan Rivet
  2018-01-22 12:28                 ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 18:14 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi Gaetan

From: Gaëtan Rivet, Friday, January 19, 2018 5:00 PM
> Hi Matan,
> 
> On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > -----Original Message-----
> > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>;
> > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> Richardson,
> > > > Bruce <bruce.richardson@intel.com>
> > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > ownership
> > > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > > Wu,
> > > > > Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > ownership
> > > > >
> > > > > Hi Matan,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > Richardson,
> > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > <konstantin.ananyev@intel.com>
> > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > ownership
> > > > > >
> > > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > > DPDK entities.
> > > > > >
> > > > > > Set Testpmd ownership to each port which is not used by other
> > > > > > entity and prevent any usage of ethdev ports which are not
> > > > > > owned by
> > > Testpmd.
> > > > > >
> > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > ---
> > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++-------------
> ----
> > > ----
> > > > > -----
> > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++----------
> --
> > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > >
> > > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> > > > > > index
> > > > > > 31919ba..6199c64 100644
> > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > >  			&link_speed) < 0)
> > > > > >  		return;
> > > > > >
> > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > >
> > > > > Why do we need all these changes?
> > > > > As I understand, you changed the definition of RTE_ETH_FOREACH_DEV(),
> > > > > so now testpmd should work OK by default (no_owner case).
> > > > > Am I missing something here?
> > > >
> > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will iterate
> > > over all valid and ownerless ports.
> > >
> > > Yes.
> > >
> > > > Here Testpmd wants to iterate over its owned ports.
> > >
> > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > As I understand it would be enough to fix current problems and would
> > > allow us to avoid any changes in testmpd (which I think is a good thing).
> >
> > Yes, I understand that this big change is daunting, but I think the many
> > current bugs in testpmd (regarding port ownership) are even more daunting.
> >
> > Look,
> > Testpmd initializes some of its internal databases based on a specific
> > port iteration. Later, someone may take ownership of Testpmd's ports
> > and testpmd will continue to touch them.
> >
> >
> 
> If I look back on the fail-safe, its sole purpose is to have seamless hotplug
> with existing applications.
> 

Yes.

> Port ownership is a genericization of some functions introduced by the fail-
> safe, that could structure DPDK further.

Not only.
Port ownership is a new concept: it says that not all ports belong to the application,
and it clearly defines the new port usage synchronization rules.

It can be a solution for the fail-safe scenario, but it solves a broader, generic problem regardless of fail-safe.

> It should allow applications to have a seamless integration with subsystems using port ownership. Without this, port ownership cannot be used.

I do not think that is accurate.
We can use a different solution for the (seamless) fail-safe case by using the DEFERRED state as you did.
Port ownership is not only for the fail-safe case - it is a new generic concept which, BTW, can fully fix the fail-safe case.
So an application should use port ownership regardless of fail-safe, just to be sure no one touches its ports.

> Testpmd should be fixed, but follow the most common design patterns of
> DPDK applications. Going with port ownership seems like a paradigm shift.

I think this patch is a classic fix for it and for the full generic issue.
Do you have a simpler fix for the races?

> > In addition
> > Using the old iterator in some places in testpmd will cause a race for new
> > run-time ports (which can be created by failsafe or any hotplug code):
> > - testpmd finds an ownerless port (just created) with the old
> > iterator and starts traffic there,
> > - failsafe takes ownership of this new port and starts traffic there.
> > Problem!
> 
> Testpmd does not handle detection of new port. If it did, testing fail-safe
> with it would be wrong.

It used the old iterator everywhere.
So it sees (and uses) all the valid ports all the time.
Under the new concept it should be changed to use only its owned ports;
using the new iterator is a simple, classic solution.
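
To illustrate what iterating only over owned ports means, here is a small
standalone model. It is a sketch under assumptions, not the ethdev
implementation: struct port, ports[], find_next_owned_by(),
FOREACH_PORT_OWNED_BY() and count_owned_by() are hypothetical names, and
NO_OWNER is assumed to be 0.

```c
#include <stdint.h>

#define MAX_PORTS 8		/* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER 0		/* assumed value of RTE_ETH_DEV_NO_OWNER */

/* Hypothetical port table: a port is valid when in_use is set. */
struct port {
	int in_use;
	uint64_t owner_id;
};

static struct port ports[MAX_PORTS];

/* Return the first valid port >= port_id owned by owner_id, else MAX_PORTS. */
static uint16_t
find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (!ports[port_id].in_use || ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

/* Iterate only over the valid ports owned by owner id (o). */
#define FOREACH_PORT_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (p) < MAX_PORTS; \
	     (p) = find_next_owned_by((uint16_t)((p) + 1), (o)))

/* Count the ports an owner would see; NO_OWNER gives the ownerless ones. */
static unsigned int
count_owned_by(uint64_t owner_id)
{
	uint16_t p;
	unsigned int n = 0;

	FOREACH_PORT_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

In this model an application that iterates with its own owner ID never
touches a port that another entity has claimed, while iterating with NO_OWNER
gives the "valid and ownerless" view discussed above.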

> At startup, RTE_ETH_FOREACH_DEV already fixed the issue of registering
> DEFERRED ports. There are still remaining issues regarding this, but I think
> they should be fixed. The architecture does not need to be completely
> moved to port ownership.

I think port ownership fixes the issues nicely, don't you think so?

> If anything, this should serve as a test for your API with common applications.
> I think you'd prefer to know and debug with testpmd instead of firing up VPP
> or something like that to determine what went wrong with using the fail-
> safe.
> >

Yes, like your examples in the docs.

> > In addition
> > As a good example of a well-done application (free from ownership bugs), I
> > tried here to adjust Testpmd to the new rules and, BTW, to fix a lot of bugs.
> 
> Testpmd has too much cruft, it won't ever be a good example of a well-done
> application.
> 
> If you want to demonstrate ownership, I think you should start an example
> application demonstrating race conditions and their mitigation.
> 
> I think that would be interesting for many DPDK users.

I will think about that regardless of the testpmd adjustment (need to find time :))

> >
> > So actually, applications which are not aware of port ownership are still
> > exposed to races, but if they use the old iterator (with the new change) the
> > number of races decreases.
> >
> > Thanks, Matan.
> > > Konstantin
> > >
> > > >
> > > > I added to Testpmd the ability to take ownership of ports, as the
> > > > new ownership and synchronization rules suggest. Since Testpmd is
> > > > a DPDK entity which wants no one else to touch its owned ports,
> > > > it must allocate a unique ID, set the owner for its ports (see the
> > > > main function) and recognize them by its owner ID.
> > > >
> > > > > Konstantin
> > > > >
> 
> Regards,
> --
> Gaëtan Rivet
> 6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (6 preceding siblings ...)
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-20 21:24     ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 1/7] ethdev: fix port data reset timing Matan Azrad
                         ` (7 more replies)
  2018-01-25 14:35     ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Ferruh Yigit
  8 siblings, 8 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Add ownership mechanism to DPDK Ethernet devices to avoid multiple management of a device by different DPDK entities.
The port ownership mechanism is a good point to redefine the synchronization rules in ethdev:

	1. The port allocation and port release synchronization will be managed by ethdev.
	2. The port usage synchronization will be managed by the port owner.
	3. The port ownership synchronization will be managed by ethdev.
	4. A DPDK entity which wants to use a port safely must take ownership first.
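
The rules above can be pictured with a small standalone model. It is only a
sketch under assumptions, not the code of this series: port_owner[],
ownership_lock_take(), ownership_lock_release() and owner_set() are
hypothetical stand-ins, NO_OWNER is assumed to be 0, and a C11 atomic_flag
spinlock stands in for the library lock.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_PORTS 4		/* stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER 0		/* assumed value of RTE_ETH_DEV_NO_OWNER */

/* Hypothetical stand-ins for the shared ownership state. */
static atomic_flag ownership_lock = ATOMIC_FLAG_INIT;
static uint64_t port_owner[MAX_PORTS];

static void
ownership_lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&ownership_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void
ownership_lock_release(void)
{
	atomic_flag_clear_explicit(&ownership_lock, memory_order_release);
}

/* Rule 3: ownership changes are serialized by the library's own lock. */
static int
owner_set(uint16_t port_id, uint64_t owner_id)
{
	int ret = 0;

	if (port_id >= MAX_PORTS)
		return -ENODEV;
	if (owner_id == NO_OWNER)
		return -EINVAL;

	ownership_lock_take();
	if (port_owner[port_id] != NO_OWNER)
		ret = -EPERM;	/* already owned by some entity */
	else
		port_owner[port_id] = owner_id;
	ownership_lock_release();

	/* Rules 2 and 4: on success the caller may use the port, and from
	 * here on it synchronizes its own usage of that port. */
	return ret;
}
```

An entity that gets a non-zero return from owner_set() in this model should
simply leave the port alone.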


V2:  
Synchronize ethdev port creation.
Synchronize port ownership mechanism.
Rename owner remove API to rte_eth_dev_owner_unset.
Remove "ethdev: free a port by a dedicated API" patch - passed to another series.
Add "ethdev: fix port data reset timing" patch.
Change owner get API to return int value and to pass copy of the owner structure.
Adjust testpmd to the improved owner get API.
Adjust documentations.

V3:
Change RTE_ETH_FOREACH_DEV iterator to skip owned ports(Gaetan suggestion).
Prevent goto in set\unset APIs by adding internal API - this also adds reuse of code(Konstantin suggestion).
Group all the shared processes variables in one struct to allow easy allocation of it(Konstantin suggestion).
Take owner name truncation as warning and not as error(Konstantin suggestion).
Mark the new APIs as EXPERIMENTAL.
Rebase on top of master_net_mlx.
Rebase on top of "[PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack" series.
Rebase on top of "[PATCH v5 0/8] Introduce virtual driver for Hyper-V/Azure platforms" .
Add "ethdev: fix used portid allocation" patch suggested by Konstantin.

v4:
Share => shared in ethdev patches(Thomas suggestion).
Rephrase some code comments(Thomas suggestion).
Fix compilation issue caused by wrong rebase with "fix used portid allocation" patch.
Add assert check for the correct port state to above fix patch.

Matan Azrad (7):
  ethdev: fix port data reset timing
  ethdev: fix used portid allocation
  ethdev: add port ownership
  ethdev: synchronize port allocation
  net/failsafe: free an eth port by a dedicated API
  net/failsafe: use ownership mechanism to own ports
  app/testpmd: adjust ethdev port ownership

 app/test-pmd/cmdline.c                  |  89 +++++-------
 app/test-pmd/cmdline_flow.c             |   2 +-
 app/test-pmd/config.c                   |  37 ++---
 app/test-pmd/parameters.c               |   4 +-
 app/test-pmd/testpmd.c                  |  63 +++++---
 app/test-pmd/testpmd.h                  |   3 +
 doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
 drivers/net/failsafe/failsafe.c         |   7 +
 drivers/net/failsafe/failsafe_eal.c     |   6 +
 drivers/net/failsafe/failsafe_ether.c   |   2 +-
 drivers/net/failsafe/failsafe_private.h |   2 +
 lib/librte_ether/rte_ethdev.c           | 245 +++++++++++++++++++++++++++-----
 lib/librte_ether/rte_ethdev.h           | 115 ++++++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   6 +
 14 files changed, 458 insertions(+), 137 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v4 1/7] ethdev: fix port data reset timing
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 2/7] ethdev: fix used portid allocation Matan Azrad
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

rte_eth_dev_data structure is allocated per ethdev port and can be
used to get the data of the port internally.

rte_eth_dev_attach_secondary tries to find the port identifier using
rte_eth_dev_data name field comparison and may get the identifier of an
invalid port if this port was released by the primary process,
because the port release API doesn't reset the port data.

So, it is better to reset the port data at release time instead of
at allocation time.

Move the port data reset to the port release API.

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c4ff1b0..23b7442 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -204,7 +204,6 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
 	eth_dev = eth_dev_get(port_id);
 	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
 	eth_dev->data->port_id = port_id;
@@ -252,6 +251,7 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v4 2/7] ethdev: fix used portid allocation
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 1/7] ethdev: fix port data reset timing Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership Matan Azrad
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

rte_eth_dev_find_free_port() found a free port by state checking.
The state field is in local process memory, so other DPDK processes
may get the same port ID because their local states may be different.

Replace the state checking by the ethdev port name checking,
so, if the name is an empty string the port ID will be detected as
unused.

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 23b7442..3a25a64 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -164,8 +164,12 @@ struct rte_eth_dev *
 	unsigned i;
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
+		/* Using shared name field to find a free port. */
+		if (rte_eth_dev_data[i].name[0] == '\0') {
+			RTE_ASSERT(rte_eth_devices[i].state ==
+				   RTE_ETH_DEV_UNUSED);
 			return i;
+		}
 	}
 	return RTE_MAX_ETHPORTS;
 }
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 1/7] ethdev: fix port data reset timing Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 2/7] ethdev: fix used portid allocation Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-21 20:43         ` Ferruh Yigit
  2018-01-21 20:46         ` Ferruh Yigit
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 4/7] ethdev: synchronize port allocation Matan Azrad
                         ` (4 subsequent siblings)
  7 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

The ownership of a port is implicit in DPDK.
Making it explicit is better for the following reasons:
1. It will define well who is in charge of the port usage synchronization.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also in the fail-safe case, an issue has been met in testpmd.
We need to check that the application is not trying to use a port which
is already managed by fail-safe.

A port owner is built from an owner ID (number) and an owner name (string).
The owner ID must be unique to distinguish between two identical entity
instances, while the owner name can be any name.
The name helps different DPDK entities to logically recognize the owner
and allows easy debugging.
Each DPDK entity can allocate a unique owner identifier and can use it
with its preferred name to own valid ethdev ports.
Each DPDK entity can get any port's owner status to decide whether it can
manage the port or not.

The mechanism is synchronized for both the primary process threads and
the secondary processes threads to allow secondary process entity to be
a port owner.

Add a synchronized ownership mechanism to DPDK Ethernet devices to
avoid multiple management of a device by different DPDK entities.

The current ethdev internal port management is not affected by this
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  14 ++-
 lib/librte_ether/rte_ethdev.c           | 202 ++++++++++++++++++++++++++++----
 lib/librte_ether/rte_ethdev.h           | 115 +++++++++++++++++-
 lib/librte_ether/rte_ethdev_version.map |   6 +
 4 files changed, 306 insertions(+), 31 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index d1d4b1c..d513ee3 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,8 +156,8 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
----------------------------------------
+Device Identification, Ownership and Configuration
+--------------------------------------------------
 
 Device Identification
 ~~~~~~~~~~~~~~~~~~~~~
@@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
+The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
+Allowing this should prevent any multiple management of Ethernet port by different entities.
+
+.. note::
+
+    It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 3a25a64..af0e072 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -41,7 +41,6 @@
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
-static struct rte_eth_dev_data *rte_eth_dev_data;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -59,6 +58,13 @@ struct rte_eth_xstats_name_off {
 	unsigned offset;
 };
 
+/* Shared memory between primary and secondary processes. */
+static struct {
+	uint64_t next_owner_id;
+	rte_spinlock_t ownership_lock;
+	struct rte_eth_dev_data data[RTE_MAX_ETHPORTS];
+} *rte_eth_dev_shared_data;
+
 static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_good_packets", offsetof(struct rte_eth_stats, ipackets)},
 	{"tx_good_packets", offsetof(struct rte_eth_stats, opackets)},
@@ -125,24 +131,29 @@ enum {
 }
 
 static void
-rte_eth_dev_data_alloc(void)
+rte_eth_dev_shared_data_alloc(void)
 {
 	const unsigned flags = 0;
 	const struct rte_memzone *mz;
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* Allocate shared memory for port data and ownership. */
 		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
-				rte_socket_id(), flags);
+					 sizeof(*rte_eth_dev_shared_data),
+					 rte_socket_id(), flags);
 	} else
 		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
 	if (mz == NULL)
 		rte_panic("Cannot allocate memzone for ethernet port data\n");
 
-	rte_eth_dev_data = mz->addr;
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		memset(rte_eth_dev_data, 0,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
+	rte_eth_dev_shared_data = mz->addr;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		rte_eth_dev_shared_data->next_owner_id =
+				RTE_ETH_DEV_NO_OWNER + 1;
+		rte_spinlock_init(&rte_eth_dev_shared_data->ownership_lock);
+		memset(rte_eth_dev_shared_data->data, 0,
+		       sizeof(rte_eth_dev_shared_data->data));
+	}
 }
 
 struct rte_eth_dev *
@@ -165,7 +176,7 @@ struct rte_eth_dev *
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		/* Using shared name field to find a free port. */
-		if (rte_eth_dev_data[i].name[0] == '\0') {
+		if (rte_eth_dev_shared_data->data[i].name[0] == '\0') {
 			RTE_ASSERT(rte_eth_devices[i].state ==
 				   RTE_ETH_DEV_UNUSED);
 			return i;
@@ -179,7 +190,7 @@ struct rte_eth_dev *
 {
 	struct rte_eth_dev *eth_dev = &rte_eth_devices[port_id];
 
-	eth_dev->data = &rte_eth_dev_data[port_id];
+	eth_dev->data = &rte_eth_dev_shared_data->data[port_id];
 	eth_dev->state = RTE_ETH_DEV_ATTACHED;
 
 	eth_dev_last_created_port = port_id;
@@ -199,8 +210,8 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	if (rte_eth_dev_shared_data == NULL)
+		rte_eth_dev_shared_data_alloc();
 
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
@@ -229,11 +240,11 @@ struct rte_eth_dev *
 	uint16_t i;
 	struct rte_eth_dev *eth_dev;
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	if (rte_eth_dev_shared_data == NULL)
+		rte_eth_dev_shared_data_alloc();
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
+		if (strcmp(rte_eth_dev_shared_data->data[i].name, name) == 0)
 			break;
 	}
 	if (i == RTE_MAX_ETHPORTS) {
@@ -255,9 +266,14 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
-	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
 
 	return 0;
@@ -273,6 +289,144 @@ struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint64_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    rte_eth_dev_shared_data->next_owner_id <= owner_id) {
+		RTE_LOG(ERR, EAL, "Invalid owner_id=%016lX.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint64_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       ((rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED) ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint64_t *owner_id)
+{
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	*owner_id = rte_eth_dev_shared_data->next_owner_id++;
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return 0;
+}
+
+static int
+_rte_eth_dev_owner_set(const uint16_t port_id, const uint64_t old_owner_id,
+		       const struct rte_eth_dev_owner *new_owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int sret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	if (!rte_eth_is_valid_owner_id(new_owner->id) &&
+	    !rte_eth_is_valid_owner_id(old_owner_id))
+		return -EINVAL;
+
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != old_owner_id) {
+		RTE_LOG(ERR, EAL, "Cannot set owner to port %d already owned"
+			" by %s_%016lX.\n", port_id, port_owner->name,
+			port_owner->id);
+		return -EPERM;
+	}
+
+	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+			new_owner->name);
+	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN)
+		RTE_LOG(WARNING, EAL, "Port %d owner name was truncated.\n",
+			port_id);
+
+	port_owner->id = new_owner->id;
+
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%016lX.\n", port_id,
+			    new_owner->name, new_owner->id);
+
+	return 0;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	int ret;
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, RTE_ETH_DEV_NO_OWNER, owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
+int
+rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id)
+{
+	const struct rte_eth_dev_owner new_owner = (struct rte_eth_dev_owner)
+			{.id = RTE_ETH_DEV_NO_OWNER, .name = ""};
+	int ret;
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, owner_id, &new_owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
+void
+rte_eth_dev_owner_delete(const uint64_t owner_id)
+{
+	uint16_t port_id;
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (rte_eth_is_valid_owner_id(owner_id)) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
+			memset(&rte_eth_devices[port_id].data->owner, 0,
+			       sizeof(struct rte_eth_dev_owner));
+		RTE_PMD_DEBUG_TRACE("All port owners owned by %016lX identifier"
+				    " have been removed.\n", owner_id);
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+}
+
+int
+rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
+{
+	int ret = 0;
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+	} else {
+		rte_memcpy(owner, &rte_eth_devices[port_id].data->owner,
+			   sizeof(*owner));
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
@@ -315,7 +469,7 @@ struct rte_eth_dev *
 
 	/* shouldn't check 'rte_eth_devices[i].data',
 	 * because it might be overwritten by VDEV PMD */
-	tmp = rte_eth_dev_data[port_id].name;
+	tmp = rte_eth_dev_shared_data->data[port_id].name;
 	strcpy(name, tmp);
 	return 0;
 }
@@ -323,22 +477,22 @@ struct rte_eth_dev *
 int
 rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id)
 {
-	int i;
+	uint32_t pid;
 
 	if (name == NULL) {
 		RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
 		return -EINVAL;
 	}
 
-	RTE_ETH_FOREACH_DEV(i) {
-		if (!strncmp(name,
-			rte_eth_dev_data[i].name, strlen(name))) {
-
-			*port_id = i;
-
+	for (pid = 0; pid < RTE_MAX_ETHPORTS; pid++) {
+		if (rte_eth_devices[pid].state != RTE_ETH_DEV_UNUSED &&
+		    !strncmp(name, rte_eth_dev_shared_data->data[pid].name,
+			     strlen(name))) {
+			*port_id = pid;
 			return 0;
 		}
 	}
+
 	return -ENODEV;
 }
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 084eeeb..cdeaf7a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1739,6 +1739,15 @@ struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint64_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /**
  * @internal
  * The data part, with no function pointers, associated with each ethernet device.
@@ -1789,6 +1798,7 @@ struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /** Device supports link state interrupt */
@@ -1806,6 +1816,30 @@ struct rte_eth_dev_data {
 extern struct rte_eth_dev rte_eth_devices[];
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint64_t rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
  * Iterates over valid ethdev ports.
  *
  * @param port_id
@@ -1816,13 +1850,84 @@ struct rte_eth_dev_data {
 uint16_t rte_eth_find_next(uint16_t port_id);
 
 /**
- * Macro to iterate over all enabled ethdev ports.
+ * Macro to iterate over all enabled and ownerless ethdev ports.
+ */
+#define RTE_ETH_FOREACH_DEV(p) \
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, RTE_ETH_DEV_NO_OWNER)
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get a new unique owner identifier.
+ * An owner identifier lets exactly one DPDK entity own an Ethernet device,
+ * to avoid management of the same device by different entities.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
  */
-#define RTE_ETH_FOREACH_DEV(p)					\
-	for (p = rte_eth_find_next(0);				\
-	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS;	\
-	     p = rte_eth_find_next(p + 1))
+int rte_eth_dev_owner_new(uint64_t *owner_id);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Unset Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @param	owner
+ *  The owner structure pointer to fill.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_get(const uint16_t port_id,
+			  struct rte_eth_dev_owner *owner);
 
 /**
  * Get the total number of Ethernet devices that have been successfully
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index 88b7908..545f7a6 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -202,6 +202,12 @@ EXPERIMENTAL {
 	global:
 
 	rte_eth_dev_is_removed;
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_unset;
+	rte_eth_find_next_owned_by;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
-- 
1.8.3.1


* [dpdk-dev] [PATCH v4 4/7] ethdev: synchronize port allocation
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
                         ` (2 preceding siblings ...)
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Ethernet port allocation was not thread safe: two threads that tried to
allocate a new port at the same time could get the same port identifier
and overwrite each other's memory.
In fact, no port configuration was thread safe from the ethdev point of
view.

The port ownership mechanism added to ethdev is a good opportunity to
redefine the synchronization rules in ethdev:

1. The port allocation and port release synchronization will be
   managed by ethdev.
2. The port usage synchronization will be managed by the port owner.
3. The port ownership synchronization will be managed by ethdev.

Add port allocation synchronization to complete the new rules.
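
As a rough illustration of rule 1, the sketch below models the two-level
locking in plain C11 rather than with DPDK calls. Every name in it
(model_lock_t, shared_data_prepare, port_allocate, MAX_PORTS) is a
hypothetical stand-in, and a static struct plays the role of the shared
memory area; it is not the ethdev implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NAME_LEN 32

/* Minimal spinlock built on a C11 atomic flag (assumption, not rte_spinlock). */
typedef struct { atomic_flag locked; } model_lock_t;

static void model_lock(model_lock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* busy-wait until the holder clears the flag */
}

static void model_unlock(model_lock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Model of the shared area: one lock plus the port name table. */
struct shared_data {
	model_lock_t ownership_lock;
	char name[MAX_PORTS][NAME_LEN]; /* empty name means free slot */
};

static struct shared_data shared_storage;
static struct shared_data *shared; /* NULL until first use */
static model_lock_t shared_alloc_lock = { ATOMIC_FLAG_INIT };

/* Step 1: set up the shared area exactly once, under a local lock. */
static void shared_data_prepare(void)
{
	model_lock(&shared_alloc_lock);
	if (shared == NULL) {
		memset(&shared_storage, 0, sizeof(shared_storage));
		atomic_flag_clear(&shared_storage.ownership_lock.locked);
		shared = &shared_storage;
	}
	model_unlock(&shared_alloc_lock);
}

/*
 * Step 2: search for a free slot and claim it while holding the shared
 * lock, so two callers can never pick the same slot.
 * Returns the port id, or -1 on a bad or duplicate name or a full table.
 */
static int port_allocate(const char *name)
{
	int i, port_id = -1;

	if (name == NULL || name[0] == '\0')
		return -1;

	shared_data_prepare();
	assert(shared != NULL);

	model_lock(&shared->ownership_lock);
	for (i = 0; i < MAX_PORTS; i++) {
		if (strcmp(shared->name[i], name) == 0) {
			port_id = -1; /* name already allocated */
			goto unlock;
		}
		if (port_id < 0 && shared->name[i][0] == '\0')
			port_id = i; /* remember the first free slot */
	}
	if (port_id >= 0) {
		strncpy(shared->name[port_id], name, NAME_LEN - 1);
		shared->name[port_id][NAME_LEN - 1] = '\0';
	}
unlock:
	model_unlock(&shared->ownership_lock);
	return port_id;
}
```

The point of the sketch is that the free-slot search and the claim happen
inside one critical section, with a single unlock path for the error cases.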

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 43 +++++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index af0e072..f616775 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -52,6 +52,9 @@
 /* spinlock for add/remove tx callbacks */
 static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for shared data allocation */
+static rte_spinlock_t rte_eth_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
@@ -202,21 +205,27 @@ struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
 	uint16_t port_id;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	/* Synchronize local threads to allocate shared data only once. */
+	rte_spinlock_lock(&rte_eth_shared_data_lock);
+	if (rte_eth_dev_shared_data == NULL)
+		rte_eth_dev_shared_data_alloc();
+	rte_spinlock_unlock(&rte_eth_shared_data_lock);
+
+	/* Synchronize port creation between primary and secondary threads. */
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
 	port_id = rte_eth_dev_find_free_port();
 	if (port_id == RTE_MAX_ETHPORTS) {
 		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
-		return NULL;
+		goto unlock;
 	}
 
-	if (rte_eth_dev_shared_data == NULL)
-		rte_eth_dev_shared_data_alloc();
-
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
 				name);
-		return NULL;
+		goto unlock;
 	}
 
 	eth_dev = eth_dev_get(port_id);
@@ -224,7 +233,11 @@ struct rte_eth_dev *
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = ETHER_MTU;
 
-	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
+unlock:
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (eth_dev != NULL)
+		_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
 
 	return eth_dev;
 }
@@ -238,10 +251,16 @@ struct rte_eth_dev *
 rte_eth_dev_attach_secondary(const char *name)
 {
 	uint16_t i;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
 
+	/* Synchronize local threads to attach shared data only once. */
+	rte_spinlock_lock(&rte_eth_shared_data_lock);
 	if (rte_eth_dev_shared_data == NULL)
 		rte_eth_dev_shared_data_alloc();
+	rte_spinlock_unlock(&rte_eth_shared_data_lock);
+
+	/* Synchronize port attachment to primary port creation and release. */
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		if (strcmp(rte_eth_dev_shared_data->data[i].name, name) == 0)
@@ -251,12 +270,12 @@ struct rte_eth_dev *
 		RTE_PMD_DEBUG_TRACE(
 			"device %s is not driven by the primary process\n",
 			name);
-		return NULL;
+	} else {
+		eth_dev = eth_dev_get(i);
+		RTE_ASSERT(eth_dev->data->port_id == i);
 	}
 
-	eth_dev = eth_dev_get(i);
-	RTE_ASSERT(eth_dev->data->port_id == i);
-
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
 	return eth_dev;
 }
 
-- 
1.8.3.1


* [dpdk-dev] [PATCH v4 5/7] net/failsafe: free an eth port by a dedicated API
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
                         ` (3 preceding siblings ...)
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 4/7] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Call the dedicated ethdev API to free a port at removal time, as is
already done elsewhere in the fail-safe PMD.
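
To show why a release helper is preferable to writing the state field
directly, here is a small standalone model in C. The types and the
release_port() helper are hypothetical; the real release API also takes
the ownership lock and raises a destroy event, both omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED };

/* Hypothetical per-port data; the owner lives inside it, as in the patch. */
struct port_data {
	char name[32];
	uint64_t owner_id;
};

struct port {
	enum port_state state;
	struct port_data data;
};

/*
 * Model of a dedicated release helper: mark the port unused AND wipe its
 * data, so no stale name or owner survives into the next allocation.
 */
static void release_port(struct port *p)
{
	p->state = PORT_UNUSED;
	memset(&p->data, 0, sizeof(p->data));
}
```

Setting only the state to unused would leave the old owner identifier in
the data area, which is the kind of leftover the helper avoids.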

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..e9b0cfe 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -297,7 +297,7 @@
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {
-			ETH(sdev)->state = RTE_ETH_DEV_UNUSED;
+			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
1.8.3.1


* [dpdk-dev] [PATCH v4 6/7] net/failsafe: use ownership mechanism to own ports
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
                         ` (4 preceding siblings ...)
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Fail-safe PMD sub-device management is based on the ethdev port mechanism,
so the sub-device management structures are exposed to other DPDK
entities, which may use them in parallel with the fail-safe PMD.

Use the new port ownership mechanism to prevent multiple entities from
managing fail-safe PMD sub-devices.
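
The ownership rules this relies on can be sketched as a single-threaded
model in C. The function names below are hypothetical and locking is left
out; the assumption that identifiers start just above the "no owner" value
is mine, since the allocation code is not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NO_OWNER 0        /* models RTE_ETH_DEV_NO_OWNER */
#define MAX_PORTS 4       /* hypothetical table size */
#define OWNER_NAME_LEN 64

struct owner {
	uint64_t id;
	char name[OWNER_NAME_LEN];
};

static struct owner port_owner[MAX_PORTS]; /* zeroed: all ports ownerless */
static uint64_t next_owner_id = NO_OWNER + 1; /* assumed starting value */

/* Hand out a fresh identifier and record the owner name. */
static void owner_init(struct owner *o, const char *name)
{
	o->id = next_owner_id++;
	snprintf(o->name, sizeof(o->name), "%s", name);
}

static int owner_id_is_valid(uint64_t id)
{
	return id != NO_OWNER && id < next_owner_id;
}

/* Take a port: succeeds only if the port is currently ownerless. */
static int owner_set(unsigned int port, const struct owner *o)
{
	if (port >= MAX_PORTS)
		return -ENODEV;
	if (!owner_id_is_valid(o->id))
		return -EINVAL;
	if (port_owner[port].id != NO_OWNER)
		return -EPERM; /* somebody else already owns it */
	port_owner[port] = *o;
	return 0;
}

/* Give a port back: succeeds only for the current owner. */
static int owner_unset(unsigned int port, uint64_t id)
{
	if (port >= MAX_PORTS)
		return -ENODEV;
	if (!owner_id_is_valid(id))
		return -EINVAL;
	if (port_owner[port].id != id)
		return -EPERM;
	memset(&port_owner[port], 0, sizeof(port_owner[port]));
	return 0;
}
```

In this model a second entity that tries to take a sub-device already
owned by fail-safe gets -EPERM and has to leave the port alone.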

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
 drivers/net/failsafe/failsafe.c         | 7 +++++++
 drivers/net/failsafe/failsafe_eal.c     | 6 ++++++
 drivers/net/failsafe/failsafe_private.h | 2 ++
 3 files changed, 15 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index b767352..a1e1c7a 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -196,6 +196,13 @@
 	ret = failsafe_args_parse(dev, params);
 	if (ret)
 		goto free_subs;
+	ret = rte_eth_dev_owner_new(&priv->my_owner.id);
+	if (ret) {
+		ERROR("Failed to get unique owner identifier");
+		goto free_args;
+	}
+	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
+		 FAILSAFE_OWNER_NAME);
 	ret = failsafe_eal_init(dev);
 	if (ret)
 		goto free_args;
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 33a5adf..5f3da06 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -106,6 +106,12 @@
 			INFO("Taking control of a probed sub device"
 			      " %d named %s", i, da->name);
 		}
+		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
+		if (ret) {
+			INFO("sub_device %d owner set failed (%s),"
+			     " will try again later", i, strerror(-ret));
+			continue;
+		}
 		ETH(sdev) = &rte_eth_devices[pid];
 		SUB_ID(sdev) = i;
 		sdev->fs_dev = dev;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 4916365..b377046 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -42,6 +42,7 @@
 #include <rte_devargs.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
+#define FAILSAFE_OWNER_NAME "Fail-safe"
 
 #define PMD_FAILSAFE_MAC_KVARG "mac"
 #define PMD_FAILSAFE_HOTPLUG_POLL_KVARG "hotplug_poll"
@@ -145,6 +146,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_eth_dev_owner my_owner; /* Unique owner. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
-- 
1.8.3.1


* [dpdk-dev] [PATCH v4 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
                         ` (5 preceding siblings ...)
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2018-01-20 21:24       ` Matan Azrad
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-20 21:24 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Testpmd should not use ethdev ports which are managed by other DPDK
entities.

Set Testpmd as the owner of each port that is not used by another entity,
and prevent any use of ethdev ports that are not owned by Testpmd.
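
The two checks an application needs, iterating only over its own ports and
rejecting a port id it does not own, can be modelled in standalone C as
below. The table, the macro and the helper names are hypothetical and only
mirror the shape of the ethdev iterator; they are not the DPDK API.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER 0

enum port_state { PORT_UNUSED = 0, PORT_ATTACHED };

struct port {
	enum port_state state;
	uint64_t owner_id;
};

static struct port ports_model[MAX_PORTS];

/* Return the first attached port >= port_id owned by owner_id. */
static uint16_t find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (ports_model[port_id].state != PORT_ATTACHED ||
		ports_model[port_id].owner_id != owner_id))
		port_id++;
	return port_id < MAX_PORTS ? port_id : MAX_PORTS;
}

/* Iterate over every attached port owned by o. */
#define FOREACH_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (p) < MAX_PORTS; \
	     (p) = find_next_owned_by((uint16_t)((p) + 1), (o)))

/* A port id is usable only if it is attached and owned by the caller. */
static int port_id_is_mine(uint16_t port_id, uint64_t my_id)
{
	return port_id < MAX_PORTS &&
	       ports_model[port_id].state == PORT_ATTACHED &&
	       ports_model[port_id].owner_id == my_id;
}

/* Example use of the iterator: count the ports of one owner. */
static unsigned int count_owned(uint64_t owner_id)
{
	uint16_t p;
	unsigned int n = 0;

	FOREACH_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

Ports owned by another entity are skipped by the loop and rejected by the
check, so the application never touches them.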

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
 app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
 app/test-pmd/cmdline_flow.c |  2 +-
 app/test-pmd/config.c       | 37 ++++++++++---------
 app/test-pmd/parameters.c   |  4 +-
 app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
 app/test-pmd/testpmd.h      |  3 ++
 6 files changed, 103 insertions(+), 95 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 31919ba..6199c64 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
 			&link_speed) < 0)
 		return;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		ports[pid].dev_conf.link_speeds = link_speed;
 	}
 
@@ -1902,7 +1902,7 @@ struct cmd_config_rss {
 	struct cmd_config_rss *res = parsed_result;
 	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
 	int diag;
-	uint8_t i;
+	uint16_t pid;
 
 	if (!strcmp(res->value, "all"))
 		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
@@ -1936,12 +1936,12 @@ struct cmd_config_rss {
 		return;
 	}
 	rss_conf.rss_key = NULL;
-	for (i = 0; i < rte_eth_dev_count(); i++) {
-		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
+		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
 		if (diag < 0)
 			printf("Configuration of RSS hash at ethernet port %d "
 				"failed with error (%d): %s.\n",
-				i, -diag, strerror(-diag));
+				pid, -diag, strerror(-diag));
 	}
 }
 
@@ -3686,10 +3686,9 @@ struct cmd_csum_result {
 	uint64_t csum_offloads = 0;
 	struct rte_eth_dev_info dev_info;
 
-	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
-		printf("invalid port %d\n", res->port_id);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (!port_is_stopped(res->port_id)) {
 		printf("Please stop port %d first\n", res->port_id);
 		return;
@@ -4364,8 +4363,8 @@ struct cmd_gso_show_result {
 {
 	struct cmd_gso_show_result *res = parsed_result;
 
-	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
-		printf("invalid port id %u\n", res->cmd_pid);
+	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
+		printf("invalid/not owned port id %u\n", res->cmd_pid);
 		return;
 	}
 	if (!strcmp(res->cmd_keyword, "gso")) {
@@ -5375,7 +5374,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 				port_id);
 
 		/* Update number of ports */
-		nb_ports = rte_eth_dev_count();
+		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
+			printf("Error: cannot own new attached port %d\n",
+			       port_id);
+			return;
+		}
+		nb_ports++;
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 	}
@@ -5484,10 +5488,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
 	struct cmd_set_bond_mon_period_result *res = parsed_result;
 	int ret;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
 
@@ -5545,11 +5547,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
 	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
 	uint8_t policy = AGG_BANDWIDTH;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n",
-				res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	if (!strcmp(res->policy, "bandwidth"))
 		policy = AGG_BANDWIDTH;
@@ -5808,7 +5807,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_promiscuous_enable(i);
 			else
@@ -5888,7 +5887,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_allmulticast_enable(i);
 			else
@@ -6622,31 +6621,31 @@ static void cmd_showportall_parsed(void *parsed_result,
 	struct cmd_showportall_result *res = parsed_result;
 	if (!strcmp(res->show, "clear")) {
 		if (!strcmp(res->what, "stats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_stats_clear(i);
 		else if (!strcmp(res->what, "xstats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_xstats_clear(i);
 	} else if (!strcmp(res->what, "info"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_infos_display(i);
 	else if (!strcmp(res->what, "stats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_display(i);
 	else if (!strcmp(res->what, "xstats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_xstats_display(i);
 	else if (!strcmp(res->what, "fdir"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			fdir_get_infos(i);
 	else if (!strcmp(res->what, "stat_qmap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_mapping_display(i);
 	else if (!strcmp(res->what, "dcb_tc"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_dcb_info_display(i);
 	else if (!strcmp(res->what, "cap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_offload_cap_display(i);
 }
 
@@ -10698,10 +10697,8 @@ struct cmd_flow_director_mask_result {
 	struct rte_eth_fdir_masks *mask;
 	struct rte_port *port;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -10899,10 +10896,8 @@ struct cmd_flow_director_flex_mask_result {
 	uint16_t i;
 	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11053,12 +11048,10 @@ struct cmd_flow_director_flexpayload_result {
 	struct cmd_flow_director_flexpayload_result *res = parsed_result;
 	struct rte_eth_flex_payload_cfg flex_cfg;
 	struct rte_port *port;
-	int ret = 0;
+	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11774,7 +11767,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
 	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
 	entry.ether_type = res->eth_type_val;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
 	}
 }
@@ -11890,7 +11883,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
 	else
 		en = 0;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_offload_set(pid,
 						  &entry,
 						  ETH_L2_TUNNEL_ENABLE_MASK,
@@ -14440,10 +14433,8 @@ struct cmd_ddp_add_result {
 	int file_num;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14522,10 +14513,8 @@ struct cmd_ddp_del_result {
 	uint32_t size;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14837,10 +14826,8 @@ struct cmd_ddp_get_list_result {
 #endif
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 #ifdef RTE_LIBRTE_I40E_PMD
 	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
@@ -16296,7 +16283,7 @@ struct cmd_cmdfile_result {
 	if (id == (portid_t)RTE_PORT_ALL) {
 		portid_t pid;
 
-		RTE_ETH_FOREACH_DEV(pid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 			/* check if need_reconfig has been set to 1 */
 			if (ports[pid].need_reconfig == 0)
 				ports[pid].need_reconfig = dev;
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 561e057..e55490f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 
 	(void)ctx;
 	(void)token;
-	RTE_ETH_FOREACH_DEV(p) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
 		if (buf && i == ent)
 			return snprintf(buf, size, "%u", p);
 		++i;
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 957b820..43b9a7d 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -156,7 +156,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -236,7 +236,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -253,10 +253,9 @@ struct rss_type_info {
 	struct rte_eth_xstat_name *xstats_names;
 
 	printf("###### NIC extended statistics for port %-2d\n", port_id);
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Error: Invalid port number %i\n", port_id);
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
 
 	/* Get count */
 	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
@@ -321,7 +320,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -439,7 +438,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -756,10 +755,15 @@ struct rss_type_info {
 int
 port_id_is_invalid(portid_t port_id, enum print_warning warning)
 {
+	struct rte_eth_dev_owner owner;
+	int ret;
+
 	if (port_id == (portid_t)RTE_PORT_ALL)
 		return 0;
 
-	if (rte_eth_dev_is_valid_port(port_id))
+	ret = rte_eth_dev_owner_get(port_id, &owner);
+
+	if (ret == 0 && owner.id == my_owner.id)
 		return 0;
 
 	if (warning == ENABLED_WARN)
@@ -2373,7 +2377,7 @@ struct igb_ring_desc_16_bytes {
 		return;
 	}
 	nb_pt = 0;
-	RTE_ETH_FOREACH_DEV(i) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 		if (! ((uint64_t)(1ULL << i) & portmask))
 			continue;
 		portlist[nb_pt++] = i;
@@ -2512,10 +2516,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gro(const char *onoff, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (test_done == 0) {
 		printf("Before enable/disable GRO,"
 				" please stop forwarding first\n");
@@ -2574,10 +2577,9 @@ struct igb_ring_desc_16_bytes {
 
 	param = &gro_ports[port_id].param;
 
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Invalid port id %u.\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (gro_ports[port_id].enable) {
 		printf("GRO type: TCP/IPv4\n");
 		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
@@ -2595,10 +2597,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gso(const char *mode, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (strcmp(mode, "on") == 0) {
 		if (test_done == 0) {
 			printf("before enabling GSO,"
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 878c112..0e57b46 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -398,7 +398,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
@@ -459,7 +459,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index c066cf9..83f5e84 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -108,6 +108,11 @@
 lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
 
 /*
+ * My port owner structure used to own Ethernet ports.
+ */
+struct rte_eth_dev_owner my_owner; /**< Unique owner. */
+
+/*
  * Test Forwarding Configuration.
  *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
  *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
@@ -449,7 +454,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pt_id;
 	int i = 0;
 
-	RTE_ETH_FOREACH_DEV(pt_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
 		fwd_ports_ids[i++] = pt_id;
 
 	nb_cfg_ports = nb_ports;
@@ -573,7 +578,7 @@ static int eth_event_callback(portid_t port_id,
 		fwd_lcores[lc_id]->cpuid_idx = lc_id;
 	}
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		/* Apply default Tx configuration for all ports */
 		port->dev_conf.txmode = tx_mode;
@@ -706,7 +711,7 @@ static int eth_event_callback(portid_t port_id,
 	queueid_t q;
 
 	/* set socket id according to numa or not */
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		if (nb_rxq > port->dev_info.max_rx_queues) {
 			printf("Fail: nb_rxq(%d) is greater than "
@@ -1000,9 +1005,8 @@ static int eth_event_callback(portid_t port_id,
 	uint64_t tics_per_1sec;
 	uint64_t tics_datum;
 	uint64_t tics_current;
-	uint8_t idx_port, cnt_ports;
+	uint16_t idx_port;
 
-	cnt_ports = rte_eth_dev_count();
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
@@ -1017,11 +1021,10 @@ static int eth_event_callback(portid_t port_id,
 			tics_current = rte_rdtsc();
 			if (tics_current - tics_datum >= tics_per_1sec) {
 				/* Periodic bitrate calculation */
-				for (idx_port = 0;
-						idx_port < cnt_ports;
-						idx_port++)
+				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
+							     my_owner.id)
 					rte_stats_bitrate_calc(bitrate_data,
-						idx_port);
+							       idx_port);
 				tics_datum = tics_current;
 			}
 		}
@@ -1359,7 +1362,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		/* Check if there is a port which is not started */
 		if ((port->port_status != RTE_PORT_STARTED) &&
@@ -1387,7 +1390,7 @@ static int eth_event_callback(portid_t port_id,
 {
 	portid_t pi;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (!port_is_stopped(pi))
 			return 0;
 	}
@@ -1434,7 +1437,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if(dcb_config)
 		dcb_test = 1;
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1620,7 +1623,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Stopping ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1663,7 +1666,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Closing ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1714,7 +1717,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Resetting ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1759,6 +1762,12 @@ static int eth_event_callback(portid_t port_id,
 	if (rte_eth_dev_attach(identifier, &pi))
 		return;
 
+	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
+		printf("Error: cannot own new attached port %d\n", pi);
+		return;
+	}
+	nb_ports++;
+
 	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
 	/* if socket_id is invalid, set to 0 */
 	if (check_socket_id(socket_id) < 0)
@@ -1766,8 +1775,6 @@ static int eth_event_callback(portid_t port_id,
 	reconfig(pi, socket_id);
 	rte_eth_promiscuous_enable(pi);
 
-	nb_ports = rte_eth_dev_count();
-
 	ports[pi].port_status = RTE_PORT_STOPPED;
 
 	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
@@ -1781,6 +1788,9 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Detaching a port...\n");
 
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
 	if (!port_is_closed(port_id)) {
 		printf("Please close port first\n");
 		return;
@@ -1794,7 +1804,7 @@ static int eth_event_callback(portid_t port_id,
 		return;
 	}
 
-	nb_ports = rte_eth_dev_count();
+	nb_ports--;
 
 	printf("Port '%s' is detached. Now total ports is %d\n",
 			name, nb_ports);
@@ -1812,7 +1822,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if (ports != NULL) {
 		no_link_check = 1;
-		RTE_ETH_FOREACH_DEV(pt_id) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
@@ -1844,7 +1854,7 @@ struct pmd_test_command {
 	fflush(stdout);
 	for (count = 0; count <= MAX_CHECK_TIME; count++) {
 		all_ports_up = 1;
-		RTE_ETH_FOREACH_DEV(portid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
 			if ((port_mask & (1 << portid)) == 0)
 				continue;
 			memset(&link, 0, sizeof(link));
@@ -1936,6 +1946,8 @@ struct pmd_test_command {
 
 	switch (type) {
 	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, ENABLED_WARN))
+			break;
 		if (rte_eal_alarm_set(100000,
 				rmv_event_callback, (void *)(intptr_t)port_id))
 			fprintf(stderr, "Could not set up deferred device removal\n");
@@ -2068,7 +2080,7 @@ struct pmd_test_command {
 	portid_t pid;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		port->dev_conf.fdir_conf = fdir_conf;
 		if (nb_rxq > 1) {
@@ -2383,7 +2395,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV(port_id)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		TESTPMD_LOG(WARNING, "No probed ethernet devices\n");
 
@@ -2431,7 +2448,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
 
 	/* set all ports to promiscuous mode by default */
-	RTE_ETH_FOREACH_DEV(port_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
 		rte_eth_promiscuous_enable(port_id);
 
 	/* Init metrics library */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9c739e5..2d253b9 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -50,6 +50,8 @@
 #define NUMA_NO_CONFIG 0xFF
 #define UMA_NO_CONFIG  0xFF
 
+#define TESTPMD_OWNER_NAME "TestPMD"
+
 typedef uint8_t  lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
@@ -361,6 +363,7 @@ struct queue_stats_mappings {
  * nb_fwd_ports <= nb_cfg_ports <= nb_ports
  */
 extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
+extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
 extern portid_t nb_cfg_ports; /**< Number of configured ports. */
 extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
 extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
-- 
1.8.3.1

* Re: [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership Matan Azrad
@ 2018-01-21 20:43         ` Ferruh Yigit
  2018-01-21 20:46         ` Ferruh Yigit
  1 sibling, 0 replies; 214+ messages in thread
From: Ferruh Yigit @ 2018-01-21 20:43 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

On 1/20/2018 9:24 PM, Matan Azrad wrote:
> The ownership of a port is implicit in DPDK.
> Making it explicit is better from the next reasons:
> 1. It will define well who is in charge of the port usage synchronization.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the application is not trying to use a port which
> is already managed by fail-safe.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to owns valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The mechanism is synchronized for both the primary process threads and
> the secondary processes threads to allow secondary process entity to be
> a port owner.
> 
> Add a synchronized ownership mechanism to DPDK Ethernet devices to
> avoid multiple management of a device by different DPDK entities.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

<...>

> @@ -273,6 +289,144 @@ struct rte_eth_dev *
>  		return 1;
>  }
>  
> +static int
> +rte_eth_is_valid_owner_id(uint64_t owner_id)
> +{
> +	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
> +	    rte_eth_dev_shared_data->next_owner_id <= owner_id) {
> +		RTE_LOG(ERR, EAL, "Invalid owner_id=%016lX.\n", owner_id);

This breaks the build [1]. Also, why use the EAL log type here? A few places
use it and a few others use the PMD log type; please fix the log types.

[1]
...dpdk/lib/librte_ether/rte_ethdev.c:372:59: error: format ‘%lX’ expects
argument of type ‘long unsigned int’, but argument 4 has type ‘uint64_t {aka
long long unsigned int}’ [-Werror=format=]

   RTE_LOG(ERR, EAL, "Invalid owner_id=%016lX.\n", owner_id);
                                                           ^
...dpdk/i686-native-linuxapp-gcc/include/rte_log.h:288:25: note: in definition
of macro ‘RTE_LOG’
    RTE_LOGTYPE_ ## t, # t ": " __VA_ARGS__)
                         ^
...dpdk/lib/librte_ether/rte_ethdev.c: In function ‘_rte_eth_dev_owner_set’:

...dpdk/lib/librte_ether/rte_ethdev.c:421:18: error: format ‘%lX’ expects
argument of type ‘long unsigned int’, but argument 6 has type ‘uint64_t {aka
long long unsigned int}’ [-Werror=format=]

    port_owner->id);
    ~~~~~~~~~~~~~ ^
...dpdk/i686-native-linuxapp-gcc/include/rte_log.h:288:25: note: in definition
of macro ‘RTE_LOG’
    RTE_LOGTYPE_ ## t, # t ": " __VA_ARGS__)
                         ^
cc1: all warnings being treated as errors
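
For illustration only, one portable way to print a uint64_t is the PRIX64
macro from <inttypes.h>, which expands to the correct length modifier on
both 32-bit and 64-bit targets. The helper name format_owner_id below is
hypothetical and is not part of the patch; it is a minimal sketch of the
format string, not the fix that was actually applied:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): format an owner id as
 * 16 upper-case hex digits. "%016" PRIX64 replaces "%016lX", which is
 * only correct where uint64_t is unsigned long.
 */
static int
format_owner_id(char *buf, size_t len, uint64_t owner_id)
{
	return snprintf(buf, len, "%016" PRIX64, owner_id);
}
```

The same "%016" PRIX64 string could be passed directly to RTE_LOG() in
place of "%016lX".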

* Re: [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership Matan Azrad
  2018-01-21 20:43         ` Ferruh Yigit
@ 2018-01-21 20:46         ` Ferruh Yigit
  1 sibling, 0 replies; 214+ messages in thread
From: Ferruh Yigit @ 2018-01-21 20:46 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

On 1/20/2018 9:24 PM, Matan Azrad wrote:
> The ownership of a port is implicit in DPDK.
> Making it explicit is better from the next reasons:
> 1. It will define well who is in charge of the port usage synchronization.
> 2. A library could work on top of a port.
> 3. A port can work on top of another port.
> 
> Also in the fail-safe case, an issue has been met in testpmd.
> We need to check that the application is not trying to use a port which
> is already managed by fail-safe.
> 
> A port owner is built from owner id(number) and owner name(string) while
> the owner id must be unique to distinguish between two identical entity
> instances and the owner name can be any name.
> The name helps to logically recognize the owner by different DPDK
> entities and allows easy debug.
> Each DPDK entity can allocate an owner unique identifier and can use it
> and its preferred name to owns valid ethdev ports.
> Each DPDK entity can get any port owner status to decide if it can
> manage the port or not.
> 
> The mechanism is synchronized for both the primary process threads and
> the secondary processes threads to allow secondary process entity to be
> a port owner.
> 
> Add a synchronized ownership mechanism to DPDK Ethernet devices to
> avoid multiple management of a device by different DPDK entities.
> 
> The current ethdev internal port management is not affected by this
> feature.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Thomas Monjalon <thomas@monjalon.net>
> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

<...>

> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Unset Ethernet device owner to make the device ownerless.
> + *
> + * @param	port_id
> + *  The identifier of port to make ownerless.
> + * @param	owner
> + *  The owner identifier.

Causing doc build warning: s/owner/owner_id

> + * @return
> + *  0 on success, negative errno value on error.
> + */
> +int rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Remove owner from all Ethernet devices owned by a specific owner.
> + *
> + * @param	owner

Causing doc build warning: s/owner/owner_id

> + *  The owner identifier.
> + */
> +void rte_eth_dev_owner_delete(const uint64_t owner_id);
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Get the owner of an Ethernet device.
> + *
> + * @param	port_id
> + *  The port identifier.
> + * @param	owner
> + *  The owner structure pointer to fill.
> + * @return
> + *  0 on success, negative errno value on error..
> + */
> +int rte_eth_dev_owner_get(const uint16_t port_id,
> +			  struct rte_eth_dev_owner *owner);

* Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
  2018-01-19 18:10                                                   ` Thomas Monjalon
@ 2018-01-21 22:12                                                     ` Ferruh Yigit
  0 siblings, 0 replies; 214+ messages in thread
From: Ferruh Yigit @ 2018-01-21 22:12 UTC (permalink / raw)
  To: Thomas Monjalon, Neil Horman
  Cc: Matan Azrad, Ananyev, Konstantin, Gaetan Rivet, Wu, Jingjing,
	dev, Richardson, Bruce

On 1/19/2018 6:10 PM, Thomas Monjalon wrote:
> 19/01/2018 18:37, Neil Horman:
>> On Fri, Jan 19, 2018 at 06:09:47PM +0100, Thomas Monjalon wrote:
>>> 19/01/2018 15:32, Neil Horman:
>>>> On Fri, Jan 19, 2018 at 03:07:28PM +0100, Thomas Monjalon wrote:
>>>>> 19/01/2018 14:57, Neil Horman:
>>>>>>>> I specifically pointed that out above.  There is no reason an owernship record
>>>>>>>> couldn't be added to the rte_eth_dev structure.
>>>>>>>
>>>>>>> Sorry, don't understand why.
>>>>>>>
>>>>>> Because, thats the resource your trying to protect, and the object you want to
>>>>>> identify ownership of, no?
>>>>>
>>>>> No
>>>>> The rte_eth_dev structure is the port representation in the process.
>>>>> The rte_eth_dev_data structure is the port represenation across multi-process.
>>>>> The ownership must be in rte_eth_dev_data to cover multi-process protection.
>>>>>
>>>> Ok.   You get the idea though right?  That the port representation,
>>>> for some definition thereof, should embody the ownership state.
>>>> Neil
>>>
>>> Not sure to understand your question.
>>>
>> There is no real question here, only confirming that we are saying the same
>> thing.  I misspoke when I indicated ownership information should be embodied in
>> rte_eth_dev rather than its shared data.  But regardless, the concept is the
>> same
> 
> Yes we agree.
> And I think it is what Matan did.
> The owner is in struct rte_eth_dev_data:

Hi Thomas, Neil,

Sorry, I was not able to follow this thread. Is the discussion concluded?

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-20 18:14                 ` Matan Azrad
@ 2018-01-22 10:17                   ` Gaëtan Rivet
  2018-01-22 11:22                     ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-22 10:17 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi Matan,

On Sat, Jan 20, 2018 at 06:14:13PM +0000, Matan Azrad wrote:

<snip>

> > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > >  			&link_speed) < 0)
> > > > > > >  		return;
> > > > > > >
> > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > > >
> > > > > > Why do we need all these changes?
> > > > > > As I understand you changed definition of RTE_ETH_FOREACH_DEV(),
> > > > > > so no testpmd should work ok default (no_owner case).
> > > > > > Am I missing something here?
> > > > >
> > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will iterate
> > > > over all valid and ownerless ports.

To be clear: you did not implement what I suggested, but your own
interpretation of it. Please do not write as if I validated this
interpretation.

Essentially, the NO_OWNER semantic is completely different from a
default owner. A default owner would protect ports from race conditions
and force port ownership requests to go through proper channels
protected by critical sections.

NO_OWNER means that anyone is free to take any ownerless port at any
time. As a result, you are forced here to modify an existing
application so that any entity using your ownership API can function
with it.

This is very different from what I suggested. What I said was that I
wanted the most common case to be taken care of, and for existing
applications to continue working. It entails having a more complicated
API, but I think this is a price we should pay.

You are implementing the most common case in testpmd (the app entity
creating an owner and putting its valid ports within). Your API should
ease that up as much as possible before considering forcing everyone to
work with it.

                           ~*~

You implemented a way for the failsafe to capture existing ports.
How does it work without the channels for requesting ports suggested above?

Regards,
-- 
Gaëtan Rivet
6WIND

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 10:17                   ` Gaëtan Rivet
@ 2018-01-22 11:22                     ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 11:22 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi Gaetan

From: Gaëtan Rivet, Monday, January 22, 2018 12:17 PM
> Hi Matan,
> 
> On Sat, Jan 20, 2018 at 06:14:13PM +0000, Matan Azrad wrote:
> 
> <snip>
> 
> > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > >  			&link_speed) < 0)
> > > > > > > >  		return;
> > > > > > > >
> > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > > > >
> > > > > > > Why do we need all these changes?
> > > > > > > As I understand you changed definition of
> > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok default
> (no_owner case).
> > > > > > > Am I missing something here?
> > > > > >
> > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > > iterate
> > > > > over all valid and ownerless ports.
> 
> To be clear: you did not implement what I suggested, but your own
> interpretation of it. Please do not write as if I validated this interpretation.
> 
> Essentially, the NO_OWNER semantic is completely different from a default
> owner. A default owner would protect ports from race conditions and force
> port ownership requests to go through proper channels protected by critical
> sections.
> 

Please explain this further.
Do you want every created port to be owned by a default owner (the app)?
If so, how can another DPDK entity take control of a port?

> NO_OWNER means that anyone is free to take any ownerless port at any
> time. And as a result, your are thus forced here to fix this by modifying an
> existing application for any entity using your ownership API to function with
> it.
>

Yes, I think it should be explicit!
Because hotplug is in play and a port can be created or released at any time,
any DPDK entity should know about its ports and own them.

> This is very different from what I suggested. What I said was that I wanted
> the most common case to be taken care of, and for existing applications to
> continue working. It entails having a more complicated API, but I think this is
> a price we should pay.
> 

So, please define the common case you are talking about.
And if you have an idea of how to adjust port ownership to take care of it, I will be happy to hear it.

> You are implementing the most common case in testpmd (the app entity
> creating an owner and putting its valid ports within). Your API should ease
> that up as much as possible before considering forcing everyone to work
> with it.

I don't think it is complicated.

>                            ~*~
> 
> You implemented a way for the failsafe to capture existing ports.
> How does it work without the channels for requesting ports suggested
> above?

If the port has no owner, fail-safe will just take ownership of it and manage it; otherwise it will try to take ownership again at the next hotplug alarm.
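
The "take it only if ownerless, otherwise retry later" behaviour can be
sketched with a toy model. Everything below (toy_port, toy_owner_set,
TOY_NO_OWNER, the error codes chosen) is hypothetical and only stands in
for the ethdev structures; it is not the API from the patch, and the
locking that the real mechanism needs is deliberately left out:

```c
#include <errno.h>
#include <stdint.h>

#define TOY_NO_OWNER  0	/* stands in for RTE_ETH_DEV_NO_OWNER */
#define TOY_MAX_PORTS 4

struct toy_port {
	int valid;		/* non-zero when the port slot is in use */
	uint64_t owner_id;	/* TOY_NO_OWNER while the port is ownerless */
};

static struct toy_port toy_ports[TOY_MAX_PORTS];

/*
 * Take ownership only if the port is valid and currently ownerless.
 * Unsynchronized on purpose: this shows the decision, not the locking.
 */
static int
toy_owner_set(uint16_t port_id, uint64_t owner_id)
{
	if (port_id >= TOY_MAX_PORTS || !toy_ports[port_id].valid)
		return -ENODEV;
	if (owner_id == TOY_NO_OWNER)
		return -EINVAL;
	if (toy_ports[port_id].owner_id != TOY_NO_OWNER)
		return -EPERM;	/* already owned: caller may retry later */
	toy_ports[port_id].owner_id = owner_id;
	return 0;
}
```

In this model a caller that gets -EPERM simply leaves the port alone and
tries again on its next periodic check, which is the fail-safe behaviour
described above.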

> 
> Regards,
> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-19 15:00               ` Gaëtan Rivet
  2018-01-20 18:14                 ` Matan Azrad
@ 2018-01-22 12:28                 ` Ananyev, Konstantin
  2018-01-22 13:22                   ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-22 12:28 UTC (permalink / raw)
  To: Gaëtan Rivet, Matan Azrad
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Hi lads,

> 
> Hi Matan,
> 
> On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > Hi Konstantin
> >
> > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > -----Original Message-----
> > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>;
> > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > > Bruce <bruce.richardson@intel.com>
> > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > > >
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>;
> > > Wu,
> > > > > Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > > > Bruce <bruce.richardson@intel.com>
> > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > ownership
> > > > >
> > > > > Hi Matan,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > Richardson,
> > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > <konstantin.ananyev@intel.com>
> > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > > > > >
> > > > > > Testpmd should not use ethdev ports which are managed by other
> > > > > > DPDK entities.
> > > > > >
> > > > > > Set Testpmd ownership to each port which is not used by other
> > > > > > entity and prevent any usage of ethdev ports which are not owned by
> > > Testpmd.
> > > > > >
> > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > ---
> > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++-----------------
> > > ----
> > > > > -----
> > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > >
> > > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > > > > 31919ba..6199c64 100644
> > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > >  			&link_speed) < 0)
> > > > > >  		return;
> > > > > >
> > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > >
> > > > > Why do we need all these changes?
> > > > > As I understand you changed definition of RTE_ETH_FOREACH_DEV(), so
> > > > > no testpmd should work ok default (no_owner case).
> > > > > Am I missing something here?
> > > >
> > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will iterate
> > > over all valid and ownerless ports.
> > >
> > > Yes.
> > >
> > > > Here Testpmd wants to iterate over its owned ports.
> > >
> > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > As I understand it would be enough to fix current problems and would allow
> > > us to avoid any changes in testmpd (which I think is a good thing).
> >
> > Yes, I understand that this big change is very daunted, But I think the current a lot of bugs in testpmd(regarding port ownership) even more
> daunted.
> >
> > Look,
> > Testpmd initiates some of its internal databases depends on specific port iteration,
> > In some time someone may take ownership of Testpmd ports and testpmd will continue to touch them.

But if someone takes the ownership (assigns a new owner_id), that port will not appear
in RTE_ETH_FOREACH_DEV() any more.
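
To make that concrete, here is a toy model of an owner-filtered iterator.
All names (demo_owner, demo_find_next_owned_by, DEMO_FOREACH_OWNED_BY) are
hypothetical, every port is assumed valid, and this is not the ethdev
implementation, only a sketch of the filtering idea:

```c
#include <stdint.h>

#define DEMO_NO_OWNER  0	/* stands in for RTE_ETH_DEV_NO_OWNER */
#define DEMO_MAX_PORTS 4

/* Owner id per port; all ports start ownerless. */
static uint64_t demo_owner[DEMO_MAX_PORTS];

/*
 * Return the first port >= port_id whose owner equals owner_id,
 * or DEMO_MAX_PORTS when there is none left.
 */
static uint16_t
demo_find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < DEMO_MAX_PORTS && demo_owner[port_id] != owner_id)
		port_id++;
	return port_id;
}

#define DEMO_FOREACH_OWNED_BY(p, o) \
	for ((p) = demo_find_next_owned_by(0, (o)); \
	     (p) < DEMO_MAX_PORTS; \
	     (p) = demo_find_next_owned_by((uint16_t)((p) + 1), (o)))

/* Count the ports an "ownerless" iteration would visit. */
static unsigned int
demo_count_ownerless(void)
{
	uint16_t p;
	unsigned int n = 0;

	DEMO_FOREACH_OWNED_BY(p, DEMO_NO_OWNER)
		n++;
	return n;
}
```

In this model, once demo_owner[1] is set to a non-zero id, port 1 drops
out of the ownerless iteration and is visited only by an iteration that
passes that id.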

> >
> 
> If I look back on the fail-safe, its sole purpose is to have seamless
> hotplug with existing applications.
> 
> Port ownership is a genericization of some functions introduced by the
> fail-safe, that could structure DPDK further. It should allow
> applications to have a seamless integration with subsystems using port
> ownership. Without this, port ownership cannot be used.
> 
> Testpmd should be fixed, but follow the most common design patterns of
> DPDK applications. Going with port ownership seems like a paradigm
> shift.
> 
> > In addition
> > Using the old iterator in some places in testpmd will cause a race for run-time new ports(can be created by failsafe or any hotplug code):
> > - testpmd finds an ownerless port(just now created) by the old iterator and start traffic there,
> > - failsafe takes ownership of this new port and start traffic there.
> > Problem!

Could you shed a bit more light here: between whom would the race condition be?
As I remember, in testpmd all control ops are done within one thread (main lcore).
The only way to attach/detach a port with it is to invoke the testpmd CLI "attach/detach" port command.

Konstantin

> 
> Testpmd does not handle detection of new port. If it did, testing
> fail-safe with it would be wrong.
> 
> At startup, RTE_ETH_FOREACH_DEV already fixed the issue of registering
> DEFERRED ports. There are still remaining issues regarding this, but I
> think they should be fixed. The architecture does not need to be
> completely moved to port ownership.
> 
> If anything, this should serve as a test for your API with common
> applications. I think you'd prefer to know and debug with testpmd
> instead of firing up VPP or something like that to determine what went
> wrong with using the fail-safe.
> 
> >
> > In addition
> > As a good example for well-done application (free from ownership bugs) I tried here to adjust Tespmd to the new rules and BTW to fix a
> lot of bugs.
> 
> Testpmd has too much cruft, it won't ever be a good example of a
> well-done application.
> 
> If you want to demonstrate ownership, I think you should start an
> example application demonstrating race conditions and their mitigation.
> 
> I think that would be interesting for many DPDK users.
> 
> >
> >
> > So actually applications which are not aware to the port ownership still are exposed to races, but if there use the old iterator(with the new
> change) the amount of races decreases.
> >
> > Thanks, Matan.
> > > Konstantin
> > >
> > > >
> > > > I added to Testpmd ability to take an ownership of ports as the new
> > > > ownership and synchronization rules suggested, Since Tespmd is a DPDK
> > > > entity which wants that no one will touch its owned ports, It must allocate
> > > an unique ID, set owner for its ports (see in main function) and recognizes
> > > them by its owner ID.
> > > >
> > > > > Konstantin
> > > > >
> 
> Regards,
> --
> Gaëtan Rivet
> 6WIND

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 12:28                 ` Ananyev, Konstantin
@ 2018-01-22 13:22                   ` Matan Azrad
  2018-01-22 20:48                     ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 13:22 UTC (permalink / raw)
  To: Ananyev, Konstantin, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce


Hi 
From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Hi lads,
> 
> >
> > Hi Matan,
> >
> > On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > > Hi Konstantin
> > >
> > > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > > -----Original Message-----
> > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>;
> > > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > ownership
> > > > >
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>;
> > > > Wu,
> > > > > > Jingjing <jingjing.wu@intel.com>
> > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > ownership
> > > > > >
> > > > > > Hi Matan,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing
> > > > > > > <jingjing.wu@intel.com>
> > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > Richardson,
> > > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > ownership
> > > > > > >
> > > > > > > Testpmd should not use ethdev ports which are managed by
> > > > > > > other DPDK entities.
> > > > > > >
> > > > > > > Set Testpmd ownership to each port which is not used by
> > > > > > > other entity and prevent any usage of ethdev ports which are
> > > > > > > not owned by
> > > > Testpmd.
> > > > > > >
> > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > ---
> > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++----------
> -------
> > > > ----
> > > > > > -----
> > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++--------
> ----
> > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > >
> > > > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> > > > > > > index
> > > > > > > 31919ba..6199c64 100644
> > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > >  			&link_speed) < 0)
> > > > > > >  		return;
> > > > > > >
> > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > > >
> > > > > > Why do we need all these changes?
> > > > > > As I understand you changed definition of
> > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok default
> (no_owner case).
> > > > > > Am I missing something here?
> > > > >
> > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > iterate
> > > > over all valid and ownerless ports.
> > > >
> > > > Yes.
> > > >
> > > > > Here Testpmd wants to iterate over its owned ports.
> > > >
> > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > As I understand it would be enough to fix current problems and
> > > > would allow us to avoid any changes in testmpd (which I think is a good
> thing).
> > >
> > > Yes, I understand that this big change is very daunted, But I think
> > > the current a lot of bugs in testpmd(regarding port ownership) even
> > > more
> > daunted.
> > >
> > > Look,
> > > Testpmd initiates some of its internal databases depends on specific
> > > port iteration, In some time someone may take ownership of Testpmd
> ports and testpmd will continue to touch them.
> 
> But if someone will take the ownership (assign new owner_id) that port will
> not appear in RTE_ETH_FOREACH_DEV() any more.
> 

Yes, but testpmd sometimes relies on an internal database that was filled by a previous iteration.
So it keeps using entries recorded by that old iteration, even after the port's owner has changed.

> > >
> >
> > If I look back on the fail-safe, its sole purpose is to have seamless
> > hotplug with existing applications.
> >
> > Port ownership is a genericization of some functions introduced by the
> > fail-safe, that could structure DPDK further. It should allow
> > applications to have a seamless integration with subsystems using port
> > ownership. Without this, port ownership cannot be used.
> >
> > Testpmd should be fixed, but follow the most common design patterns of
> > DPDK applications. Going with port ownership seems like a paradigm
> > shift.
> >
> > > In addition
> > > Using the old iterator in some places in testpmd will cause a race for run-
> time new ports(can be created by failsafe or any hotplug code):
> > > - testpmd finds an ownerless port(just now created) by the old
> > > iterator and start traffic there,
> > > - failsafe takes ownership of this new port and start traffic there.
> > > Problem!
> 
> Could you shed a bit more light here - it would be race condition between
> whom and whom?

Sure.

> As I remember in testpmd all control ops are done within one thread (main
> lcore).

But another DPDK entity can use another thread, for example:
Failsafe uses the host thread (through an alarm callback) to create a new port and to take ownership of a port.

The race:
Testpmd iterates over all ports from the master thread.
Failsafe takes ownership of a port from the host thread and starts using it.
=> The two DPDK entities may use the device at the same time!

Obeying the new ownership rules can prevent all these races.
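
To illustrate the race described above, here is a minimal, self-contained sketch. It is not code from this series: all demo_* names are hypothetical, and a C11 compare-and-swap stands in for the spinlock that the ethdev patch uses, only to keep the example short. It shows that when every entity claims a port before touching it, only one of two competing claimants can succeed.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NO_OWNER 0

/* Hypothetical per-port owner id; DEMO_NO_OWNER means ownerless. */
static _Atomic uint64_t demo_owner[DEMO_MAX_PORTS];

/* Make every port ownerless again. */
void demo_reset(void)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		atomic_store(&demo_owner[i], DEMO_NO_OWNER);
}

/* Claim a port only if it is ownerless; at most one claimant succeeds. */
int demo_try_own(uint16_t port_id, uint64_t owner_id)
{
	uint64_t expected = DEMO_NO_OWNER;

	assert(port_id < DEMO_MAX_PORTS && owner_id != DEMO_NO_OWNER);
	if (atomic_compare_exchange_strong(&demo_owner[port_id], &expected,
					   owner_id))
		return 0;
	return -EPERM;
}

/* An entity may touch a port only while it is the recorded owner. */
int demo_may_use(uint16_t port_id, uint64_t owner_id)
{
	assert(port_id < DEMO_MAX_PORTS);
	return atomic_load(&demo_owner[port_id]) == owner_id;
}
```

In this model, if failsafe (say owner id 1) claims port 2 first, a later demo_try_own(2, 2) from testpmd returns -EPERM, so testpmd knows it must not start traffic on that port.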

> The only way to attach/detach port with it - invoke testpmd CLI
> "attach/detach" port.
> 
> Konstantin
> 
> >
> > Testpmd does not handle detection of new port. If it did, testing
> > fail-safe with it would be wrong.
> >
> > At startup, RTE_ETH_FOREACH_DEV already fixed the issue of registering
> > DEFERRED ports. There are still remaining issues regarding this, but I
> > think they should be fixed. The architecture does not need to be
> > completely moved to port ownership.
> >
> > If anything, this should serve as a test for your API with common
> > applications. I think you'd prefer to know and debug with testpmd
> > instead of firing up VPP or something like that to determine what went
> > wrong with using the fail-safe.
> >
> > >
> > > In addition
> > > As a good example for well-done application (free from ownership
> > > bugs) I tried here to adjust Tespmd to the new rules and BTW to fix
> > > a
> > lot of bugs.
> >
> > Testpmd has too much cruft, it won't ever be a good example of a
> > well-done application.
> >
> > If you want to demonstrate ownership, I think you should start an
> > example application demonstrating race conditions and their mitigation.
> >
> > I think that would be interesting for many DPDK users.
> >
> > >
> > >
> > > So actually applications which are not aware to the port ownership
> > > still are exposed to races, but if there use the old iterator(with
> > > the new
> > change) the amount of races decreases.
> > >
> > > Thanks, Matan.
> > > > Konstantin
> > > >
> > > > >
> > > > > I added to Testpmd ability to take an ownership of ports as the
> > > > > new ownership and synchronization rules suggested, Since Tespmd
> > > > > is a DPDK entity which wants that no one will touch its owned
> > > > > ports, It must allocate
> > > > an unique ID, set owner for its ports (see in main function) and
> > > > recognizes them by its owner ID.
> > > > >
> > > > > > Konstantin
> > > > > >
> >
> > Regards,
> > --
> > Gaëtan Rivet
> > 6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
                         ` (6 preceding siblings ...)
  2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-22 16:38       ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 1/7] ethdev: fix port data reset timing Matan Azrad
                           ` (7 more replies)
  7 siblings, 8 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Add an ownership mechanism to DPDK Ethernet devices to prevent a device from being managed by several DPDK entities at once.
The port ownership mechanism is a good opportunity to redefine the synchronization rules in ethdev:

	1. Port allocation and port release synchronization is managed by ethdev.
	2. Port usage synchronization is managed by the port owner.
	3. Port ownership synchronization is managed by ethdev.
	4. A DPDK entity which wants to use a port safely must take ownership of it first.
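
The ownership life cycle implied by these rules can be sketched as follows. This is a simplified single-threaded model, not the ethdev implementation: the demo_* names are hypothetical and only loosely mirror the owner new/set/unset APIs added by this series, and the locking that ethdev performs (rule 3) is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NO_OWNER 0
#define DEMO_NAME_LEN 64

struct demo_owner {
	uint64_t id;              /* unique owner identifier */
	char name[DEMO_NAME_LEN]; /* free-form owner name */
};

/* Hypothetical shared state; the real one is protected by a lock. */
static uint64_t demo_next_owner_id = DEMO_NO_OWNER + 1;
static struct demo_owner demo_port_owner[DEMO_MAX_PORTS];

/* Hand out a fresh owner id. */
uint64_t demo_owner_new(void)
{
	return demo_next_owner_id++;
}

/* Take an ownerless port; fails with -EPERM if it is already owned. */
int demo_owner_set(uint16_t port_id, const struct demo_owner *owner)
{
	assert(port_id < DEMO_MAX_PORTS);
	if (owner->id == DEMO_NO_OWNER)
		return -EINVAL;
	if (demo_port_owner[port_id].id != DEMO_NO_OWNER)
		return -EPERM;
	demo_port_owner[port_id].id = owner->id;
	snprintf(demo_port_owner[port_id].name, DEMO_NAME_LEN, "%s",
		 owner->name);
	return 0;
}

/* Give a port back; only the current owner may do so. */
int demo_owner_unset(uint16_t port_id, uint64_t owner_id)
{
	assert(port_id < DEMO_MAX_PORTS);
	if (owner_id == DEMO_NO_OWNER)
		return -EINVAL;
	if (demo_port_owner[port_id].id != owner_id)
		return -EPERM;
	demo_port_owner[port_id].id = DEMO_NO_OWNER;
	demo_port_owner[port_id].name[0] = '\0';
	return 0;
}

/* Count the ports currently recorded as owned by owner_id. */
unsigned int demo_count_owned_by(uint64_t owner_id)
{
	unsigned int n = 0;
	uint16_t p;

	for (p = 0; p < DEMO_MAX_PORTS; p++)
		if (demo_port_owner[p].id == owner_id)
			n++;
	return n;
}
```

An entity would call demo_owner_new() once, then demo_owner_set() on each port it intends to manage (rule 4), and would afterwards only touch ports for which the recorded owner id is its own. Synchronizing the actual port usage (rule 2) remains the owner's job and is not modeled here.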


V2:  
Synchronize ethdev port creation.
Synchronize port ownership mechanism.
Rename owner remove API to rte_eth_dev_owner_unset.
Remove "ethdev: free a port by a dedicated API" patch - moved to another series.
Add "ethdev: fix port data reset timing" patch.
Change owner get API to return an int value and to pass a copy of the owner structure.
Adjust testpmd to the improved owner get API.
Adjust documentation.

V3:
Change RTE_ETH_FOREACH_DEV iterator to skip owned ports(Gaetan suggestion).
Prevent goto in set/unset APIs by adding an internal API - this also adds code reuse (Konstantin suggestion).
Group all the shared processes variables in one struct to allow easy allocation of it(Konstantin suggestion).
Take owner name truncation as warning and not as error(Konstantin suggestion).
Mark the new APIs as EXPERIMENTAL.
Rebase on top of master_net_mlx.
Rebase on top of "[PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack" series.
Rebase on top of "[PATCH v5 0/8] Introduce virtual driver for Hyper-V/Azure platforms" .
Add "ethdev: fix used portid allocation" patch suggested by Konstantin.

V4:
Share => shared in ethdev patches (Thomas suggestion).
Rephrase some code comments (Thomas suggestion).
Fix compilation issue caused by wrong rebase with "fix used portid allocation" patch.
Add assert check for the correct port state to above fix patch.

V5:
Use a different print message type as Ferruh suggested.
Fix the name parameter description in set/unset APIs (Ferruh suggestion).
Rebase on top of 18.02-rc1.
Fix issue: ownership API must check that the shared data was allocated before using the shared ownership lock(relevant when no port was created).

Matan Azrad (7):
  ethdev: fix port data reset timing
  ethdev: fix used portid allocation
  ethdev: add port ownership
  ethdev: synchronize port allocation
  net/failsafe: free an eth port by a dedicated API
  net/failsafe: use ownership mechanism to own ports
  app/testpmd: adjust ethdev port ownership

 app/test-pmd/cmdline.c                  |  89 +++++------
 app/test-pmd/cmdline_flow.c             |   2 +-
 app/test-pmd/config.c                   |  37 ++---
 app/test-pmd/parameters.c               |   4 +-
 app/test-pmd/testpmd.c                  |  63 +++++---
 app/test-pmd/testpmd.h                  |   3 +
 doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
 drivers/net/failsafe/failsafe.c         |   7 +
 drivers/net/failsafe/failsafe_eal.c     |  16 ++
 drivers/net/failsafe/failsafe_ether.c   |   2 +-
 drivers/net/failsafe/failsafe_private.h |   2 +
 lib/librte_ether/rte_ethdev.c           | 267 +++++++++++++++++++++++++++-----
 lib/librte_ether/rte_ethdev.h           | 115 +++++++++++++-
 lib/librte_ether/rte_ethdev_core.h      |   2 +
 lib/librte_ether/rte_ethdev_version.map |   6 +
 15 files changed, 486 insertions(+), 143 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v5 1/7] ethdev: fix port data reset timing
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 2/7] ethdev: fix used portid allocation Matan Azrad
                           ` (6 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

The rte_eth_dev_data structure is allocated per ethdev port and is
used internally to get the data of the port.

rte_eth_dev_attach_secondary tries to find the port identifier by
comparing the rte_eth_dev_data name field and may get the identifier of
an invalid port if this port was released by the primary process,
because the port release API doesn't reset the port data.

So, it is better to reset the port data at release time instead of
at allocation time.

Move the port data reset to the port release API.
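
The stale lookup can be reproduced with a small model. This is a sketch under simplifying assumptions, not ethdev code: the demo_* names are hypothetical, a plain array stands in for the shared port data, and the secondary lookup is reduced to the name comparison.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NAME_LEN 32

/* Hypothetical stand-in for the shared per-port data. */
struct demo_port_data {
	char name[DEMO_NAME_LEN];
};

static struct demo_port_data demo_data[DEMO_MAX_PORTS];

/* Primary side: allocate a port by writing its name into shared data. */
static void demo_allocate(int port_id, const char *name)
{
	assert(port_id >= 0 && port_id < DEMO_MAX_PORTS);
	strncpy(demo_data[port_id].name, name, DEMO_NAME_LEN - 1);
	demo_data[port_id].name[DEMO_NAME_LEN - 1] = '\0';
}

/* Primary side: release a port; reset_on_release selects old/new behavior. */
static void demo_release(int port_id, int reset_on_release)
{
	assert(port_id >= 0 && port_id < DEMO_MAX_PORTS);
	if (reset_on_release)
		memset(&demo_data[port_id], 0, sizeof(demo_data[port_id]));
}

/* Secondary side: look a port up by name; returns -1 when not found. */
static int demo_attach_secondary(const char *name)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (strcmp(demo_data[i].name, name) == 0)
			return i;
	return -1;
}

/* Port id a secondary finds after the primary has released "net0". */
int demo_lookup_after_release(int reset_on_release)
{
	memset(demo_data, 0, sizeof(demo_data));
	demo_allocate(1, "net0");
	demo_release(1, reset_on_release);
	return demo_attach_secondary("net0");
}
```

Without the reset at release time the lookup still returns the released port id; with the reset it reports that no such port exists.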

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f285ba2..c469bd4 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -261,7 +261,6 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
 	eth_dev = eth_dev_get(port_id);
 	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
 	eth_dev->data->port_id = port_id;
@@ -309,6 +308,7 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v5 2/7] ethdev: fix used portid allocation
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 1/7] ethdev: fix port data reset timing Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 3/7] ethdev: add port ownership Matan Azrad
                           ` (5 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

rte_eth_dev_find_free_port() found a free port by checking its state.
The state field is in process-local memory, so other DPDK processes
may get the same port ID because their local states may be different.

Replace the state check with a check of the ethdev port name, which is
shared: if the name is an empty string, the port ID is detected as
unused.
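
The difference between the two checks can be shown with a small model. It is only a sketch: the demo_* names are hypothetical, one array plays the role of the shared name fields, and each struct demo_process holds the process-local state view.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_PORTS 4
#define DEMO_NAME_LEN 32

enum demo_state { DEMO_UNUSED = 0, DEMO_ATTACHED };

/* Hypothetical model: names are shared, states are per process. */
static char demo_shared_name[DEMO_MAX_PORTS][DEMO_NAME_LEN];

struct demo_process {
	enum demo_state state[DEMO_MAX_PORTS]; /* process-local view */
};

/* Old rule: a port is free if this process's local state says so. */
static int demo_find_free_by_state(const struct demo_process *p)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (p->state[i] == DEMO_UNUSED)
			return i;
	return DEMO_MAX_PORTS;
}

/* New rule: a port is free if its shared name is the empty string. */
static int demo_find_free_by_name(void)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (demo_shared_name[i][0] == '\0')
			return i;
	return DEMO_MAX_PORTS;
}

/*
 * The primary allocates port 0, then a second process asks for a free port.
 * Returns the id chosen by the second process under the selected rule.
 */
int demo_second_process_choice(int use_shared_name)
{
	struct demo_process primary = { { DEMO_UNUSED } };
	struct demo_process secondary = { { DEMO_UNUSED } };
	int id;

	memset(demo_shared_name, 0, sizeof(demo_shared_name));

	id = demo_find_free_by_name();
	assert(id == 0);
	primary.state[id] = DEMO_ATTACHED;
	strcpy(demo_shared_name[id], "net0");
	/* The primary's own view is consistent with the shared names. */
	assert(demo_find_free_by_state(&primary) == 1);

	return use_shared_name ? demo_find_free_by_name()
			       : demo_find_free_by_state(&secondary);
}
```

With the state check the second process picks port 0 again, colliding with the primary; with the name check it picks port 1.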

Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
Cc: stable@dpdk.org

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 lib/librte_ether/rte_ethdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index c469bd4..e348b63 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -221,8 +221,12 @@ struct rte_eth_dev *
 	unsigned i;
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (rte_eth_devices[i].state == RTE_ETH_DEV_UNUSED)
+		/* Using shared name field to find a free port. */
+		if (rte_eth_dev_data[i].name[0] == '\0') {
+			RTE_ASSERT(rte_eth_devices[i].state ==
+				   RTE_ETH_DEV_UNUSED);
 			return i;
+		}
 	}
 	return RTE_MAX_ETHPORTS;
 }
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* [dpdk-dev] [PATCH v5 3/7] ethdev: add port ownership
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 1/7] ethdev: fix port data reset timing Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 2/7] ethdev: fix used portid allocation Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 4/7] ethdev: synchronize port allocation Matan Azrad
                           ` (4 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

The ownership of a port is implicit in DPDK.
Making it explicit is better for the following reasons:
1. It clearly defines who is in charge of the port usage synchronization.
2. A library could work on top of a port.
3. A port can work on top of another port.

Also in the fail-safe case, an issue has been encountered in testpmd.
We need to check that the application is not trying to use a port which
is already managed by fail-safe.

A port owner is made of an owner id (number) and an owner name (string).
The owner id must be unique, to distinguish between two identical entity
instances, while the owner name can be any name.
The name helps different DPDK entities to logically recognize the owner
and eases debugging.
Each DPDK entity can allocate a unique owner identifier and can use it
with its preferred name to own valid ethdev ports.
Each DPDK entity can get the owner status of any port to decide whether
it can manage the port or not.

The mechanism is synchronized for both the primary process threads and
the secondary process threads, to allow a secondary process entity to be
a port owner.

Add a synchronized ownership mechanism to DPDK Ethernet devices to
avoid multiple management of a device by different DPDK entities.

The current ethdev internal port management is not affected by this
feature.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
 doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
 lib/librte_ether/rte_ethdev.c           | 233 ++++++++++++++++++++++++++++----
 lib/librte_ether/rte_ethdev.h           | 115 +++++++++++++++-
 lib/librte_ether/rte_ethdev_core.h      |   2 +
 lib/librte_ether/rte_ethdev_version.map |   6 +
 5 files changed, 333 insertions(+), 37 deletions(-)

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index d1d4b1c..d513ee3 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -156,8 +156,8 @@ concurrently on the same tx queue without SW lock. This PMD feature found in som
 
 See `Hardware Offload`_ for ``DEV_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
 
-Device Identification and Configuration
----------------------------------------
+Device Identification, Ownership and Configuration
+--------------------------------------------------
 
 Device Identification
 ~~~~~~~~~~~~~~~~~~~~~
@@ -171,6 +171,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+The Ethernet device ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
+The ownership mechanism is controlled by ethdev APIs and allows DPDK entities to set/remove/get a port owner.
+This prevents an Ethernet port from being managed by several entities at once.
+
+.. note::
+
+    It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e348b63..3d2b90c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -43,7 +43,6 @@
 
 static const char *MZ_RTE_ETH_DEV_DATA = "rte_eth_dev_data";
 struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS];
-static struct rte_eth_dev_data *rte_eth_dev_data;
 static uint8_t eth_dev_last_created_port;
 
 /* spinlock for eth device callbacks */
@@ -55,12 +54,22 @@
 /* spinlock for add/remove tx callbacks */
 static rte_spinlock_t rte_eth_tx_cb_lock = RTE_SPINLOCK_INITIALIZER;
 
+/* spinlock for shared data allocation */
+static rte_spinlock_t rte_eth_shared_data_lock = RTE_SPINLOCK_INITIALIZER;
+
 /* store statistics names and its offset in stats structure  */
 struct rte_eth_xstats_name_off {
 	char name[RTE_ETH_XSTATS_NAME_SIZE];
 	unsigned offset;
 };
 
+/* Shared memory between primary and secondary processes. */
+static struct {
+	uint64_t next_owner_id;
+	rte_spinlock_t ownership_lock;
+	struct rte_eth_dev_data data[RTE_MAX_ETHPORTS];
+} *rte_eth_dev_shared_data;
+
 static const struct rte_eth_xstats_name_off rte_stats_strings[] = {
 	{"rx_good_packets", offsetof(struct rte_eth_stats, ipackets)},
 	{"tx_good_packets", offsetof(struct rte_eth_stats, opackets)},
@@ -182,24 +191,35 @@ enum {
 }
 
 static void
-rte_eth_dev_data_alloc(void)
+rte_eth_dev_shared_data_prepare(void)
 {
 	const unsigned flags = 0;
 	const struct rte_memzone *mz;
 
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
-		mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data),
-				rte_socket_id(), flags);
-	} else
-		mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
-	if (mz == NULL)
-		rte_panic("Cannot allocate memzone for ethernet port data\n");
+	rte_spinlock_lock(&rte_eth_shared_data_lock);
+
+	if (rte_eth_dev_shared_data == NULL) {
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			/* Allocate port data and ownership shared memory. */
+			mz = rte_memzone_reserve(MZ_RTE_ETH_DEV_DATA,
+					sizeof(*rte_eth_dev_shared_data),
+					rte_socket_id(), flags);
+		} else
+			mz = rte_memzone_lookup(MZ_RTE_ETH_DEV_DATA);
+		if (mz == NULL)
+			rte_panic("Cannot allocate ethdev shared data\n");
+
+		rte_eth_dev_shared_data = mz->addr;
+		if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+			rte_eth_dev_shared_data->next_owner_id =
+					RTE_ETH_DEV_NO_OWNER + 1;
+			rte_spinlock_init(&rte_eth_dev_shared_data->ownership_lock);
+			memset(rte_eth_dev_shared_data->data, 0,
+			       sizeof(rte_eth_dev_shared_data->data));
+		}
+	}
 
-	rte_eth_dev_data = mz->addr;
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		memset(rte_eth_dev_data, 0,
-				RTE_MAX_ETHPORTS * sizeof(*rte_eth_dev_data));
+	rte_spinlock_unlock(&rte_eth_shared_data_lock);
 }
 
 struct rte_eth_dev *
@@ -222,7 +242,7 @@ struct rte_eth_dev *
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		/* Using shared name field to find a free port. */
-		if (rte_eth_dev_data[i].name[0] == '\0') {
+		if (rte_eth_dev_shared_data->data[i].name[0] == '\0') {
 			RTE_ASSERT(rte_eth_devices[i].state ==
 				   RTE_ETH_DEV_UNUSED);
 			return i;
@@ -236,7 +256,7 @@ struct rte_eth_dev *
 {
 	struct rte_eth_dev *eth_dev = &rte_eth_devices[port_id];
 
-	eth_dev->data = &rte_eth_dev_data[port_id];
+	eth_dev->data = &rte_eth_dev_shared_data->data[port_id];
 	eth_dev->state = RTE_ETH_DEV_ATTACHED;
 
 	eth_dev_last_created_port = port_id;
@@ -256,8 +276,7 @@ struct rte_eth_dev *
 		return NULL;
 	}
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	rte_eth_dev_shared_data_prepare();
 
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
@@ -286,11 +305,10 @@ struct rte_eth_dev *
 	uint16_t i;
 	struct rte_eth_dev *eth_dev;
 
-	if (rte_eth_dev_data == NULL)
-		rte_eth_dev_data_alloc();
+	rte_eth_dev_shared_data_prepare();
 
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
-		if (strcmp(rte_eth_dev_data[i].name, name) == 0)
+		if (strcmp(rte_eth_dev_shared_data->data[i].name, name) == 0)
 			break;
 	}
 	if (i == RTE_MAX_ETHPORTS) {
@@ -312,9 +330,16 @@ struct rte_eth_dev *
 	if (eth_dev == NULL)
 		return -EINVAL;
 
-	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
 	eth_dev->state = RTE_ETH_DEV_UNUSED;
 
+	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
 	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_DESTROY, NULL);
 
 	return 0;
@@ -330,6 +355,154 @@ struct rte_eth_dev *
 		return 1;
 }
 
+static int
+rte_eth_is_valid_owner_id(uint64_t owner_id)
+{
+	if (owner_id == RTE_ETH_DEV_NO_OWNER ||
+	    rte_eth_dev_shared_data->next_owner_id <= owner_id) {
+		RTE_PMD_DEBUG_TRACE("Invalid owner_id=%016lX.\n", owner_id);
+		return 0;
+	}
+	return 1;
+}
+
+uint64_t
+rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id)
+{
+	while (port_id < RTE_MAX_ETHPORTS &&
+	       ((rte_eth_devices[port_id].state != RTE_ETH_DEV_ATTACHED &&
+	       rte_eth_devices[port_id].state != RTE_ETH_DEV_REMOVED) ||
+	       rte_eth_devices[port_id].data->owner.id != owner_id))
+		port_id++;
+
+	if (port_id >= RTE_MAX_ETHPORTS)
+		return RTE_MAX_ETHPORTS;
+
+	return port_id;
+}
+
+int
+rte_eth_dev_owner_new(uint64_t *owner_id)
+{
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	*owner_id = rte_eth_dev_shared_data->next_owner_id++;
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return 0;
+}
+
+static int
+_rte_eth_dev_owner_set(const uint16_t port_id, const uint64_t old_owner_id,
+		       const struct rte_eth_dev_owner *new_owner)
+{
+	struct rte_eth_dev_owner *port_owner;
+	int sret;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+	if (!rte_eth_is_valid_owner_id(new_owner->id) &&
+	    !rte_eth_is_valid_owner_id(old_owner_id))
+		return -EINVAL;
+
+	port_owner = &rte_eth_devices[port_id].data->owner;
+	if (port_owner->id != old_owner_id) {
+		RTE_PMD_DEBUG_TRACE("Cannot set owner to port %d already owned"
+				    " by %s_%016lX.\n", port_id,
+				    port_owner->name, port_owner->id);
+		return -EPERM;
+	}
+
+	sret = snprintf(port_owner->name, RTE_ETH_MAX_OWNER_NAME_LEN, "%s",
+			new_owner->name);
+	if (sret < 0 || sret >= RTE_ETH_MAX_OWNER_NAME_LEN)
+		RTE_PMD_DEBUG_TRACE("Port %d owner name was truncated.\n",
+				    port_id);
+
+	port_owner->id = new_owner->id;
+
+	RTE_PMD_DEBUG_TRACE("Port %d owner is %s_%016lX.\n", port_id,
+			    new_owner->name, new_owner->id);
+
+	return 0;
+}
+
+int
+rte_eth_dev_owner_set(const uint16_t port_id,
+		      const struct rte_eth_dev_owner *owner)
+{
+	int ret;
+
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, RTE_ETH_DEV_NO_OWNER, owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
+int
+rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id)
+{
+	const struct rte_eth_dev_owner new_owner = (struct rte_eth_dev_owner)
+			{.id = RTE_ETH_DEV_NO_OWNER, .name = ""};
+	int ret;
+
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	ret = _rte_eth_dev_owner_set(port_id, owner_id, &new_owner);
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
+void
+rte_eth_dev_owner_delete(const uint64_t owner_id)
+{
+	uint16_t port_id;
+
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (rte_eth_is_valid_owner_id(owner_id)) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, owner_id)
+			memset(&rte_eth_devices[port_id].data->owner, 0,
+			       sizeof(struct rte_eth_dev_owner));
+		RTE_PMD_DEBUG_TRACE("All port owners owned by %016lX identifier"
+				    " have been removed.\n", owner_id);
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+}
+
+int
+rte_eth_dev_owner_get(const uint16_t port_id, struct rte_eth_dev_owner *owner)
+{
+	int ret = 0;
+
+	rte_eth_dev_shared_data_prepare();
+
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (!rte_eth_dev_is_valid_port(port_id)) {
+		RTE_PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+		ret = -ENODEV;
+	} else {
+		rte_memcpy(owner, &rte_eth_devices[port_id].data->owner,
+			   sizeof(*owner));
+	}
+
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+	return ret;
+}
+
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
@@ -372,7 +545,7 @@ struct rte_eth_dev *
 
 	/* shouldn't check 'rte_eth_devices[i].data',
 	 * because it might be overwritten by VDEV PMD */
-	tmp = rte_eth_dev_data[port_id].name;
+	tmp = rte_eth_dev_shared_data->data[port_id].name;
 	strcpy(name, tmp);
 	return 0;
 }
@@ -380,22 +553,22 @@ struct rte_eth_dev *
 int
 rte_eth_dev_get_port_by_name(const char *name, uint16_t *port_id)
 {
-	int i;
+	uint32_t pid;
 
 	if (name == NULL) {
 		RTE_PMD_DEBUG_TRACE("Null pointer is specified\n");
 		return -EINVAL;
 	}
 
-	RTE_ETH_FOREACH_DEV(i) {
-		if (!strncmp(name,
-			rte_eth_dev_data[i].name, strlen(name))) {
-
-			*port_id = i;
-
+	for (pid = 0; pid < RTE_MAX_ETHPORTS; pid++) {
+		if (rte_eth_devices[pid].state != RTE_ETH_DEV_UNUSED &&
+		    !strncmp(name, rte_eth_dev_shared_data->data[pid].name,
+			     strlen(name))) {
+			*port_id = pid;
 			return 0;
 		}
 	}
+
 	return -ENODEV;
 }
 
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ccf4a15..db3199f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1221,6 +1221,15 @@ struct rte_eth_dev_sriov {
 
 #define RTE_ETH_NAME_MAX_LEN RTE_DEV_NAME_MAX_LEN
 
+#define RTE_ETH_DEV_NO_OWNER 0
+
+#define RTE_ETH_MAX_OWNER_NAME_LEN 64
+
+struct rte_eth_dev_owner {
+	uint64_t id; /**< The owner unique identifier. */
+	char name[RTE_ETH_MAX_OWNER_NAME_LEN]; /**< The owner name. */
+};
+
 /** Device supports link state interrupt */
 #define RTE_ETH_DEV_INTR_LSC     0x0002
 /** Device is a bonded slave */
@@ -1229,6 +1238,30 @@ struct rte_eth_dev_sriov {
 #define RTE_ETH_DEV_INTR_RMV     0x0008
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Iterates over valid ethdev ports owned by a specific owner.
+ *
+ * @param port_id
+ *   The id of the next possible valid owned port.
+ * @param	owner_id
+ *  The owner identifier.
+ *  RTE_ETH_DEV_NO_OWNER means iterate over all valid ownerless ports.
+ * @return
+ *   Next valid port id owned by owner_id, RTE_MAX_ETHPORTS if there is none.
+ */
+uint64_t rte_eth_find_next_owned_by(uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * Macro to iterate over all enabled ethdev ports owned by a specific owner.
+ */
+#define RTE_ETH_FOREACH_DEV_OWNED_BY(p, o) \
+	for (p = rte_eth_find_next_owned_by(0, o); \
+	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS; \
+	     p = rte_eth_find_next_owned_by(p + 1, o))
+
+/**
  * Iterates over valid ethdev ports.
  *
  * @param port_id
@@ -1239,12 +1272,84 @@ struct rte_eth_dev_sriov {
 uint16_t rte_eth_find_next(uint16_t port_id);
 
 /**
- * Macro to iterate over all enabled ethdev ports.
+ * Macro to iterate over all enabled and ownerless ethdev ports.
+ */
+#define RTE_ETH_FOREACH_DEV(p) \
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, RTE_ETH_DEV_NO_OWNER)
+
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get a new unique owner identifier.
+ * An owner identifier is used to own Ethernet devices by only one DPDK entity
+ * to avoid multiple management of device by different entities.
+ *
+ * @param	owner_id
+ *   Owner identifier pointer.
+ * @return
+ *   Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_new(uint64_t *owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set an Ethernet device owner.
+ *
+ * @param	port_id
+ *  The identifier of the port to own.
+ * @param	owner
+ *  The owner pointer.
+ * @return
+ *  Negative errno value on error, 0 on success.
+ */
+int rte_eth_dev_owner_set(const uint16_t port_id,
+			  const struct rte_eth_dev_owner *owner);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Unset Ethernet device owner to make the device ownerless.
+ *
+ * @param	port_id
+ *  The identifier of port to make ownerless.
+ * @param	owner_id
+ *  The owner identifier.
+ * @return
+ *  0 on success, negative errno value on error.
+ */
+int rte_eth_dev_owner_unset(const uint16_t port_id, const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Remove owner from all Ethernet devices owned by a specific owner.
+ *
+ * @param	owner_id
+ *  The owner identifier.
+ */
+void rte_eth_dev_owner_delete(const uint64_t owner_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the owner of an Ethernet device.
+ *
+ * @param	port_id
+ *  The port identifier.
+ * @param	owner
+ *  The owner structure pointer to fill.
+ * @return
+ *  0 on success, negative errno value on error.
  */
-#define RTE_ETH_FOREACH_DEV(p)					\
-	for (p = rte_eth_find_next(0);				\
-	     (unsigned int)p < (unsigned int)RTE_MAX_ETHPORTS;	\
-	     p = rte_eth_find_next(p + 1))
+int rte_eth_dev_owner_get(const uint16_t port_id,
+			  struct rte_eth_dev_owner *owner);
 
 /**
  * Get the total number of Ethernet devices that have been successfully
diff --git a/lib/librte_ether/rte_ethdev_core.h b/lib/librte_ether/rte_ethdev_core.h
index f44b40e..f9fccc4 100644
--- a/lib/librte_ether/rte_ethdev_core.h
+++ b/lib/librte_ether/rte_ethdev_core.h
@@ -544,6 +544,7 @@ struct rte_eth_dev {
 } __rte_cache_aligned;
 
 struct rte_eth_dev_sriov;
+struct rte_eth_dev_owner;
 
 /**
  * @internal
@@ -595,6 +596,7 @@ struct rte_eth_dev_data {
 	int numa_node;  /**< NUMA node connection */
 	struct rte_vlan_filter_conf vlan_filter_conf;
 	/**< VLAN filter configuration. */
+	struct rte_eth_dev_owner owner; /**< The port owner. */
 };
 
 /**
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index 5c61563..a5cb372 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -209,8 +209,14 @@ EXPERIMENTAL {
 	global:
 
 	rte_eth_dev_is_removed;
+	rte_eth_dev_owner_delete;
+	rte_eth_dev_owner_get;
+	rte_eth_dev_owner_new;
+	rte_eth_dev_owner_set;
+	rte_eth_dev_owner_unset;
 	rte_eth_dev_rx_offload_name;
 	rte_eth_dev_tx_offload_name;
+	rte_eth_find_next_owned_by;
 	rte_mtr_capabilities_get;
 	rte_mtr_create;
 	rte_mtr_destroy;
-- 
1.8.3.1

* [dpdk-dev] [PATCH v5 4/7] ethdev: synchronize port allocation
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
                           ` (2 preceding siblings ...)
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 3/7] ethdev: add port ownership Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
                           ` (3 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Ethernet port allocation was not thread safe: two threads that tried
to allocate a new port at the same time might get an identical port
identifier, which caused a memory overwrite.
In fact, no port configuration was thread safe from the ethdev point
of view.

The port ownership mechanism added to ethdev is a good opportunity to
redefine the synchronization rules in ethdev:

1. The port allocation and port release synchronization will be
   managed by ethdev.
2. The port usage synchronization will be managed by the port owner.
3. The port ownership synchronization will be managed by ethdev.

Add port allocation synchronization to complete the new rules.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---
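Note: the locking shape used in rte_eth_dev_allocate() can be sketched
in isolation as below. This is only a minimal illustration, not the
ethdev code: it assumes a plain pthread mutex instead of the spinlock in
shared data, and every demo_* name and DEMO_* constant is hypothetical.
The lock is taken before the free-slot lookup, every failure path leaves
through a single unlock label, and the "new port" notification runs only
after the lock has been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_PORTS 4   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define DEMO_NAME_LEN  32

struct demo_port {
	int used;
	char name[DEMO_NAME_LEN];
};

static struct demo_port demo_ports[DEMO_MAX_PORTS];
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int demo_new_events; /* counts "new port" notifications */

/* Caller must hold demo_lock. Returns DEMO_MAX_PORTS when full. */
static int demo_find_free_port(void)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (!demo_ports[i].used)
			return i;
	return DEMO_MAX_PORTS;
}

/* Caller must hold demo_lock. */
static int demo_name_in_use(const char *name)
{
	int i;

	for (i = 0; i < DEMO_MAX_PORTS; i++)
		if (demo_ports[i].used &&
		    strcmp(demo_ports[i].name, name) == 0)
			return 1;
	return 0;
}

/* Returns the allocated port id, or -1 on failure. */
int demo_port_allocate(const char *name)
{
	int ret = -1;
	int port_id;

	pthread_mutex_lock(&demo_lock);

	port_id = demo_find_free_port();
	if (port_id == DEMO_MAX_PORTS)
		goto unlock;
	if (demo_name_in_use(name))
		goto unlock;

	demo_ports[port_id].used = 1;
	snprintf(demo_ports[port_id].name, DEMO_NAME_LEN, "%s", name);
	ret = port_id;

unlock:
	pthread_mutex_unlock(&demo_lock);

	/* Notify outside the lock so a callback can never deadlock on it. */
	if (ret >= 0)
		atomic_fetch_add(&demo_new_events, 1);

	return ret;
}
```

In this sketch the lookup, the duplicate-name check and the claim of the
slot all happen under one lock, so two concurrent callers cannot be
handed the same slot.
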
 lib/librte_ether/rte_ethdev.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 3d2b90c..19650ae 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -268,20 +268,23 @@ struct rte_eth_dev *
 rte_eth_dev_allocate(const char *name)
 {
 	uint16_t port_id;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
+
+	rte_eth_dev_shared_data_prepare();
+
+	/* Synchronize port creation between primary and secondary threads. */
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
 
 	port_id = rte_eth_dev_find_free_port();
 	if (port_id == RTE_MAX_ETHPORTS) {
 		RTE_PMD_DEBUG_TRACE("Reached maximum number of Ethernet ports\n");
-		return NULL;
+		goto unlock;
 	}
 
-	rte_eth_dev_shared_data_prepare();
-
 	if (rte_eth_dev_allocated(name) != NULL) {
 		RTE_PMD_DEBUG_TRACE("Ethernet Device with name %s already allocated!\n",
 				name);
-		return NULL;
+		goto unlock;
 	}
 
 	eth_dev = eth_dev_get(port_id);
@@ -289,7 +292,11 @@ struct rte_eth_dev *
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = ETHER_MTU;
 
-	_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
+unlock:
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
+
+	if (eth_dev != NULL)
+		_rte_eth_dev_callback_process(eth_dev, RTE_ETH_EVENT_NEW, NULL);
 
 	return eth_dev;
 }
@@ -303,10 +310,13 @@ struct rte_eth_dev *
 rte_eth_dev_attach_secondary(const char *name)
 {
 	uint16_t i;
-	struct rte_eth_dev *eth_dev;
+	struct rte_eth_dev *eth_dev = NULL;
 
 	rte_eth_dev_shared_data_prepare();
 
+	/* Synchronize port attachment to primary port creation and release. */
+	rte_spinlock_lock(&rte_eth_dev_shared_data->ownership_lock);
+
 	for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
 		if (strcmp(rte_eth_dev_shared_data->data[i].name, name) == 0)
 			break;
@@ -315,12 +325,12 @@ struct rte_eth_dev *
 		RTE_PMD_DEBUG_TRACE(
 			"device %s is not driven by the primary process\n",
 			name);
-		return NULL;
+	} else {
+		eth_dev = eth_dev_get(i);
+		RTE_ASSERT(eth_dev->data->port_id == i);
 	}
 
-	eth_dev = eth_dev_get(i);
-	RTE_ASSERT(eth_dev->data->port_id == i);
-
+	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
 	return eth_dev;
 }
 
-- 
1.8.3.1

* [dpdk-dev] [PATCH v5 5/7] net/failsafe: free an eth port by a dedicated API
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
                           ` (3 preceding siblings ...)
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 4/7] ethdev: synchronize port allocation Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
                           ` (2 subsequent siblings)
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Call the dedicated ethdev API to free a port at removal time, as is
already done elsewhere in the fail-safe PMD.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
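Note: a rough sketch of why a release helper is preferable to writing
the state field directly. It assumes, for illustration only, that a port
carries more per-port data than its state (here a hypothetical owner id
and name) and that a release has to reset all of it under a lock. The
demo_* names are hypothetical and a pthread mutex stands in for the
ethdev lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

enum demo_state { DEMO_DEV_UNUSED = 0, DEMO_DEV_ATTACHED };

struct demo_port {
	enum demo_state state;
	uint64_t owner_id;   /* 0 means "no owner" in this sketch */
	char name[32];
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Reset the whole port in one place. Zeroing the structure sets the
 * state to DEMO_DEV_UNUSED (0) and clears the owner and the name, so a
 * later allocation of this slot does not inherit stale data.
 */
void demo_release_port(struct demo_port *port)
{
	pthread_mutex_lock(&demo_lock);
	memset(port, 0, sizeof(*port));
	pthread_mutex_unlock(&demo_lock);
}
```

A direct "port->state = DEMO_DEV_UNUSED" would leave owner_id and name
untouched in this sketch, which is the kind of leftover a single release
entry point avoids.
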
 drivers/net/failsafe/failsafe_ether.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..e9b0cfe 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -297,7 +297,7 @@
 			ERROR("Bus detach failed for sub_device %u",
 			      SUB_ID(sdev));
 		} else {
-			ETH(sdev)->state = RTE_ETH_DEV_UNUSED;
+			rte_eth_dev_release_port(ETH(sdev));
 		}
 		sdev->state = DEV_PARSED;
 		/* fallthrough */
-- 
1.8.3.1

* [dpdk-dev] [PATCH v5 6/7] net/failsafe: use ownership mechanism to own ports
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
                           ` (4 preceding siblings ...)
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
  2018-01-29 11:21         ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  7 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Fail-safe PMD sub-device management is based on the ethdev port
mechanism, so the sub-device management structures are exposed to
other DPDK entities, which may use them in parallel with the fail-safe
PMD.

Use the new port ownership mechanism to avoid multiple management of
fail-safe PMD sub-devices.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
---
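Note: the "take ownership, then re-check the device name" sequence added
to failsafe_eal.c can be sketched on its own as below. This is a hedged
illustration over a mock port table, not the ethdev implementation: all
demo_* names, the table contents and the -EAGAIN return value are
assumptions made for the example. The name is compared after ownership
is set because the port id may have been released and reallocated to
another device in between.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NO_OWNER 0
#define DEMO_NB_PORTS 3

struct demo_port {
	char dev_name[32];
	uint64_t owner_id;
};

/* Mock port table: port 1 is already owned by someone else (id 7). */
static struct demo_port demo_ports[DEMO_NB_PORTS] = {
	{ "0000:03:00.0", DEMO_NO_OWNER },
	{ "0000:04:00.0", 7 },
	{ "0000:05:00.0", DEMO_NO_OWNER },
};

/* Returns 0 on success, a negative errno value on error. */
static int demo_owner_set(unsigned int pid, uint64_t owner_id)
{
	if (pid >= DEMO_NB_PORTS)
		return -ENODEV;
	if (demo_ports[pid].owner_id != DEMO_NO_OWNER)
		return -EPERM;
	demo_ports[pid].owner_id = owner_id;
	return 0;
}

/* Returns 0 on success, a negative errno value on error. */
static int demo_owner_unset(unsigned int pid, uint64_t owner_id)
{
	if (pid >= DEMO_NB_PORTS)
		return -ENODEV;
	if (demo_ports[pid].owner_id != owner_id)
		return -EPERM;
	demo_ports[pid].owner_id = DEMO_NO_OWNER;
	return 0;
}

/* Claim a port, then verify it still is the device we expected. */
int demo_claim_sub_device(unsigned int pid, uint64_t owner_id,
			  const char *expected)
{
	int ret = demo_owner_set(pid, owner_id);

	if (ret != 0) {
		/* strerror() expects a positive errno value. */
		fprintf(stderr, "owner set failed (%s), retry later\n",
			strerror(-ret));
		return ret;
	}
	if (strncmp(demo_ports[pid].dev_name, expected,
		    strlen(expected)) != 0) {
		/* The port id now belongs to another device: back off. */
		demo_owner_unset(pid, owner_id);
		return -EAGAIN;
	}
	return 0;
}
```

The ownership is dropped again on a name mismatch so that the port is
left ownerless for whichever entity it really belongs to.
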
 drivers/net/failsafe/failsafe.c         |  7 +++++++
 drivers/net/failsafe/failsafe_eal.c     | 16 ++++++++++++++++
 drivers/net/failsafe/failsafe_private.h |  2 ++
 3 files changed, 25 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..e05afbf 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -196,6 +196,13 @@
 	ret = failsafe_args_parse(dev, params);
 	if (ret)
 		goto free_subs;
+	ret = rte_eth_dev_owner_new(&priv->my_owner.id);
+	if (ret) {
+		ERROR("Failed to get unique owner identifier");
+		goto free_args;
+	}
+	snprintf(priv->my_owner.name, sizeof(priv->my_owner.name),
+		 FAILSAFE_OWNER_NAME);
 	ret = failsafe_eal_init(dev);
 	if (ret)
 		goto free_args;
diff --git a/drivers/net/failsafe/failsafe_eal.c b/drivers/net/failsafe/failsafe_eal.c
index 33a5adf..3994661 100644
--- a/drivers/net/failsafe/failsafe_eal.c
+++ b/drivers/net/failsafe/failsafe_eal.c
@@ -106,6 +106,22 @@
 			INFO("Taking control of a probed sub device"
 			      " %d named %s", i, da->name);
 		}
+		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
+		if (ret) {
+			INFO("sub_device %d owner set failed (%s),"
+			     " will try again later", i, strerror(-ret));
+			continue;
+		} else if (strncmp(rte_eth_devices[pid].device->name, da->name,
+			   strlen(da->name)) != 0) {
+			/*
+			 * The device probably was removed and its port id was
+			 * reallocated before ownership set.
+			 */
+			rte_eth_dev_owner_unset(pid, PRIV(dev)->my_owner.id);
+			INFO("sub_device %d was probably removed before taking"
+			     " ownership, will try again later", i);
+			continue;
+		}
 		ETH(sdev) = &rte_eth_devices[pid];
 		SUB_ID(sdev) = i;
 		sdev->fs_dev = dev;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 7754248..ef0c9df 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -42,6 +42,7 @@
 #include <rte_devargs.h>
 
 #define FAILSAFE_DRIVER_NAME "Fail-safe PMD"
+#define FAILSAFE_OWNER_NAME "Fail-safe"
 
 #define PMD_FAILSAFE_MAC_KVARG "mac"
 #define PMD_FAILSAFE_HOTPLUG_POLL_KVARG "hotplug_poll"
@@ -145,6 +146,7 @@ struct fs_priv {
 	uint32_t mac_addr_pool[FAILSAFE_MAX_ETHADDR];
 	/* current capabilities */
 	struct rte_eth_dev_info infos;
+	struct rte_eth_dev_owner my_owner; /* Unique owner. */
 	/*
 	 * Fail-safe state machine.
 	 * This level will be tracking state of the EAL and eth
-- 
1.8.3.1

* [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
                           ` (5 preceding siblings ...)
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
@ 2018-01-22 16:38         ` Matan Azrad
  2018-01-25  1:47           ` Lu, Wenzhuo
  2018-01-29 11:21         ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
  7 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-22 16:38 UTC (permalink / raw)
  To: Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

Testpmd should not use ethdev ports which are managed by other DPDK
entities.

Set Testpmd ownership on each port which is not used by another entity
and prevent any use of ethdev ports which are not owned by Testpmd.

Signed-off-by: Matan Azrad <matan@mellanox.com>
---
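Note: the two building blocks this patch relies on, iterating only over
owned ports and rejecting a port id that is not owned by the caller, can
be sketched over a mock table as below. The demo_* names, DEMO_*
constants and table contents are hypothetical and only mirror the shape
of RTE_ETH_FOREACH_DEV_OWNED_BY() and of the reworked
port_id_is_invalid(); this is not the ethdev implementation.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 8   /* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define DEMO_NO_OWNER  0

struct demo_port {
	int valid;
	uint64_t owner_id;
};

/* Mock table: ports 0 and 3 owned by id 5, port 1 ownerless, port 4 by 9. */
static struct demo_port demo_ports[DEMO_MAX_PORTS] = {
	[0] = { 1, 5 },
	[1] = { 1, DEMO_NO_OWNER },
	[3] = { 1, 5 },
	[4] = { 1, 9 },
};

/* Next valid port at or after port_id owned by owner_id, else the max. */
uint16_t demo_find_next_owned_by(uint16_t port_id, uint64_t owner_id)
{
	while (port_id < DEMO_MAX_PORTS &&
	       (!demo_ports[port_id].valid ||
		demo_ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id >= DEMO_MAX_PORTS ? DEMO_MAX_PORTS : port_id;
}

#define DEMO_FOREACH_OWNED_BY(p, o) \
	for (p = demo_find_next_owned_by(0, o); \
	     (unsigned int)p < (unsigned int)DEMO_MAX_PORTS; \
	     p = demo_find_next_owned_by(p + 1, o))

/* Returns 0 when the port is valid and owned by my_owner_id, else 1. */
int demo_port_id_is_invalid(uint16_t port_id, uint64_t my_owner_id)
{
	if (port_id >= DEMO_MAX_PORTS || !demo_ports[port_id].valid)
		return 1;
	return demo_ports[port_id].owner_id != my_owner_id;
}

/* Count the ports owned by owner_id using the iteration macro. */
unsigned int demo_count_owned(uint64_t owner_id)
{
	uint16_t p;
	unsigned int n = 0;

	DEMO_FOREACH_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

Passing DEMO_NO_OWNER as the owner id walks the ownerless ports only,
which matches the way RTE_ETH_FOREACH_DEV() is redefined by this series.
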
 app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
 app/test-pmd/cmdline_flow.c |  2 +-
 app/test-pmd/config.c       | 37 ++++++++++---------
 app/test-pmd/parameters.c   |  4 +-
 app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
 app/test-pmd/testpmd.h      |  3 ++
 6 files changed, 103 insertions(+), 95 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 9f12c0f..36dba6c 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
 			&link_speed) < 0)
 		return;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		ports[pid].dev_conf.link_speeds = link_speed;
 	}
 
@@ -1906,7 +1906,7 @@ struct cmd_config_rss {
 	struct cmd_config_rss *res = parsed_result;
 	struct rte_eth_rss_conf rss_conf = { .rss_key_len = 0, };
 	int diag;
-	uint8_t i;
+	uint16_t pid;
 
 	if (!strcmp(res->value, "all"))
 		rss_conf.rss_hf = ETH_RSS_IP | ETH_RSS_TCP |
@@ -1940,12 +1940,12 @@ struct cmd_config_rss {
 		return;
 	}
 	rss_conf.rss_key = NULL;
-	for (i = 0; i < rte_eth_dev_count(); i++) {
-		diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
+		diag = rte_eth_dev_rss_hash_update(pid, &rss_conf);
 		if (diag < 0)
 			printf("Configuration of RSS hash at ethernet port %d "
 				"failed with error (%d): %s.\n",
-				i, -diag, strerror(-diag));
+				pid, -diag, strerror(-diag));
 	}
 }
 
@@ -3690,10 +3690,9 @@ struct cmd_csum_result {
 	uint64_t csum_offloads = 0;
 	struct rte_eth_dev_info dev_info;
 
-	if (port_id_is_invalid(res->port_id, ENABLED_WARN)) {
-		printf("invalid port %d\n", res->port_id);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (!port_is_stopped(res->port_id)) {
 		printf("Please stop port %d first\n", res->port_id);
 		return;
@@ -4368,8 +4367,8 @@ struct cmd_gso_show_result {
 {
 	struct cmd_gso_show_result *res = parsed_result;
 
-	if (!rte_eth_dev_is_valid_port(res->cmd_pid)) {
-		printf("invalid port id %u\n", res->cmd_pid);
+	if (port_id_is_invalid(res->cmd_pid, ENABLED_WARN)) {
+		printf("invalid/not owned port id %u\n", res->cmd_pid);
 		return;
 	}
 	if (!strcmp(res->cmd_keyword, "gso")) {
@@ -5379,7 +5378,12 @@ static void cmd_create_bonded_device_parsed(void *parsed_result,
 				port_id);
 
 		/* Update number of ports */
-		nb_ports = rte_eth_dev_count();
+		if (rte_eth_dev_owner_set(port_id, &my_owner) != 0) {
+			printf("Error: cannot own new attached port %d\n",
+			       port_id);
+			return;
+		}
+		nb_ports++;
 		reconfig(port_id, res->socket);
 		rte_eth_promiscuous_enable(port_id);
 	}
@@ -5488,10 +5492,8 @@ static void cmd_set_bond_mon_period_parsed(void *parsed_result,
 	struct cmd_set_bond_mon_period_result *res = parsed_result;
 	int ret;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n", res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	ret = rte_eth_bond_link_monitoring_set(res->port_num, res->period_ms);
 
@@ -5549,11 +5551,8 @@ struct cmd_set_bonding_agg_mode_policy_result {
 	struct cmd_set_bonding_agg_mode_policy_result *res = parsed_result;
 	uint8_t policy = AGG_BANDWIDTH;
 
-	if (res->port_num >= nb_ports) {
-		printf("Port id %d must be less than %d\n",
-				res->port_num, nb_ports);
+	if (port_id_is_invalid(res->port_num, ENABLED_WARN))
 		return;
-	}
 
 	if (!strcmp(res->policy, "bandwidth"))
 		policy = AGG_BANDWIDTH;
@@ -5812,7 +5811,7 @@ static void cmd_set_promisc_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_promiscuous_enable(i);
 			else
@@ -5892,7 +5891,7 @@ static void cmd_set_allmulti_mode_parsed(void *parsed_result,
 
 	/* all ports */
 	if (allports) {
-		RTE_ETH_FOREACH_DEV(i) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 			if (enable)
 				rte_eth_allmulticast_enable(i);
 			else
@@ -6626,31 +6625,31 @@ static void cmd_showportall_parsed(void *parsed_result,
 	struct cmd_showportall_result *res = parsed_result;
 	if (!strcmp(res->show, "clear")) {
 		if (!strcmp(res->what, "stats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_stats_clear(i);
 		else if (!strcmp(res->what, "xstats"))
-			RTE_ETH_FOREACH_DEV(i)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 				nic_xstats_clear(i);
 	} else if (!strcmp(res->what, "info"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_infos_display(i);
 	else if (!strcmp(res->what, "stats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_display(i);
 	else if (!strcmp(res->what, "xstats"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_xstats_display(i);
 	else if (!strcmp(res->what, "fdir"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			fdir_get_infos(i);
 	else if (!strcmp(res->what, "stat_qmap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			nic_stats_mapping_display(i);
 	else if (!strcmp(res->what, "dcb_tc"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_dcb_info_display(i);
 	else if (!strcmp(res->what, "cap"))
-		RTE_ETH_FOREACH_DEV(i)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id)
 			port_offload_cap_display(i);
 }
 
@@ -10702,10 +10701,8 @@ struct cmd_flow_director_mask_result {
 	struct rte_eth_fdir_masks *mask;
 	struct rte_port *port;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -10903,10 +10900,8 @@ struct cmd_flow_director_flex_mask_result {
 	uint16_t i;
 	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11057,12 +11052,10 @@ struct cmd_flow_director_flexpayload_result {
 	struct cmd_flow_director_flexpayload_result *res = parsed_result;
 	struct rte_eth_flex_payload_cfg flex_cfg;
 	struct rte_port *port;
-	int ret = 0;
+	int ret;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	port = &ports[res->port_id];
 	/** Check if the port is not started **/
@@ -11778,7 +11771,7 @@ struct cmd_config_l2_tunnel_eth_type_result {
 	entry.l2_tunnel_type = str2fdir_l2_tunnel_type(res->l2_tunnel_type);
 	entry.ether_type = res->eth_type_val;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_eth_type_conf(pid, &entry);
 	}
 }
@@ -11894,7 +11887,7 @@ struct cmd_config_l2_tunnel_en_dis_result {
 	else
 		en = 0;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		rte_eth_dev_l2_tunnel_offload_set(pid,
 						  &entry,
 						  ETH_L2_TUNNEL_ENABLE_MASK,
@@ -14444,10 +14437,8 @@ struct cmd_ddp_add_result {
 	int file_num;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14526,10 +14517,8 @@ struct cmd_ddp_del_result {
 	uint32_t size;
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 	if (!all_ports_stopped()) {
 		printf("Please stop all ports first\n");
@@ -14841,10 +14830,8 @@ struct cmd_ddp_get_list_result {
 #endif
 	int ret = -ENOTSUP;
 
-	if (res->port_id > nb_ports) {
-		printf("Invalid port, range is [0, %d]\n", nb_ports - 1);
+	if (port_id_is_invalid(res->port_id, ENABLED_WARN))
 		return;
-	}
 
 #ifdef RTE_LIBRTE_I40E_PMD
 	size = PROFILE_INFO_SIZE * MAX_PROFILE_NUM + 4;
@@ -16300,7 +16287,7 @@ struct cmd_cmdfile_result {
 	if (id == (portid_t)RTE_PORT_ALL) {
 		portid_t pid;
 
-		RTE_ETH_FOREACH_DEV(pid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 			/* check if need_reconfig has been set to 1 */
 			if (ports[pid].need_reconfig == 0)
 				ports[pid].need_reconfig = dev;
diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 561e057..e55490f 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -2652,7 +2652,7 @@ static int comp_vc_action_rss_queue(struct context *, const struct token *,
 
 	(void)ctx;
 	(void)token;
-	RTE_ETH_FOREACH_DEV(p) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(p, my_owner.id) {
 		if (buf && i == ent)
 			return snprintf(buf, size, "%u", p);
 		++i;
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 957b820..43b9a7d 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -156,7 +156,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -236,7 +236,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -253,10 +253,9 @@ struct rss_type_info {
 	struct rte_eth_xstat_name *xstats_names;
 
 	printf("###### NIC extended statistics for port %-2d\n", port_id);
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Error: Invalid port number %i\n", port_id);
+
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
 
 	/* Get count */
 	cnt_xstats = rte_eth_xstats_get_names(port_id, NULL, 0);
@@ -321,7 +320,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -439,7 +438,7 @@ struct rss_type_info {
 
 	if (port_id_is_invalid(port_id, ENABLED_WARN)) {
 		printf("Valid port range is [0");
-		RTE_ETH_FOREACH_DEV(pid)
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 			printf(", %d", pid);
 		printf("]\n");
 		return;
@@ -756,10 +755,15 @@ struct rss_type_info {
 int
 port_id_is_invalid(portid_t port_id, enum print_warning warning)
 {
+	struct rte_eth_dev_owner owner;
+	int ret;
+
 	if (port_id == (portid_t)RTE_PORT_ALL)
 		return 0;
 
-	if (rte_eth_dev_is_valid_port(port_id))
+	ret = rte_eth_dev_owner_get(port_id, &owner);
+
+	if (ret == 0 && owner.id == my_owner.id)
 		return 0;
 
 	if (warning == ENABLED_WARN)
@@ -2373,7 +2377,7 @@ struct igb_ring_desc_16_bytes {
 		return;
 	}
 	nb_pt = 0;
-	RTE_ETH_FOREACH_DEV(i) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(i, my_owner.id) {
 		if (! ((uint64_t)(1ULL << i) & portmask))
 			continue;
 		portlist[nb_pt++] = i;
@@ -2512,10 +2516,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gro(const char *onoff, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (test_done == 0) {
 		printf("Before enable/disable GRO,"
 				" please stop forwarding first\n");
@@ -2574,10 +2577,9 @@ struct igb_ring_desc_16_bytes {
 
 	param = &gro_ports[port_id].param;
 
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("Invalid port id %u.\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (gro_ports[port_id].enable) {
 		printf("GRO type: TCP/IPv4\n");
 		if (gro_flush_cycles == GRO_DEFAULT_FLUSH_CYCLES) {
@@ -2595,10 +2597,9 @@ struct igb_ring_desc_16_bytes {
 void
 setup_gso(const char *mode, portid_t port_id)
 {
-	if (!rte_eth_dev_is_valid_port(port_id)) {
-		printf("invalid port id %u\n", port_id);
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
 		return;
-	}
+
 	if (strcmp(mode, "on") == 0) {
 		if (test_done == 0) {
 			printf("before enabling GSO,"
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index fd59071..37490ce 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -398,7 +398,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
@@ -459,7 +459,7 @@
 		if (port_id_is_invalid(port_id, ENABLED_WARN) ||
 			port_id == (portid_t)RTE_PORT_ALL) {
 			printf("Valid port range is [0");
-			RTE_ETH_FOREACH_DEV(pid)
+			RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)
 				printf(", %d", pid);
 			printf("]\n");
 			return -1;
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 5dc8cca..276c7eb 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -108,6 +108,11 @@
 lcoreid_t nb_lcores;           /**< Number of probed logical cores. */
 
 /*
+ * My port owner structure used to own Ethernet ports.
+ */
+struct rte_eth_dev_owner my_owner; /**< Unique owner. */
+
+/*
  * Test Forwarding Configuration.
  *    nb_fwd_lcores <= nb_cfg_lcores <= nb_lcores
  *    nb_fwd_ports  <= nb_cfg_ports  <= nb_ports
@@ -449,7 +454,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pt_id;
 	int i = 0;
 
-	RTE_ETH_FOREACH_DEV(pt_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id)
 		fwd_ports_ids[i++] = pt_id;
 
 	nb_cfg_ports = nb_ports;
@@ -665,7 +670,7 @@ static int eth_event_callback(portid_t port_id,
 		fwd_lcores[lc_id]->cpuid_idx = lc_id;
 	}
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		/* Apply default Tx configuration for all ports */
 		port->dev_conf.txmode = tx_mode;
@@ -798,7 +803,7 @@ static int eth_event_callback(portid_t port_id,
 	queueid_t q;
 
 	/* set socket id according to numa or not */
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		if (nb_rxq > port->dev_info.max_rx_queues) {
 			printf("Fail: nb_rxq(%d) is greater than "
@@ -1092,9 +1097,8 @@ static int eth_event_callback(portid_t port_id,
 	uint64_t tics_per_1sec;
 	uint64_t tics_datum;
 	uint64_t tics_current;
-	uint8_t idx_port, cnt_ports;
+	uint16_t idx_port;
 
-	cnt_ports = rte_eth_dev_count();
 	tics_datum = rte_rdtsc();
 	tics_per_1sec = rte_get_timer_hz();
 #endif
@@ -1109,11 +1113,10 @@ static int eth_event_callback(portid_t port_id,
 			tics_current = rte_rdtsc();
 			if (tics_current - tics_datum >= tics_per_1sec) {
 				/* Periodic bitrate calculation */
-				for (idx_port = 0;
-						idx_port < cnt_ports;
-						idx_port++)
+				RTE_ETH_FOREACH_DEV_OWNED_BY(idx_port,
+							     my_owner.id)
 					rte_stats_bitrate_calc(bitrate_data,
-						idx_port);
+							       idx_port);
 				tics_datum = tics_current;
 			}
 		}
@@ -1451,7 +1454,7 @@ static int eth_event_callback(portid_t port_id,
 	portid_t pi;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		port = &ports[pi];
 		/* Check if there is a port which is not started */
 		if ((port->port_status != RTE_PORT_STARTED) &&
@@ -1479,7 +1482,7 @@ static int eth_event_callback(portid_t port_id,
 {
 	portid_t pi;
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (!port_is_stopped(pi))
 			return 0;
 	}
@@ -1526,7 +1529,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if(dcb_config)
 		dcb_test = 1;
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1712,7 +1715,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Stopping ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1755,7 +1758,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Closing ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1806,7 +1809,7 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Resetting ports...\n");
 
-	RTE_ETH_FOREACH_DEV(pi) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pi, my_owner.id) {
 		if (pid != pi && pid != (portid_t)RTE_PORT_ALL)
 			continue;
 
@@ -1851,6 +1854,12 @@ static int eth_event_callback(portid_t port_id,
 	if (rte_eth_dev_attach(identifier, &pi))
 		return;
 
+	if (rte_eth_dev_owner_set(pi, &my_owner) != 0) {
+		printf("Error: cannot own new attached port %d\n", pi);
+		return;
+	}
+	nb_ports++;
+
 	socket_id = (unsigned)rte_eth_dev_socket_id(pi);
 	/* if socket_id is invalid, set to 0 */
 	if (check_socket_id(socket_id) < 0)
@@ -1858,8 +1867,6 @@ static int eth_event_callback(portid_t port_id,
 	reconfig(pi, socket_id);
 	rte_eth_promiscuous_enable(pi);
 
-	nb_ports = rte_eth_dev_count();
-
 	ports[pi].port_status = RTE_PORT_STOPPED;
 
 	printf("Port %d is attached. Now total ports is %d\n", pi, nb_ports);
@@ -1873,6 +1880,9 @@ static int eth_event_callback(portid_t port_id,
 
 	printf("Detaching a port...\n");
 
+	if (port_id_is_invalid(port_id, ENABLED_WARN))
+		return;
+
 	if (!port_is_closed(port_id)) {
 		printf("Please close port first\n");
 		return;
@@ -1886,7 +1896,7 @@ static int eth_event_callback(portid_t port_id,
 		return;
 	}
 
-	nb_ports = rte_eth_dev_count();
+	nb_ports--;
 
 	printf("Port '%s' is detached. Now total ports is %d\n",
 			name, nb_ports);
@@ -1904,7 +1914,7 @@ static int eth_event_callback(portid_t port_id,
 
 	if (ports != NULL) {
 		no_link_check = 1;
-		RTE_ETH_FOREACH_DEV(pt_id) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(pt_id, my_owner.id) {
 			printf("\nShutting down port %d...\n", pt_id);
 			fflush(stdout);
 			stop_port(pt_id);
@@ -1936,7 +1946,7 @@ struct pmd_test_command {
 	fflush(stdout);
 	for (count = 0; count <= MAX_CHECK_TIME; count++) {
 		all_ports_up = 1;
-		RTE_ETH_FOREACH_DEV(portid) {
+		RTE_ETH_FOREACH_DEV_OWNED_BY(portid, my_owner.id) {
 			if ((port_mask & (1 << portid)) == 0)
 				continue;
 			memset(&link, 0, sizeof(link));
@@ -2028,6 +2038,8 @@ struct pmd_test_command {
 
 	switch (type) {
 	case RTE_ETH_EVENT_INTR_RMV:
+		if (port_id_is_invalid(port_id, ENABLED_WARN))
+			break;
 		if (rte_eal_alarm_set(100000,
 				rmv_event_callback, (void *)(intptr_t)port_id))
 			fprintf(stderr, "Could not set up deferred device removal\n");
@@ -2160,7 +2172,7 @@ struct pmd_test_command {
 	portid_t pid;
 	struct rte_port *port;
 
-	RTE_ETH_FOREACH_DEV(pid) {
+	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
 		port = &ports[pid];
 		port->dev_conf.fdir_conf = fdir_conf;
 		if (nb_rxq > 1) {
@@ -2475,7 +2487,12 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 	rte_pdump_init(NULL);
 #endif
 
-	nb_ports = (portid_t) rte_eth_dev_count();
+	if (rte_eth_dev_owner_new(&my_owner.id))
+		rte_panic("Failed to get unique owner identifier\n");
+	snprintf(my_owner.name, sizeof(my_owner.name), TESTPMD_OWNER_NAME);
+	RTE_ETH_FOREACH_DEV(port_id)
+		if (rte_eth_dev_owner_set(port_id, &my_owner) == 0)
+			nb_ports++;
 	if (nb_ports == 0)
 		TESTPMD_LOG(WARNING, "No probed ethernet devices\n");
 
@@ -2523,7 +2540,7 @@ uint8_t port_is_bonding_slave(portid_t slave_pid)
 		rte_exit(EXIT_FAILURE, "Start ports failed\n");
 
 	/* set all ports to promiscuous mode by default */
-	RTE_ETH_FOREACH_DEV(port_id)
+	RTE_ETH_FOREACH_DEV_OWNED_BY(port_id, my_owner.id)
 		rte_eth_promiscuous_enable(port_id);
 
 	/* Init metrics library */
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 47f8fa8..ef946ae 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -50,6 +50,8 @@
 #define NUMA_NO_CONFIG 0xFF
 #define UMA_NO_CONFIG  0xFF
 
+#define TESTPMD_OWNER_NAME "TestPMD"
+
 typedef uint8_t  lcoreid_t;
 typedef uint16_t portid_t;
 typedef uint16_t queueid_t;
@@ -361,6 +363,7 @@ struct queue_stats_mappings {
  * nb_fwd_ports <= nb_cfg_ports <= nb_ports
  */
 extern portid_t nb_ports; /**< Number of ethernet ports probed at init time. */
+extern struct rte_eth_dev_owner my_owner; /**< Unique owner. */
 extern portid_t nb_cfg_ports; /**< Number of configured ports. */
 extern portid_t nb_fwd_ports; /**< Number of forwarding ports. */
 extern portid_t fwd_ports_ids[RTE_MAX_ETHPORTS];
-- 
1.8.3.1

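The startup change in the patch above (allocate an owner ID, try to own every probed port, count the ones that were obtained) can be sketched without DPDK. This is only a rough model under stated assumptions: mock_ports, mock_owner_new(), mock_owner_set() and claim_free_ports() are hypothetical stand-ins for the ethdev port table, rte_eth_dev_owner_new(), rte_eth_dev_owner_set() and the loop added to main(); none of them are real DPDK symbols.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8	/* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER  0	/* owner ID meaning "nobody owns this port" */

struct mock_port {
	int attached;		/* non-zero if the port exists */
	uint64_t owner_id;	/* NO_OWNER or the ID of the owning entity */
};

static struct mock_port mock_ports[MAX_PORTS];
static uint64_t next_owner_id = 1;

/* Hand out a fresh owner ID that is never reused. */
uint64_t mock_owner_new(void)
{
	return next_owner_id++;
}

/* Claim a port; fails if the port is missing or already owned. */
int mock_owner_set(unsigned int port, uint64_t owner_id)
{
	if (port >= MAX_PORTS || !mock_ports[port].attached)
		return -1;
	if (mock_ports[port].owner_id != NO_OWNER)
		return -1;
	mock_ports[port].owner_id = owner_id;
	return 0;
}

/* Startup step: claim every free port, return how many were claimed. */
unsigned int claim_free_ports(uint64_t owner_id)
{
	unsigned int port, claimed = 0;

	assert(owner_id != NO_OWNER);
	for (port = 0; port < MAX_PORTS; port++)
		if (mock_owner_set(port, owner_id) == 0)
			claimed++;
	return claimed;
}
```

In this model the returned count plays the role of nb_ports: it reflects only the ports this entity managed to own, not every port that was probed.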

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 13:22                   ` Matan Azrad
@ 2018-01-22 20:48                     ` Ananyev, Konstantin
  2018-01-23  8:54                       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-22 20:48 UTC (permalink / raw)
  To: Matan Azrad, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Hi Matan,

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Monday, January 22, 2018 1:23 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing <jingjing.wu@intel.com>; dev@dpdk.org; Neil Horman
> <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 
> Hi
> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > Hi lads,
> >
> > >
> > > Hi Matan,
> > >
> > > On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > > > Hi Konstantin
> > > >
> > > > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > > > -----Original Message-----
> > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Thomas
> > > > > > Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>;
> > > > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > ownership
> > > > > >
> > > > > > Hi Konstantin
> > > > > >
> > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>;
> > > > > Wu,
> > > > > > > Jingjing <jingjing.wu@intel.com>
> > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > ownership
> > > > > > >
> > > > > > > Hi Matan,
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing
> > > > > > > > <jingjing.wu@intel.com>
> > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > Richardson,
> > > > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > > ownership
> > > > > > > >
> > > > > > > > Testpmd should not use ethdev ports which are managed by
> > > > > > > > other DPDK entities.
> > > > > > > >
> > > > > > > > Set Testpmd ownership to each port which is not used by
> > > > > > > > other entity and prevent any usage of ethdev ports which are
> > > > > > > > not owned by
> > > > > Testpmd.
> > > > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > ---
> > > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++----------
> > -------
> > > > > ----
> > > > > > > -----
> > > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++--------
> > ----
> > > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> > > > > > > > index
> > > > > > > > 31919ba..6199c64 100644
> > > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > >  			&link_speed) < 0)
> > > > > > > >  		return;
> > > > > > > >
> > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > > > > > >
> > > > > > > Why do we need all these changes?
> > > > > > > As I understand you changed definition of
> > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok default
> > (no_owner case).
> > > > > > > Am I missing something here?
> > > > > >
> > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > > iterate
> > > > > over all valid and ownerless ports.
> > > > >
> > > > > Yes.
> > > > >
> > > > > > Here Testpmd wants to iterate over its owned ports.
> > > > >
> > > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > > As I understand it would be enough to fix current problems and
> > > > > would allow us to avoid any changes in testmpd (which I think is a good
> > thing).
> > > >
> > > > Yes, I understand that this big change is very daunted, But I think
> > > > the current a lot of bugs in testpmd(regarding port ownership) even
> > > > more
> > > daunted.
> > > >
> > > > Look,
> > > > Testpmd initiates some of its internal databases depends on specific
> > > > port iteration, In some time someone may take ownership of Testpmd
> > ports and testpmd will continue to touch them.
> >
> > But if someone will take the ownership (assign new owner_id) that port will
> > not appear in RTE_ETH_FOREACH_DEV() any more.
> >
> 
> Yes, but testpmd sometimes depends on previous iteration using internal database.
> So it uses internal database that was updated by old iteration.

That sounds like just a bug in testpmd that needs to be fixed, no?
Are there any particular places where outdated device info is used?

> 
> > > >
> > >
> > > If I look back on the fail-safe, its sole purpose is to have seamless
> > > hotplug with existing applications.
> > >
> > > Port ownership is a genericization of some functions introduced by the
> > > fail-safe, that could structure DPDK further. It should allow
> > > applications to have a seamless integration with subsystems using port
> > > ownership. Without this, port ownership cannot be used.
> > >
> > > Testpmd should be fixed, but follow the most common design patterns of
> > > DPDK applications. Going with port ownership seems like a paradigm
> > > shift.
> > >
> > > > In addition
> > > > Using the old iterator in some places in testpmd will cause a race for run-
> > time new ports(can be created by failsafe or any hotplug code):
> > > > - testpmd finds an ownerless port(just now created) by the old
> > > > iterator and start traffic there,
> > > > - failsafe takes ownership of this new port and start traffic there.
> > > > Problem!
> >
> > Could you shed a bit more light here - it would be race condition between
> > whom and whom?
> 
> Sure.
> 
> > As I remember in testpmd all control ops are done within one thread (main
> > lcore).
> 
> But other dpdk entity can use another thread, for example:
> Failsafe uses the host thread(using alarm callback) to create a new port and to take ownership of a port.

Hm, so you create new ports inside the failsafe PMD and then set a new owner_id for them, right?
And all of this happens in an alarm callback, in the interrupt thread?
If so, I wonder how you can guarantee that no one else will set a different owner_id between
rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
Could you point me to that place (I am not really familiar with the failsafe code)?
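To make the question concrete: the window between allocation and claiming is only harmless if the claim itself is a single check-and-set, so that the loser of a race sees a failure instead of silently overwriting the winner. Below is a minimal sketch of that idea using C11 atomics. It is an illustration only and is not taken from the patch series; atomic_claim() and NO_OWNER are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NO_OWNER 0	/* owner ID meaning "nobody owns this port" */

/*
 * Become the owner only if the port is currently unowned.
 * Returns 0 if we won, -1 if another entity claimed the port first.
 */
int atomic_claim(_Atomic uint64_t *port_owner, uint64_t my_id)
{
	uint64_t expected = NO_OWNER;

	assert(my_id != NO_OWNER);
	/* Compare-and-swap: the test and the store happen as one step. */
	return atomic_compare_exchange_strong(port_owner, &expected, my_id)
		? 0 : -1;
}
```

With a claim of this shape, the creator of a port still has to check the return value: another entity may legitimately win the race, and the creator must then back off.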

> 
> The race:
> Testpmd iterates over all ports by the master thread.
> Failsafe takes ownership of a port by the host thread and start using it.
> => The two dpdk entities may use the device at same time!

Ok, if failsafe really assigns its owner_id(s) to ports that are already in use by the app,
then how is such a scheme supposed to work at all?
I.e. the application has a port and assigns some owner_id != 0 to it, then the PMD tries to
set its own owner_id on the same port.
Obviously failsafe's set_owner() will always fail in such a case.

From what I hear, we need to introduce a concept of a 'default owner id'.
I.e. when the failsafe PMD is created, the user assigns some owner_id to it (the default).
Then the failsafe PMD generates its own owner_id and assigns it only to the ports
whose current owner_id is equal to either 0 or the 'default' owner_id.
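
The rule proposed here can be written down in a few lines. This is a sketch of the suggestion only, not of anything in the patch series; may_claim(), claim_with_default() and NO_OWNER are hypothetical names, and a plain integer stands in for a port's owner field.

```c
#include <assert.h>
#include <stdint.h>

#define NO_OWNER 0	/* owner ID meaning "nobody owns this port" */

/* The PMD may take a port that is unowned or held by the default owner. */
int may_claim(uint64_t current_owner, uint64_t default_owner)
{
	return current_owner == NO_OWNER || current_owner == default_owner;
}

/*
 * Move a port to pmd_owner if the rule allows it.
 * Returns 0 on success, -1 if the port belongs to a third party.
 */
int claim_with_default(uint64_t *port_owner, uint64_t default_owner,
		       uint64_t pmd_owner)
{
	assert(pmd_owner != NO_OWNER);
	if (!may_claim(*port_owner, default_owner))
		return -1;
	*port_owner = pmd_owner;
	return 0;
}
```

For example, with default owner 5 and PMD owner 7, a port owned by 0 or 5 would move to 7, while a port owned by 9 would be left alone.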

> 
> Obeying the new ownership rules can prevent all these races.
> 

When we discussed the RFC of the owner_id patch, I thought we all agreed that
the owner_id API shouldn't be mandatory, i.e. existing apps are not required to change
to work normally with it.
Though right now it seems that application changes are necessary,
at least to work ok with the failsafe PMD.
Which makes me wonder: was it some sort of misunderstanding, or
did we do something wrong here?
Konstantin

> > The only way to attach/detach port with it - invoke testpmd CLI
> > "attach/detach" port.
> >
> > Konstantin
> >
> > >
> > > Testpmd does not handle detection of new port. If it did, testing
> > > fail-safe with it would be wrong.
> > >
> > > At startup, RTE_ETH_FOREACH_DEV already fixed the issue of registering
> > > DEFERRED ports. There are still remaining issues regarding this, but I
> > > think they should be fixed. The architecture does not need to be
> > > completely moved to port ownership.
> > >
> > > If anything, this should serve as a test for your API with common
> > > applications. I think you'd prefer to know and debug with testpmd
> > > instead of firing up VPP or something like that to determine what went
> > > wrong with using the fail-safe.
> > >
> > > >
> > > > In addition
> > > > As a good example for well-done application (free from ownership
> > > > bugs) I tried here to adjust Tespmd to the new rules and BTW to fix
> > > > a
> > > lot of bugs.
> > >
> > > Testpmd has too much cruft, it won't ever be a good example of a
> > > well-done application.
> > >
> > > If you want to demonstrate ownership, I think you should start an
> > > example application demonstrating race conditions and their mitigation.
> > >
> > > I think that would be interesting for many DPDK users.
> > >
> > > >
> > > >
> > > > So actually applications which are not aware to the port ownership
> > > > still are exposed to races, but if there use the old iterator(with
> > > > the new
> > > change) the amount of races decreases.
> > > >
> > > > Thanks, Matan.
> > > > > Konstantin
> > > > >
> > > > > >
> > > > > > I added to Testpmd ability to take an ownership of ports as the
> > > > > > new ownership and synchronization rules suggested, Since Tespmd
> > > > > > is a DPDK entity which wants that no one will touch its owned
> > > > > > ports, It must allocate
> > > > > an unique ID, set owner for its ports (see in main function) and
> > > > > recognizes them by its owner ID.
> > > > > >
> > > > > > > Konstantin
> > > > > > >
> > >
> > > Regards,
> > > --
> > > Gaëtan Rivet
> > > 6WIND


* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 20:48                     ` Ananyev, Konstantin
@ 2018-01-23  8:54                       ` Matan Azrad
  2018-01-23 12:56                         ` Gaëtan Rivet
  2018-01-23 13:34                         ` Ananyev, Konstantin
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-23  8:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce


Hi Konstantin
From: Ananyev, Konstantin, Monday, January 22, 2018 10:49 PM
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad [mailto:matan@mellanox.com]
> > Sent: Monday, January 22, 2018 1:23 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > <gaetan.rivet@6wind.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > <jingjing.wu@intel.com>; dev@dpdk.org; Neil Horman
> > <nhorman@tuxdriver.com>; Richardson, Bruce
> > <bruce.richardson@intel.com>
> > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> >
> >
> > Hi
> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > Hi lads,
> > >
> > > >
> > > > Hi Matan,
> > > >
> > > > On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > > > > Hi Konstantin
> > > > >
> > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > > > > -----Original Message-----
> > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > <gaetan.rivet@6wind.com>;
> > > > > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > ownership
> > > > > > >
> > > > > > > Hi Konstantin
> > > > > > >
> > > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > > <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>;
> > > > > > Wu,
> > > > > > > > Jingjing <jingjing.wu@intel.com>
> > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev
> > > > > > > > port ownership
> > > > > > > >
> > > > > > > > Hi Matan,
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing
> > > > > > > > > <jingjing.wu@intel.com>
> > > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > Richardson,
> > > > > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > > > ownership
> > > > > > > > >
> > > > > > > > > Testpmd should not use ethdev ports which are managed by
> > > > > > > > > other DPDK entities.
> > > > > > > > >
> > > > > > > > > Set Testpmd ownership to each port which is not used by
> > > > > > > > > other entity and prevent any usage of ethdev ports which
> > > > > > > > > are not owned by
> > > > > > Testpmd.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > ---
> > > > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++------
> ----
> > > -------
> > > > > > ----
> > > > > > > > -----
> > > > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++----
> ----
> > > ----
> > > > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/app/test-pmd/cmdline.c
> > > > > > > > > b/app/test-pmd/cmdline.c index
> > > > > > > > > 31919ba..6199c64 100644
> > > > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > > >  			&link_speed) < 0)
> > > > > > > > >  		return;
> > > > > > > > >
> > > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid,
> my_owner.id) {
> > > > > > > >
> > > > > > > > Why do we need all these changes?
> > > > > > > > As I understand you changed definition of
> > > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok
> > > > > > > > default
> > > (no_owner case).
> > > > > > > > Am I missing something here?
> > > > > > >
> > > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > > > iterate
> > > > > > over all valid and ownerless ports.
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > Here Testpmd wants to iterate over its owned ports.
> > > > > >
> > > > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > > > As I understand it would be enough to fix current problems and
> > > > > > would allow us to avoid any changes in testmpd (which I think
> > > > > > is a good
> > > thing).
> > > > >
> > > > > Yes, I understand that this big change is very daunted, But I
> > > > > think the current a lot of bugs in testpmd(regarding port
> > > > > ownership) even more
> > > > daunted.
> > > > >
> > > > > Look,
> > > > > Testpmd initiates some of its internal databases depends on
> > > > > specific port iteration, In some time someone may take ownership
> > > > > of Testpmd
> > > ports and testpmd will continue to touch them.
> > >
> > > But if someone will take the ownership (assign new owner_id) that
> > > port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > >
> >
> > Yes, but testpmd sometimes depends on previous iteration using internal
> database.
> > So it uses internal database that was updated by old iteration.
> 
> That sounds like just a bug in testpmd that need to be fixed, no?

If Testpmd has already taken ownership of these ports (as I did in this patch), it is ok.

> Any particular places where outdated device info is used?

For example, look at the stream management in testpmd (I think I saw it there).

> > > > If I look back on the fail-safe, its sole purpose is to have
> > > > seamless hotplug with existing applications.
> > > >
> > > > Port ownership is a genericization of some functions introduced by
> > > > the fail-safe, that could structure DPDK further. It should allow
> > > > applications to have a seamless integration with subsystems using
> > > > port ownership. Without this, port ownership cannot be used.
> > > >
> > > > Testpmd should be fixed, but follow the most common design
> > > > patterns of DPDK applications. Going with port ownership seems
> > > > like a paradigm shift.
> > > >
> > > > > In addition
> > > > > Using the old iterator in some places in testpmd will cause a
> > > > > race for run-
> > > time new ports(can be created by failsafe or any hotplug code):
> > > > > - testpmd finds an ownerless port(just now created) by the old
> > > > > iterator and start traffic there,
> > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > Problem!
> > >
> > > Could you shed a bit more light here - it would be race condition
> > > between whom and whom?
> >
> > Sure.
> >
> > > As I remember in testpmd all control ops are done within one thread
> > > (main lcore).
> >
> > But other dpdk entity can use another thread, for example:
> > Failsafe uses the host thread(using alarm callback) to create a new port and
> to take ownership of a port.
> 
> Hm, and you create new ports inside failsafe PMD, right and then set new
> owner_id for it?

Yes.

> And all this in alarm in interrupt thread?

Yes.

> If so I wonder how you can guarantee that no-one else will set different
> owner_id between
> rte_eth_dev_allocate() and rte_eth_dev_owner_set()?

I check it (see the failsafe patch in this series, V5).
Function: fs_bus_init.

> Could you point me to that place (I am not really familiar with familiar with
> failsafe code)?
> 
> >
> > The race:
> > Testpmd iterates over all ports by the master thread.
> > Failsafe takes ownership of a port by the host thread and start using it.
> > => The two dpdk entities may use the device at same time!
> 
> Ok, if failsafe really assigns its owner_id(s) to ports that are already in use by
> the app, then how such scheme supposed to work at all?

If the app behaves well (follows the new rules), it has already taken ownership; failsafe will see that and will wait until the application releases the port.
Every DPDK entity should know which ports it wants to manage.
If 2 entities want to manage the same device, that can be ok, and port ownership can synchronize the usage.

Probably, an application which runs fail-safe wants to manage only the fail-safe port, and therefore takes ownership only of it.

> I.E. application has a port - it assigns some owner_id != 0 to it, then PMD tries
> to set its owner_id tot the same port.
> Obviously failsafe's set_owner() will always fail in such case.
>
Yes, and it will try again after some time.
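
A rough sketch of that "fail, wait, try again" behaviour, assuming a bounded number of attempts. It does not reproduce the failsafe code: try_claim(), claim_with_retry() and simulate_release() are hypothetical names, and the on_wait callback stands in for whatever happens between two alarm callbacks (here, the current owner releasing the port).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_OWNER 0	/* owner ID meaning "nobody owns this port" */

/* Claim the port only if nobody owns it, like a failing owner_set. */
int try_claim(uint64_t *port_owner, uint64_t my_id)
{
	if (*port_owner != NO_OWNER)
		return -1;
	*port_owner = my_id;
	return 0;
}

/* Simulated event between attempts: the current owner lets the port go. */
void simulate_release(uint64_t *port_owner)
{
	*port_owner = NO_OWNER;
}

/*
 * Try up to max_attempts times. Returns the attempt number that
 * succeeded (starting at 1), or -1 if the port never became free.
 */
int claim_with_retry(uint64_t *port_owner, uint64_t my_id,
		     unsigned int max_attempts,
		     void (*on_wait)(uint64_t *port_owner))
{
	unsigned int attempt;

	assert(my_id != NO_OWNER);
	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (try_claim(port_owner, my_id) == 0)
			return (int)attempt;
		if (on_wait != NULL)
			on_wait(port_owner);	/* stand-in for waiting */
	}
	return -1;
}
```

The point of the sketch is that a failed claim leaves the port untouched, so the waiting entity never interferes with the current owner.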
 
> From what I hear we need to introduce a concept of 'default owner id'.
> I.E. when failsafe PMD is created - user assigns some owner_id to it (default).
> Then failsafe PMD generates it's own owner_id and assigns it only to the
> ports whose current owner_id is equal either 0 or 'default' owner_id.
> 

It is a suggestion and we need to think about it more (I'm discussing it with Gaetan in another thread).
Actually, I think that if we want a generic solution to the generic problem, the current solution is ok.

> >
> > Obeying the new ownership rules can prevent all these races.
> >
> 
> When we discussed RFC of owner_id patch, I thought we all agreed that
> owner_id  API shouldn't be mandatory - i.e. existing apps not required to
> change to work normally with that.

Yes, it is not mandatory if the app doesn't use hotplug.

I think that with hotplug it is mandatory in most cases.

And it can ease the secondary process model too.

Again, in the generic ownership problem as discussed in the RFC:
Every entity, including the app, should know which ports it wants to manage and take ownership only of them.

> Though right now it seems that application changes seems necessary, at least
> to work ok with failsafe PMD.

And for solving the generic problem of ownership (it will surely protect against future issues).

> Which makes we wonder was it some sort of misunderstanding or we did we
> do something wrong here?

Mistakes can always happen, but I think we all understand the big issue of ownership and how the current solution solves it.
Fail-safe is just a current example of the problems that are possible because of the generic ownership issue.

Thanks,
Matan
> Konstantin
> 
> > > The only way to attach/detach port with it - invoke testpmd CLI
> > > "attach/detach" port.
> > >
> > > Konstantin
> > >
> > > >
> > > > Testpmd does not handle detection of new port. If it did, testing
> > > > fail-safe with it would be wrong.
> > > >
> > > > At startup, RTE_ETH_FOREACH_DEV already fixed the issue of
> > > > registering DEFERRED ports. There are still remaining issues
> > > > regarding this, but I think they should be fixed. The architecture
> > > > does not need to be completely moved to port ownership.
> > > >
> > > > If anything, this should serve as a test for your API with common
> > > > applications. I think you'd prefer to know and debug with testpmd
> > > > instead of firing up VPP or something like that to determine what
> > > > went wrong with using the fail-safe.
> > > >
> > > > >
> > > > > In addition
> > > > > As a good example for well-done application (free from ownership
> > > > > bugs) I tried here to adjust Tespmd to the new rules and BTW to
> > > > > fix a
> > > > lot of bugs.
> > > >
> > > > Testpmd has too much cruft, it won't ever be a good example of a
> > > > well-done application.
> > > >
> > > > If you want to demonstrate ownership, I think you should start an
> > > > example application demonstrating race conditions and their mitigation.
> > > >
> > > > I think that would be interesting for many DPDK users.
> > > >
> > > > >
> > > > >
> > > > > So actually applications which are not aware to the port
> > > > > ownership still are exposed to races, but if there use the old
> > > > > iterator(with the new
> > > > change) the amount of races decreases.
> > > > >
> > > > > Thanks, Matan.
> > > > > > Konstantin
> > > > > >
> > > > > > >
> > > > > > > I added to Testpmd ability to take an ownership of ports as
> > > > > > > the new ownership and synchronization rules suggested, Since
> > > > > > > Tespmd is a DPDK entity which wants that no one will touch
> > > > > > > its owned ports, It must allocate
> > > > > > an unique ID, set owner for its ports (see in main function)
> > > > > > and recognizes them by its owner ID.
> > > > > > >
> > > > > > > > Konstantin
> > > > > > > >
> > > >
> > > > Regards,
> > > > --
> > > > Gaëtan Rivet
> > > > 6WIND


* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23  8:54                       ` Matan Azrad
@ 2018-01-23 12:56                         ` Gaëtan Rivet
  2018-01-23 14:30                           ` Matan Azrad
  2018-01-23 13:34                         ` Ananyev, Konstantin
  1 sibling, 1 reply; 214+ messages in thread
From: Gaëtan Rivet @ 2018-01-23 12:56 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi all,

On Tue, Jan 23, 2018 at 08:54:27AM +0000, Matan Azrad wrote:

<snip>

> > > > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > > > > ownership
> > > > > > > > > >
> > > > > > > > > > Testpmd should not use ethdev ports which are managed by
> > > > > > > > > > other DPDK entities.
> > > > > > > > > >
> > > > > > > > > > Set Testpmd ownership to each port which is not used by
> > > > > > > > > > other entity and prevent any usage of ethdev ports which
> > > > > > > > > > are not owned by
> > > > > > > Testpmd.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > > ---
> > > > > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++------
> > ----
> > > > -------
> > > > > > > ----
> > > > > > > > > -----
> > > > > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++----
> > ----
> > > > ----
> > > > > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/app/test-pmd/cmdline.c
> > > > > > > > > > b/app/test-pmd/cmdline.c index
> > > > > > > > > > 31919ba..6199c64 100644
> > > > > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > > > >  			&link_speed) < 0)
> > > > > > > > > >  		return;
> > > > > > > > > >
> > > > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid,
> > my_owner.id) {
> > > > > > > > >
> > > > > > > > > Why do we need all these changes?
> > > > > > > > > As I understand you changed definition of
> > > > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok
> > > > > > > > > default
> > > > (no_owner case).
> > > > > > > > > Am I missing something here?
> > > > > > > >
> > > > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > > > > iterate
> > > > > > > over all valid and ownerless ports.
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > Here Testpmd wants to iterate over its owned ports.
> > > > > > >
> > > > > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > > > > As I understand it would be enough to fix current problems and
> > > > > > > would allow us to avoid any changes in testmpd (which I think
> > > > > > > is a good
> > > > thing).
> > > > > >
> > > > > > Yes, I understand that this big change is very daunted, But I
> > > > > > think the current a lot of bugs in testpmd(regarding port
> > > > > > ownership) even more
> > > > > daunted.
> > > > > >
> > > > > > Look,
> > > > > > Testpmd initiates some of its internal databases depends on
> > > > > > specific port iteration, In some time someone may take ownership
> > > > > > of Testpmd ports and testpmd will continue to touch them.
> > > >
> > > > But if someone will take the ownership (assign new owner_id) that
> > > > port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > >
> > >
> > > Yes, but testpmd sometimes depends on previous iteration using internal database.
> > > So it uses internal database that was updated by old iteration.
> > 
> > That sounds like just a bug in testpmd that need to be fixed, no?
> 
> If Testpmd already took ownership for these ports(like I did), it is ok.
> 

Have you tested using the default iterator (NO_OWNER)?
It worked until now with the bare minimal device tagging using
DEV_DEFERRED. Testpmd did not seem to mind having to skip this port.

I'm sure there were places where this was overlooked, but overall, I'd
think everything should be fixable using only the NO_OWNER iteration.

Can you point to a specific scenario (command line, chain of events) that
would lead to a problem?
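
For reference, the NO_OWNER iteration discussed here can be modelled in a few lines: the default iterator is just the owner-filtered iterator called with the "no owner" ID, so a port disappears from it as soon as someone owns it. This is a simplified model, not the ethdev implementation; mock_ports, find_next_owned_by(), FOREACH_PORT_OWNED_BY and FOREACH_PORT are hypothetical stand-ins for the port table and the RTE_ETH_FOREACH_DEV macros.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 4	/* hypothetical stand-in for RTE_MAX_ETHPORTS */
#define NO_OWNER  0	/* owner ID meaning "nobody owns this port" */

struct mock_port {
	int attached;		/* non-zero if the port exists */
	uint64_t owner_id;	/* NO_OWNER or the ID of the owning entity */
};

static struct mock_port mock_ports[MAX_PORTS];

/* First attached port >= start that belongs to owner, else MAX_PORTS. */
unsigned int find_next_owned_by(unsigned int start, uint64_t owner)
{
	while (start < MAX_PORTS &&
	       (!mock_ports[start].attached ||
		mock_ports[start].owner_id != owner))
		start++;
	return start;
}

#define FOREACH_PORT_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (p) < MAX_PORTS; \
	     (p) = find_next_owned_by((p) + 1, (o)))

/* Default iteration: only ports that nobody has claimed. */
#define FOREACH_PORT(p) FOREACH_PORT_OWNED_BY(p, NO_OWNER)

/* Count the ports that a given owner ID would iterate over. */
unsigned int count_visible(uint64_t owner)
{
	unsigned int p, n = 0;

	FOREACH_PORT_OWNED_BY(p, owner)
		n++;
	assert(n <= MAX_PORTS);
	return n;
}
```

In this model an application that never takes ownership keeps working with FOREACH_PORT, but the set of ports it sees shrinks whenever another entity claims one of them.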

> > Any particular places where outdated device info is used?
> 
> For example, look for the stream management in testpmd(I think I saw it there).
> 

The stream management is certainly shaky, but it happens after the EAL
initial port creation, and is not able to update itself for new
hotplugged ports (unless something changed).

> > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > seamless hotplug with existing applications.
> > > > >
> > > > > Port ownership is a genericization of some functions introduced by
> > > > > the fail-safe, that could structure DPDK further. It should allow
> > > > > applications to have a seamless integration with subsystems using
> > > > > port ownership. Without this, port ownership cannot be used.
> > > > >
> > > > > Testpmd should be fixed, but follow the most common design
> > > > > patterns of DPDK applications. Going with port ownership seems
> > > > > like a paradigm shift.
> > > > >
> > > > > > In addition
> > > > > > Using the old iterator in some places in testpmd will cause a
> > > > > > race for run-
> > > > time new ports(can be created by failsafe or any hotplug code):
> > > > > > - testpmd finds an ownerless port(just now created) by the old
> > > > > > iterator and start traffic there,

How does testpmd start traffic there? Testpmd has only a callback for
displaying that it received an event for a new port. It has no concept
of hotplugging beyond that.

Testpmd will not start using any new port probed using the hotplug API
on its own, again, unless something has drastically changed.

> > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > Problem!
> > > >
> > > > Could you shed a bit more light here - it would be race condition
> > > > between whom and whom?
> > >
> > > Sure.
> > >
> > > > As I remember in testpmd all control ops are done within one thread
> > > > (main lcore).
> > >
> > > But other dpdk entity can use another thread, for example:
> > > Failsafe uses the host thread(using alarm callback) to create a new port and
> > to take ownership of a port.
> > 
> > Hm, and you create new ports inside failsafe PMD, right and then set new
> > owner_id for it?
> 
> Yes.
> 
> > And all this in alarm in interrupt thread?
> 
> Yes.
> 
> > If so I wonder how you can guarantee that no-one else will set different
> > owner_id between
> > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> 
> I check it (see failsafe patch to this series - V5).
> Function: fs_bus_init.
> 
> > Could you point me to that place (I am not really familiar with familiar with
> > failsafe code)?
> > 
> > >
> > > The race:
> > > Testpmd iterates over all ports by the master thread.
> > > Failsafe takes ownership of a port by the host thread and start using it.
> > > => The two dpdk entities may use the device at same time!
> > 

When can this happen? Fail-safe creates its initial pool of ports during
EAL init, before testpmd scans eth_dev ports and configures its streams.
At that point, it has taken ownership, from the master lcore context.

After this point, new ports could be detected and hotplugged by
fail-safe. However, even if testpmd had a callback to capture those new
ports and reconfigure its streams, it would be executed from within the
intr-thread, same as failsafe. If the thread was interrupted, by a
dataplane-lcore for example, streams would not have been reconfigured.
The fail-safe would execute its callback and set the owner-id before the
callback chain goes to the application.

And that would only be if testpmd had any callback for hotplugging ports
and reconfiguring its streams, which it hasn't, as far as I know.

> > Ok, if failsafe really assigns its owner_id(s) to ports that are already in use by
> > the app, then how such scheme supposed to work at all?
> 
> If the app works well (with the new rules) it already took ownership and failsafe will see it and will wait until the application release it.
> Every dpdk entity should know which port it wants to manage,
> If 2 entities want to manage the same device -  it can be ok and port ownership can synchronize the usage.
> 
> Probably, application which will run fail-safe wants to manage only the fail-safe port and therefor to take ownership only for it.
> 
> > I.E. application has a port - it assigns some owner_id != 0 to it, then PMD tries
> > to set its owner_id tot the same port.
> > Obviously failsafe's set_owner() will always fail in such case.
> >
> Yes, and will try again after some time.
>  
> > From what I hear we need to introduce a concept of 'default owner id'.
> > I.E. when failsafe PMD is created - user assigns some owner_id to it (default).
> > Then failsafe PMD generates it's own owner_id and assigns it only to the
> > ports whose current owner_id is equal either 0 or 'default' owner_id.
> > 
> 
> It is a suggestion and we need to think about it more (I'm talking about it with Gaetan in another thread).
> Actually I think, if we want a generic solution to the generic problem the current solution is ok. 
> 

We could as well conclude this other thread there.

The only solution would be to have a default relationship between
owners, something that goes beyond the scope assigned by Thomas to your
evolution, but would be necessary for this API to be properly used by
existing applications.

I think it's the only way to have a sane default behavior with your
API, but I also think this goes beyond the scope of the DPDK altogether.

But even with those considerations that could be ironed out later (API
is still experimental anyway), in the meantime, I think we should strive
not to break "userland" as much as possible. Meaning that unless you
have a specific situation creating a bug, you shouldn't have to modify
testpmd, and if an issue arises, you need to try to improve your API
before resorting to changing the resource management model of all
existing applications.

> > >
> > > Obeying the new ownership rules can prevent all these races.
> > >
> > 
> > When we discussed RFC of owner_id patch, I thought we all agreed that
> > owner_id  API shouldn't be mandatory - i.e. existing apps not required to
> > change to work normally with that.
> 
> Yes, it is not mandatory if app doesn't use hotplug.
> 
> I think with hotplug it is mandatory in the most cases.
> 
> And it can ease the secondary process model too.
> 
> Again, in the generic ownership problem as discussed in RFC:
> Every entity, include app, should know which ports it wants to manage and to take ownership only for them.
> 
> > Though right now it seems that application changes seems necessary, at least
> > to work ok with failsafe PMD.
> 
> And for solving the generic problem of ownership.(will defend from future issues by sure).
> 
> > Which makes we wonder was it some sort of misunderstanding or we did we
> > do something wrong here?
> 
> Mistakes can be done all the time, but I think we are all understand the big issue of ownership and how the current solution solves it.
> fail-safe it is just a current example for the problems which are possible because of the generic ownership issue.
> 

Regards,
-- 
Gaëtan Rivet
6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23  8:54                       ` Matan Azrad
  2018-01-23 12:56                         ` Gaëtan Rivet
@ 2018-01-23 13:34                         ` Ananyev, Konstantin
  2018-01-23 14:18                           ` Thomas Monjalon
  2018-01-23 14:43                           ` Matan Azrad
  1 sibling, 2 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-23 13:34 UTC (permalink / raw)
  To: Matan Azrad, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Hi Matan,

> 
> 
> Hi Konstantin
> From: Ananyev, Konstantin, Monday, January 22, 2018 10:49 PM
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > Sent: Monday, January 22, 2018 1:23 PM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan Rivet
> > > <gaetan.rivet@6wind.com>
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; dev@dpdk.org; Neil Horman
> > > <nhorman@tuxdriver.com>; Richardson, Bruce
> > > <bruce.richardson@intel.com>
> > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> > >
> > >
> > > Hi
> > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > Hi lads,
> > > >
> > > > >
> > > > > Hi Matan,
> > > > >
> > > > > On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > > > > > Hi Konstantin
> > > > > >
> > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > > > > > -----Original Message-----
> > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > <gaetan.rivet@6wind.com>;
> > > > > > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > > ownership
> > > > > > > >
> > > > > > > > Hi Konstantin
> > > > > > > >
> > > > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38 PM
> > > > > > > > > To: Matan Azrad <matan@mellanox.com>; Thomas Monjalon
> > > > > > > > > <thomas@monjalon.net>; Gaetan Rivet
> > > > <gaetan.rivet@6wind.com>;
> > > > > > > Wu,
> > > > > > > > > Jingjing <jingjing.wu@intel.com>
> > > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev
> > > > > > > > > port ownership
> > > > > > > > >
> > > > > > > > > Hi Matan,
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > > > > <gaetan.rivet@6wind.com>; Wu, Jingjing
> > > > > > > > > > <jingjing.wu@intel.com>
> > > > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > Richardson,
> > > > > > > > > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > > > > > > > <konstantin.ananyev@intel.com>
> > > > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > > > > > > > ownership
> > > > > > > > > >
> > > > > > > > > > Testpmd should not use ethdev ports which are managed by
> > > > > > > > > > other DPDK entities.
> > > > > > > > > >
> > > > > > > > > > Set Testpmd ownership to each port which is not used by
> > > > > > > > > > other entity and prevent any usage of ethdev ports which
> > > > > > > > > > are not owned by
> > > > > > > Testpmd.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > > ---
> > > > > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++------
> > ----
> > > > -------
> > > > > > > ----
> > > > > > > > > -----
> > > > > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > > > > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++----
> > ----
> > > > ----
> > > > > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/app/test-pmd/cmdline.c
> > > > > > > > > > b/app/test-pmd/cmdline.c index
> > > > > > > > > > 31919ba..6199c64 100644
> > > > > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > > > >  			&link_speed) < 0)
> > > > > > > > > >  		return;
> > > > > > > > > >
> > > > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid,
> > my_owner.id) {
> > > > > > > > >
> > > > > > > > > Why do we need all these changes?
> > > > > > > > > As I understand you changed definition of
> > > > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok
> > > > > > > > > default
> > > > (no_owner case).
> > > > > > > > > Am I missing something here?
> > > > > > > >
> > > > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid) will
> > > > > > > > iterate
> > > > > > > over all valid and ownerless ports.
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > Here Testpmd wants to iterate over its owned ports.
> > > > > > >
> > > > > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > > > > As I understand it would be enough to fix current problems and
> > > > > > > would allow us to avoid any changes in testmpd (which I think
> > > > > > > is a good
> > > > thing).
> > > > > >
> > > > > > Yes, I understand that this big change is very daunted, But I
> > > > > > think the current a lot of bugs in testpmd(regarding port
> > > > > > ownership) even more
> > > > > daunted.
> > > > > >
> > > > > > Look,
> > > > > > Testpmd initiates some of its internal databases depends on
> > > > > > specific port iteration, In some time someone may take ownership
> > > > > > of Testpmd
> > > > ports and testpmd will continue to touch them.
> > > >
> > > > But if someone will take the ownership (assign new owner_id) that
> > > > port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > >
> > >
> > > Yes, but testpmd sometimes depends on previous iteration using internal
> > database.
> > > So it uses internal database that was updated by old iteration.
> >
> > That sounds like just a bug in testpmd that need to be fixed, no?
> 
> If Testpmd already took ownership for these ports(like I did), it is ok.
> 

Hmm, why not just fix testpmd, if there is a bug?
As I said, all control ops here are done by one thread, so it should be pretty easy.
Or are you talking about race conditions?

> > Any particular places where outdated device info is used?
> 
> For example, look for the stream management in testpmd(I think I saw it there).

Anything particular?

> 
> > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > seamless hotplug with existing applications.
> > > > >
> > > > > Port ownership is a genericization of some functions introduced by
> > > > > the fail-safe, that could structure DPDK further. It should allow
> > > > > applications to have a seamless integration with subsystems using
> > > > > port ownership. Without this, port ownership cannot be used.
> > > > >
> > > > > Testpmd should be fixed, but follow the most common design
> > > > > patterns of DPDK applications. Going with port ownership seems
> > > > > like a paradigm shift.
> > > > >
> > > > > > In addition
> > > > > > Using the old iterator in some places in testpmd will cause a
> > > > > > race for run-
> > > > time new ports(can be created by failsafe or any hotplug code):
> > > > > > - testpmd finds an ownerless port(just now created) by the old
> > > > > > iterator and start traffic there,
> > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > Problem!
> > > >
> > > > Could you shed a bit more light here - it would be race condition
> > > > between whom and whom?
> > >
> > > Sure.
> > >
> > > > As I remember in testpmd all control ops are done within one thread
> > > > (main lcore).
> > >
> > > But other dpdk entity can use another thread, for example:
> > > Failsafe uses the host thread(using alarm callback) to create a new port and
> > to take ownership of a port.
> >
> > Hm, and you create new ports inside failsafe PMD, right and then set new
> > owner_id for it?
> 
> Yes.
> 
> > And all this in alarm in interrupt thread?
> 
> Yes.
> 
> > If so I wonder how you can guarantee that no-one else will set different
> > owner_id between
> > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> 
> I check it (see failsafe patch to this series - V5).
> Function: fs_bus_init.

You are talking about this piece of code:
+		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)->my_owner);
+		if (ret) {
+			INFO("sub_device %d owner set failed (%s),"
+			     " will try again later", i, strerror(ret));
+			continue;

right?
So you just wouldn't include that device in your failsafe device.
But that is probably not what the user wanted, especially if he bothered to
create a special new low-level device for you.
If that's the use case, then I think you need to set device ownership at creation time -
inside dev_allocate().
Again, that would avoid such race conditions inside testpmd.
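
A rough sketch of what I mean, under stated assumptions: the sketch_* names,
the port table and the single mutex are hypothetical and are not the
rte_ethdev implementation. The point is only that the slot allocation and
the owner assignment happen in the same critical section, so there is no
window in which the new port looks ownerless.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 8
#define SKETCH_NO_OWNER  0 /* hypothetical "ownerless" value */

struct sketch_port {
	bool     attached;
	uint64_t owner_id;
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a free slot and set its owner under the same lock.
 * Returns the port id, or -1 when the table is full. */
static int
sketch_allocate_owned(uint64_t owner)
{
	int pid, ret = -1;

	pthread_mutex_lock(&sketch_lock);
	for (pid = 0; pid < SKETCH_MAX_PORTS; pid++) {
		if (!sketch_ports[pid].attached) {
			sketch_ports[pid].attached = true;
			sketch_ports[pid].owner_id = owner;
			ret = pid;
			break;
		}
	}
	pthread_mutex_unlock(&sketch_lock);
	return ret;
}

/* Claim an existing port only if it is still ownerless.
 * Returns 0 on success, -1 if the port is invalid or already owned. */
static int
sketch_owner_set(int pid, uint64_t owner)
{
	int ret = -1;

	pthread_mutex_lock(&sketch_lock);
	if (pid >= 0 && pid < SKETCH_MAX_PORTS &&
	    sketch_ports[pid].attached &&
	    sketch_ports[pid].owner_id == SKETCH_NO_OWNER) {
		sketch_ports[pid].owner_id = owner;
		ret = 0;
	}
	pthread_mutex_unlock(&sketch_lock);
	return ret;
}
```

A port created through sketch_allocate_owned() never shows up in an
ownerless iteration, while a port created ownerless can still be claimed
later by whoever calls sketch_owner_set() first.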

> 
> > Could you point me to that place (I am not really familiar with familiar with
> > failsafe code)?
> >
> > >
> > > The race:
> > > Testpmd iterates over all ports by the master thread.
> > > Failsafe takes ownership of a port by the host thread and start using it.
> > > => The two dpdk entities may use the device at same time!
> >
> > Ok, if failsafe really assigns its owner_id(s) to ports that are already in use by
> > the app, then how such scheme supposed to work at all?
> 
> If the app works well (with the new rules) it already took ownership and failsafe will see it and will wait until the application release it.

Ok, and why would the application need to release it?
How would it know that the failsafe device wants to use it now?
Again, where is the guarantee that after the app releases it some other entity won't grab it for itself?

> Every dpdk entity should know which port it wants to manage,
> If 2 entities want to manage the same device -  it can be ok and port ownership can synchronize the usage.
> 
> Probably, application which will run fail-safe wants to manage only the fail-safe port and therefor to take ownership only for it.
> 
> > I.E. application has a port - it assigns some owner_id != 0 to it, then PMD tries
> > to set its owner_id tot the same port.
> > Obviously failsafe's set_owner() will always fail in such case.
> >
> Yes, and will try again after some time.

Same question again - how will the app know that it has to release the port ownership?

> 
> > From what I hear we need to introduce a concept of 'default owner id'.
> > I.E. when failsafe PMD is created - user assigns some owner_id to it (default).
> > Then failsafe PMD generates it's own owner_id and assigns it only to the
> > ports whose current owner_id is equal either 0 or 'default' owner_id.
> >
> 
> It is a suggestion and we need to think about it more (I'm talking about it with Gaetan in another thread).
> Actually I think, if we want a generic solution to the generic problem the current solution is ok.

From what I heard - every app that wants to use the failsafe PMD would require quite a lot of changes.
That doesn't look ok to me.

> 
> > >
> > > Obeying the new ownership rules can prevent all these races.
> > >
> >
> > When we discussed RFC of owner_id patch, I thought we all agreed that
> > owner_id  API shouldn't be mandatory - i.e. existing apps not required to
> > change to work normally with that.
> 
> Yes, it is not mandatory if app doesn't use hotplug.
> 
> I think with hotplug it is mandatory in the most cases.

Yes, in failsafe you always install this alarm handler, so even
if the app had its own way to handle hotplugged devices,
it would suddenly need to use this new owner API, even if it doesn't need to.
Why does it have to be that way?

> 
> And it can ease the secondary process model too.
> 
> Again, in the generic ownership problem as discussed in RFC:
> Every entity, include app, should know which ports it wants to manage and to take ownership only for them.
> 
> > Though right now it seems that application changes seems necessary, at least
> > to work ok with failsafe PMD.
> 
> And for solving the generic problem of ownership.(will defend from future issues by sure).
> 
> > Which makes we wonder was it some sort of misunderstanding or we did we
> > do something wrong here?
> 
> Mistakes can be done all the time, but I think we are all understand the big issue of ownership and how the current solution solves it.
> fail-safe it is just a current example for the problems which are possible because of the generic ownership issue.

Honestly, that seems like too many changes for the app just to make the failsafe PMD work correctly.
IMO there should be some way to support it without requiring changes in each DPDK application;
otherwise something is wrong with the PMD itself.
If, let's say, the ownership model is required for the failsafe PMD to operate,
it should be done in a way that is transparent to the user.
Probably something like what Gaetan suggested in another mail.
Konstantin

> 
> Thanks,
> Matan
> > Konstantin
> >
> > > > The only way to attach/detach port with it - invoke testpmd CLI
> > > > "attach/detach" port.
> > > >
> > > > Konstantin
> > > >
> > > > >
> > > > > Testpmd does not handle detection of new port. If it did, testing
> > > > > fail-safe with it would be wrong.
> > > > >
> > > > > At startup, RTE_ETH_FOREACH_DEV already fixed the issue of
> > > > > registering DEFERRED ports. There are still remaining issues
> > > > > regarding this, but I think they should be fixed. The architecture
> > > > > does not need to be completely moved to port ownership.
> > > > >
> > > > > If anything, this should serve as a test for your API with common
> > > > > applications. I think you'd prefer to know and debug with testpmd
> > > > > instead of firing up VPP or something like that to determine what
> > > > > went wrong with using the fail-safe.
> > > > >
> > > > > >
> > > > > > In addition
> > > > > > As a good example for well-done application (free from ownership
> > > > > > bugs) I tried here to adjust Tespmd to the new rules and BTW to
> > > > > > fix a
> > > > > lot of bugs.
> > > > >
> > > > > Testpmd has too much cruft, it won't ever be a good example of a
> > > > > well-done application.
> > > > >
> > > > > If you want to demonstrate ownership, I think you should start an
> > > > > example application demonstrating race conditions and their mitigation.
> > > > >
> > > > > I think that would be interesting for many DPDK users.
> > > > >
> > > > > >
> > > > > >
> > > > > > So actually applications which are not aware to the port
> > > > > > ownership still are exposed to races, but if there use the old
> > > > > > iterator(with the new
> > > > > change) the amount of races decreases.
> > > > > >
> > > > > > Thanks, Matan.
> > > > > > > Konstantin
> > > > > > >
> > > > > > > >
> > > > > > > > I added to Testpmd ability to take an ownership of ports as
> > > > > > > > the new ownership and synchronization rules suggested, Since
> > > > > > > > Tespmd is a DPDK entity which wants that no one will touch
> > > > > > > > its owned ports, It must allocate
> > > > > > > an unique ID, set owner for its ports (see in main function)
> > > > > > > and recognizes them by its owner ID.
> > > > > > > >
> > > > > > > > > Konstantin
> > > > > > > > >
> > > > >
> > > > > Regards,
> > > > > --
> > > > > Gaëtan Rivet
> > > > > 6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 13:34                         ` Ananyev, Konstantin
@ 2018-01-23 14:18                           ` Thomas Monjalon
  2018-01-23 15:12                             ` Ananyev, Konstantin
  2018-01-23 14:43                           ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-23 14:18 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

23/01/2018 14:34, Ananyev, Konstantin:
> If that' s the use case, then I think you need to set device ownership at creation time -
> inside dev_allocate().
> Again that would avoid such racing conditions inside testpmd.

The devices must be allocated at a low level layer.
When a new device appears (hotplug), an ethdev port should be allocated
automatically if it passes the whitelist/blacklist policy test.
Then we must decide who will manage this device.
I suggest notifying the DPDK libs first.
So a DPDK lib or PMD like failsafe can have the priority to take the
ownership in its notification callback.
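
As a minimal sketch of that ordering (an illustration only: the sketch_*
names and the two callback lists are assumptions, not the ethdev callback
API), library and PMD callbacks run before application callbacks, so a lib
can claim the new port before the application ever sees it:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_OWNER 0 /* hypothetical "ownerless" value */
#define SKETCH_MAX_CBS  4

struct sketch_port {
	uint64_t owner_id;
};

typedef void (*sketch_new_port_cb)(struct sketch_port *port, void *arg);

struct sketch_cb_entry {
	sketch_new_port_cb fn;
	void *arg;
};

/* Two hypothetical lists: DPDK libs/PMDs first, applications second. */
static struct sketch_cb_entry sketch_lib_cbs[SKETCH_MAX_CBS];
static struct sketch_cb_entry sketch_app_cbs[SKETCH_MAX_CBS];
static size_t sketch_nb_lib_cbs, sketch_nb_app_cbs;

/* Register a "new port" callback; is_lib selects the high-priority list.
 * Returns 0 on success, -1 when the list is full. */
static int
sketch_register(int is_lib, sketch_new_port_cb fn, void *arg)
{
	struct sketch_cb_entry *list = is_lib ? sketch_lib_cbs : sketch_app_cbs;
	size_t *nb = is_lib ? &sketch_nb_lib_cbs : &sketch_nb_app_cbs;

	if (*nb >= SKETCH_MAX_CBS)
		return -1;
	list[*nb].fn = fn;
	list[*nb].arg = arg;
	(*nb)++;
	return 0;
}

/* Notify everybody about a new port: libs first, then applications. */
static void
sketch_notify_new_port(struct sketch_port *port)
{
	size_t i;

	for (i = 0; i < sketch_nb_lib_cbs; i++)
		sketch_lib_cbs[i].fn(port, sketch_lib_cbs[i].arg);
	for (i = 0; i < sketch_nb_app_cbs; i++)
		sketch_app_cbs[i].fn(port, sketch_app_cbs[i].arg);
}

/* Example callback: claim the port if nobody owns it yet.
 * 'arg' points to the caller's owner id. */
static void
sketch_claim_if_free(struct sketch_port *port, void *arg)
{
	uint64_t id = *(uint64_t *)arg;

	if (port->owner_id == SKETCH_NO_OWNER)
		port->owner_id = id;
}
```

Here the registration order does not matter: even if the application
registers its callback first, the lib callback is still invoked first and
gets the chance to take the port.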

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 12:56                         ` Gaëtan Rivet
@ 2018-01-23 14:30                           ` Matan Azrad
  2018-01-25  9:36                             ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-23 14:30 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Ananyev, Konstantin, Thomas Monjalon, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

Hi

From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com]
<snip>
> > > > > > Look,
> > > > > > > Testpmd initiates some of its internal databases depends on
> > > > > > > specific port iteration, In some time someone may take
> > > > > > > ownership of Testpmd ports and testpmd will continue to touch
> them.
> > > > >
> > > > > But if someone will take the ownership (assign new owner_id)
> > > > > that port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > > >
> > > >
> > > > Yes, but testpmd sometimes depends on previous iteration using
> internal database.
> > > > So it uses internal database that was updated by old iteration.
> > >
> > > That sounds like just a bug in testpmd that need to be fixed, no?
> >
> > If Testpmd already took ownership for these ports(like I did), it is ok.
> >
> 
> Have you tested using the default iterator (NO_OWNER)?
> It worked until now with the bare minimal device tagging using
> DEV_DEFERRED. Testpmd did not seem to mind having to skip this port.
> 
> I'm sure there were places where this was overlooked, but overall, I'd think
> everything should be fixable using only the NO_OWNER iteration.

I don't think so.

> Can you point to a specific scenario (command line, chain of event) that
> would lead to a problem?
>

I didn't construct a race test to catch testpmd issues, but I think that without this patch there are a lot of issues.
Go to the testpmd code (before ownership) and find the usages of the old iterator (after the first iteration in main).
Ask yourself what would happen if, exactly at that time, a new port were created by fail-safe (plug-in event).
 
> > > Any particular places where outdated device info is used?
> >
> > For example, look for the stream management in testpmd(I think I saw it
> there).
> >
> 
> The stream management is certainly shaky, but it happens after the EAL
> initial port creation, and is not able to update itself for new hotplugged ports
> (unless something changed).
> 

Yes, but conceptually someone in the future may take the port (because it is ownerless).

> > > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > > seamless hotplug with existing applications.
> > > > > >
> > > > > > Port ownership is a genericization of some functions
> > > > > > introduced by the fail-safe, that could structure DPDK
> > > > > > further. It should allow applications to have a seamless
> > > > > > integration with subsystems using port ownership. Without this,
> port ownership cannot be used.
> > > > > >
> > > > > > Testpmd should be fixed, but follow the most common design
> > > > > > patterns of DPDK applications. Going with port ownership seems
> > > > > > like a paradigm shift.
> > > > > >
> > > > > > > In addition
> > > > > > > Using the old iterator in some places in testpmd will cause
> > > > > > > a race for run-
> > > > > time new ports(can be created by failsafe or any hotplug code):
> > > > > > > - testpmd finds an ownerless port(just now created) by the
> > > > > > > old iterator and start traffic there,
> 
> How does testpmd start traffic there? Testpmd has only a callback for
> displaying that it received an event for a new port. It has no concept of
> hotplugging beyond that.
> 

Yes, so no traffic, just some control command.

> Testpmd will not start using any new port probed using the hotplug API on its
> own, again, unless something has drastically changed.
> 

Every use of the iterator in testpmd is exposed to a race.

> > > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > > Problem!
> > > > >
> > > > > Could you shed a bit more light here - it would be race
> > > > > condition between whom and whom?
> > > >
> > > > Sure.
> > > >
> > > > > As I remember in testpmd all control ops are done within one
> > > > > thread (main lcore).
> > > >
> > > > But other dpdk entity can use another thread, for example:
> > > > Failsafe uses the host thread(using alarm callback) to create a
> > > > new port and
> > > to take ownership of a port.
> > >
> > > Hm, and you create new ports inside failsafe PMD, right and then set
> > > new owner_id for it?
> >
> > Yes.
> >
> > > And all this in alarm in interrupt thread?
> >
> > Yes.
> >
> > > If so I wonder how you can guarantee that no-one else will set
> > > different owner_id between
> > > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> >
> > I check it (see failsafe patch to this series - V5).
> > Function: fs_bus_init.
> >
> > > Could you point me to that place (I am not really familiar with
> > > familiar with failsafe code)?
> > >
> > > >
> > > > The race:
> > > > Testpmd iterates over all ports by the master thread.
> > > > Failsafe takes ownership of a port by the host thread and start using it.
> > > > => The two dpdk entities may use the device at same time!
> > >
> 
> When can this happen? Fail-safe creates its initial pool of ports during EAL
> init, before testpmd scans eth_dev ports and configure its streams.
> At that point, it has taken ownership, from the master lcore context.
> 
> After this point, new ports could be detected and hotplugged by fail-safe.
> However, even if testpmd had a callback to capture those new ports and
> reconfigure its streams, it would be executed from within the intr-thread,
> same as failsafe. If the thread was interrupted, by a dataplane-lcore for
> example, streams would not have been reconfigured.
> The fail-safe would execute its callback and set the owner-id before the
> callback chains goes to the application.
>

Some iterator may be invoked by another thread in testpmd during the plug-out process and cause a control command to be issued.
 
> And that would only be if testpmd had any callback for hotplugging ports and
> reconfiguring its streams, which it hasn't, as far as I know.
>

We don't need to implement it in testpmd.
 
> > > Ok, if failsafe really assigns its owner_id(s) to ports that are
> > > already in use by the app, then how such scheme supposed to work at
> all?
> >
> > If the app works well (with the new rules) it already took ownership and
> failsafe will see it and will wait until the application release it.
> > Every dpdk entity should know which port it wants to manage, If 2
> > entities want to manage the same device -  it can be ok and port ownership
> can synchronize the usage.
> >
> > Probably, application which will run fail-safe wants to manage only the fail-
> safe port and therefor to take ownership only for it.
> >
> > > I.E. application has a port - it assigns some owner_id != 0 to it,
> > > then PMD tries to set its owner_id tot the same port.
> > > Obviously failsafe's set_owner() will always fail in such case.
> > >
> > Yes, and will try again after some time.
> >
> > > From what I hear we need to introduce a concept of 'default owner id'.
> > > I.E. when failsafe PMD is created - user assigns some owner_id to it
> (default).
> > > Then failsafe PMD generates it's own owner_id and assigns it only to
> > > the ports whose current owner_id is equal either 0 or 'default' owner_id.
> > >
> >
> > It is a suggestion and we need to think about it more (I'm talking about it
> with Gaetan in another thread).
> > Actually I think, if we want a generic solution to the generic problem the
> current solution is ok.
> >
> 
> We could as well conclude this other thread there.
> 
> The only solution would be to have a default relationship between owners,
> something that goes beyond the scope assigned by Thomas to your
> evolution, but would be necessary for this API to be properly used by
> existing applications.
> 
> I think it's the only way to have a sane default behavior with your API, but I
> also think this goes beyong the scope of the DPDK altogether.
> 
> But even with those considerations that could be ironed out later (API is still
> experimental anyway), in the meantime, I think we should strive not to
> break "userland" as much as possible. Meaning that unless you have a
> specific situation creating a bug, you shouldn't have to modify testpmd, and if
> an issues arises, you need to try to improve your API before resorting to
> changing the resource management model of all existing applications.
> 

I understand it.
Suggestion:

Two system owners:
APP_OWNER - 0.
NO_OWNER - 1.

Additional owners can still be allocated, as now.

1. Every port creation sets the owner to NO_OWNER (as now).
2. Any DPDK entity can take ownership of a NO_OWNER port at any time (as now).
3. At some point near the end of EAL init, all NO_OWNER ports are set to APP_OWNER (for V6).
4. The old iterator is changed to iterate over APP_OWNER ports (for V6).
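
To make the four steps concrete, here is a minimal sketch of how they could fit together. It is only an illustration under assumptions: the port table and the helpers port_create(), port_take(), promote_unowned_to_app() and next_app_port() are hypothetical names, not the ethdev API, and no locking is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of this is the ethdev API. */
#define APP_OWNER 0u    /* ports left to the application by default */
#define NO_OWNER  1u    /* freshly created ports that nobody has claimed */
#define MAX_PORTS 8

struct port {
    int used;           /* non-zero when the slot holds a valid port */
    uint64_t owner_id;
};

static struct port ports[MAX_PORTS];

/* Step 1: every new port starts as NO_OWNER. Returns the port id or -1. */
static int port_create(void)
{
    for (int i = 0; i < MAX_PORTS; i++) {
        if (!ports[i].used) {
            ports[i].used = 1;
            ports[i].owner_id = NO_OWNER;
            return i;
        }
    }
    return -1;
}

/* Step 2: any entity may claim a port, but only while it is NO_OWNER. */
static int port_take(int pid, uint64_t owner_id)
{
    if (pid < 0 || pid >= MAX_PORTS || !ports[pid].used ||
        ports[pid].owner_id != NO_OWNER)
        return -1;
    ports[pid].owner_id = owner_id;
    return 0;
}

/* Step 3: near the end of init, unclaimed ports are handed to the app. */
static void promote_unowned_to_app(void)
{
    for (int i = 0; i < MAX_PORTS; i++)
        if (ports[i].used && ports[i].owner_id == NO_OWNER)
            ports[i].owner_id = APP_OWNER;
}

/*
 * Step 4: the legacy iterator visits APP_OWNER ports only.
 * Returns the first such port at or after pid, or MAX_PORTS if none.
 */
static int next_app_port(int pid)
{
    assert(pid >= 0);
    while (pid < MAX_PORTS &&
           !(ports[pid].used && ports[pid].owner_id == APP_OWNER))
        pid++;
    return pid;
}
```

With this shape, an application that never calls port_take() would still see, through next_app_port(), every port that no other entity claimed before the promotion step.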

What do you think?

<snip>

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 13:34                         ` Ananyev, Konstantin
  2018-01-23 14:18                           ` Thomas Monjalon
@ 2018-01-23 14:43                           ` Matan Azrad
  1 sibling, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-01-23 14:43 UTC (permalink / raw)
  To: Ananyev, Konstantin, Gaëtan Rivet
  Cc: Thomas Monjalon, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Hi Konstantin

Please move to the second thread; I have the feeling that you and Gaetan have the same questions.

From: Ananyev, Konstantin, Tuesday, January 23, 2018 3:35 PM
> Hi Matan,
> 
> >
> >
> > Hi Konstantin
> > From: Ananyev, Konstantin, Monday, January 22, 2018 10:49 PM
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > Sent: Monday, January 22, 2018 1:23 PM
> > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Gaëtan
> > > > Rivet <gaetan.rivet@6wind.com>
> > > > Cc: Thomas Monjalon <thomas@monjalon.net>; Wu, Jingjing
> > > > <jingjing.wu@intel.com>; dev@dpdk.org; Neil Horman
> > > > <nhorman@tuxdriver.com>; Richardson, Bruce
> > > > <bruce.richardson@intel.com>
> > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev port
> > > > ownership
> > > >
> > > >
> > > > Hi
> > > > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> > > > > Hi lads,
> > > > >
> > > > > >
> > > > > > Hi Matan,
> > > > > >
> > > > > > On Fri, Jan 19, 2018 at 01:35:10PM +0000, Matan Azrad wrote:
> > > > > > > Hi Konstantin
> > > > > > >
> > > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 3:09 PM
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > > Sent: Friday, January 19, 2018 12:52 PM
> > > > > > > > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > > > > > > > Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > > > > > > <gaetan.rivet@6wind.com>;
> > > > > > > > > Wu, Jingjing <jingjing.wu@intel.com>
> > > > > > > > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>;
> > > > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev
> > > > > > > > > port ownership
> > > > > > > > >
> > > > > > > > > Hi Konstantin
> > > > > > > > >
> > > > > > > > > From: Ananyev, Konstantin, Friday, January 19, 2018 2:38
> > > > > > > > > PM
> > > > > > > > > > To: Matan Azrad <matan@mellanox.com>; Thomas
> Monjalon
> > > > > > > > > > <thomas@monjalon.net>; Gaetan Rivet
> > > > > <gaetan.rivet@6wind.com>;
> > > > > > > > Wu,
> > > > > > > > > > Jingjing <jingjing.wu@intel.com>
> > > > > > > > > > Cc: dev@dpdk.org; Neil Horman
> <nhorman@tuxdriver.com>;
> > > > > > > > > > Richardson, Bruce <bruce.richardson@intel.com>
> > > > > > > > > > Subject: RE: [PATCH v3 7/7] app/testpmd: adjust ethdev
> > > > > > > > > > port ownership
> > > > > > > > > >
> > > > > > > > > > Hi Matan,
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Matan Azrad [mailto:matan@mellanox.com]
> > > > > > > > > > > Sent: Thursday, January 18, 2018 4:35 PM
> > > > > > > > > > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan
> > > > > > > > > > > Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing
> > > > > > > > > > > <jingjing.wu@intel.com>
> > > > > > > > > > > Cc: dev@dpdk.org; Neil Horman
> > > > > > > > > > > <nhorman@tuxdriver.com>;
> > > > > > > > Richardson,
> > > > > > > > > > > Bruce <bruce.richardson@intel.com>; Ananyev,
> > > > > > > > > > > Konstantin <konstantin.ananyev@intel.com>
> > > > > > > > > > > Subject: [PATCH v3 7/7] app/testpmd: adjust ethdev
> > > > > > > > > > > port ownership
> > > > > > > > > > >
> > > > > > > > > > > Testpmd should not use ethdev ports which are
> > > > > > > > > > > managed by other DPDK entities.
> > > > > > > > > > >
> > > > > > > > > > > Set Testpmd ownership to each port which is not used
> > > > > > > > > > > by other entity and prevent any usage of ethdev
> > > > > > > > > > > ports which are not owned by
> > > > > > > > Testpmd.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--
> ----
> > > ----
> > > > > -------
> > > > > > > > ----
> > > > > > > > > > -----
> > > > > > > > > > >  app/test-pmd/cmdline_flow.c |  2 +-
> > > > > > > > > > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > > > > > > > > > >  app/test-pmd/parameters.c   |  4 +-
> > > > > > > > > > >  app/test-pmd/testpmd.c      | 63
> ++++++++++++++++++++----
> > > ----
> > > > > ----
> > > > > > > > > > >  app/test-pmd/testpmd.h      |  3 ++
> > > > > > > > > > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/app/test-pmd/cmdline.c
> > > > > > > > > > > b/app/test-pmd/cmdline.c index
> > > > > > > > > > > 31919ba..6199c64 100644
> > > > > > > > > > > --- a/app/test-pmd/cmdline.c
> > > > > > > > > > > +++ b/app/test-pmd/cmdline.c
> > > > > > > > > > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > > > > > > > > > >  			&link_speed) < 0)
> > > > > > > > > > >  		return;
> > > > > > > > > > >
> > > > > > > > > > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > > > > > > > > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid,
> > > my_owner.id) {
> > > > > > > > > >
> > > > > > > > > > Why do we need all these changes?
> > > > > > > > > > As I understand you changed definition of
> > > > > > > > > > RTE_ETH_FOREACH_DEV(), so no testpmd should work ok
> > > > > > > > > > default
> > > > > (no_owner case).
> > > > > > > > > > Am I missing something here?
> > > > > > > > >
> > > > > > > > > Now, After Gaetan suggestion RTE_ETH_FOREACH_DEV(pid)
> > > > > > > > > will iterate
> > > > > > > > over all valid and ownerless ports.
> > > > > > > >
> > > > > > > > Yes.
> > > > > > > >
> > > > > > > > > Here Testpmd wants to iterate over its owned ports.
> > > > > > > >
> > > > > > > > Why? Why it can't just iterate over all valid and ownerless ports?
> > > > > > > > As I understand it would be enough to fix current problems
> > > > > > > > and would allow us to avoid any changes in testmpd (which
> > > > > > > > I think is a good
> > > > > thing).
> > > > > > >
> > > > > > > Yes, I understand that this big change is very daunted, But
> > > > > > > I think the current a lot of bugs in testpmd(regarding port
> > > > > > > ownership) even more
> > > > > > daunted.
> > > > > > >
> > > > > > > Look,
> > > > > > > Testpmd initiates some of its internal databases depends on
> > > > > > > specific port iteration, In some time someone may take
> > > > > > > ownership of Testpmd
> > > > > ports and testpmd will continue to touch them.
> > > > >
> > > > > But if someone will take the ownership (assign new owner_id)
> > > > > that port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > > >
> > > >
> > > > Yes, but testpmd sometimes depends on previous iteration using
> > > > internal
> > > database.
> > > > So it uses internal database that was updated by old iteration.
> > >
> > > That sounds like just a bug in testpmd that need to be fixed, no?
> >
> > If Testpmd already took ownership for these ports(like I did), it is ok.
> >
> 
> Hmm, why not just to fix testpmd, if there is a bug?
> As I said all control ops here are done by one thread, so it should be pretty
> easy.
> Or are you talking about race conditions?
> 
> > > Any particular places where outdated device info is used?
> >
> > For example, look for the stream management in testpmd(I think I saw it
> there).
> 
> Anything particular?
> 
> >
> > > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > > seamless hotplug with existing applications.
> > > > > >
> > > > > > Port ownership is a genericization of some functions
> > > > > > introduced by the fail-safe, that could structure DPDK
> > > > > > further. It should allow applications to have a seamless
> > > > > > integration with subsystems using port ownership. Without this,
> port ownership cannot be used.
> > > > > >
> > > > > > Testpmd should be fixed, but follow the most common design
> > > > > > patterns of DPDK applications. Going with port ownership seems
> > > > > > like a paradigm shift.
> > > > > >
> > > > > > > In addition
> > > > > > > Using the old iterator in some places in testpmd will cause
> > > > > > > a race for run-
> > > > > time new ports(can be created by failsafe or any hotplug code):
> > > > > > > - testpmd finds an ownerless port(just now created) by the
> > > > > > > old iterator and start traffic there,
> > > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > > Problem!
> > > > >
> > > > > Could you shed a bit more light here - it would be race
> > > > > condition between whom and whom?
> > > >
> > > > Sure.
> > > >
> > > > > As I remember in testpmd all control ops are done within one
> > > > > thread (main lcore).
> > > >
> > > > But other dpdk entity can use another thread, for example:
> > > > Failsafe uses the host thread(using alarm callback) to create a
> > > > new port and
> > > to take ownership of a port.
> > >
> > > Hm, and you create new ports inside failsafe PMD, right and then set
> > > new owner_id for it?
> >
> > Yes.
> >
> > > And all this in alarm in interrupt thread?
> >
> > Yes.
> >
> > > If so I wonder how you can guarantee that no-one else will set
> > > different owner_id between
> > > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> >
> > I check it (see failsafe patch to this series - V5).
> > Function: fs_bus_init.
> 
> You are talking about that peace of code:
> +		ret = rte_eth_dev_owner_set(pid, &PRIV(dev)-
> >my_owner);
> +		if (ret) {
> +			INFO("sub_device %d owner set failed (%s),"
> +			     " will try again later", i, strerror(ret));
> +			continue;
> 
> right?
> So you just wouldn't include that device into your failsafe device.
> But that probably not what user wanted, especially if he bothered to create a
> special new low-level device for you.
> If that' s the use case, then I think you need to set device ownership at
> creation time - inside dev_allocate().
> Again that would avoid such racing conditions inside testpmd.
> 
> >
> > > Could you point me to that place (I am not really familiar with
> > > familiar with failsafe code)?
> > >
> > > >
> > > > The race:
> > > > Testpmd iterates over all ports by the master thread.
> > > > Failsafe takes ownership of a port by the host thread and start using it.
> > > > => The two dpdk entities may use the device at same time!
> > >
> > > Ok, if failsafe really assigns its owner_id(s) to ports that are
> > > already in use by the app, then how such scheme supposed to work at
> all?
> >
> > If the app works well (with the new rules) it already took ownership and
> failsafe will see it and will wait until the application release it.
> 
> Ok, and why application would need to release it?
> How it would know that failsafe device wants to use it now?
> Again where is a guarantee that after app released it some other entity
> wouldn't grab it for itself?
> 
> > Every dpdk entity should know which port it wants to manage, If 2
> > entities want to manage the same device -  it can be ok and port ownership
> can synchronize the usage.
> >
> > Probably, application which will run fail-safe wants to manage only the fail-
> safe port and therefor to take ownership only for it.
> >
> > > I.E. application has a port - it assigns some owner_id != 0 to it,
> > > then PMD tries to set its owner_id tot the same port.
> > > Obviously failsafe's set_owner() will always fail in such case.
> > >
> > Yes, and will try again after some time.
> 
> Same question again - how app will know that it has to release the port
> ownership?
> 
> >
> > > From what I hear we need to introduce a concept of 'default owner id'.
> > > I.E. when failsafe PMD is created - user assigns some owner_id to it
> (default).
> > > Then failsafe PMD generates it's own owner_id and assigns it only to
> > > the ports whose current owner_id is equal either 0 or 'default' owner_id.
> > >
> >
> > It is a suggestion and we need to think about it more (I'm talking about it
> with Gaetan in another thread).
> > Actually I think, if we want a generic solution to the generic problem the
> current solution is ok.
> 
> From what I heard - every app that wants to use failsafe PMD would require
> quite a lot of changes.
> It doesn't look ok to me.
> 
> >
> > > >
> > > > Obeying the new ownership rules can prevent all these races.
> > > >
> > >
> > > When we discussed RFC of owner_id patch, I thought we all agreed
> > > that owner_id  API shouldn't be mandatory - i.e. existing apps not
> > > required to change to work normally with that.
> >
> > Yes, it is not mandatory if app doesn't use hotplug.
> >
> > I think with hotplug it is mandatory in the most cases.
> 
> Yes in failsafe you always install this alarm handler, so even if the app would
> have its own  way to handle hotplug  devices - it would suddenly need to use
> this new owner API - even if it doesn't need to.
> Why it has to be?
> 
> >
> > And it can ease the secondary process model too.
> >
> > Again, in the generic ownership problem as discussed in RFC:
> > Every entity, include app, should know which ports it wants to manage and
> to take ownership only for them.
> >
> > > Though right now it seems that application changes seems necessary,
> > > at least to work ok with failsafe PMD.
> >
> > And for solving the generic problem of ownership.(will defend from future
> issues by sure).
> >
> > > Which makes we wonder was it some sort of misunderstanding or we did
> > > we do something wrong here?
> >
> > Mistakes can be done all the time, but I think we are all understand the big
> issue of ownership and how the current solution solves it.
> > fail-safe it is just a current example for the problems which are possible
> because of the generic ownership issue.
> 
> Honestly that seems too much changes for the app just to make failsafe PMD
> work correctly.
> IMO - It should be some way to support it without causing changes in each
> DPDK application  - otherwise something is wrong with the PMD itself.
> If let say that ownership model is required to make failsafe PMD to operate -
> it should be done in a transparent way to the user.
> Probably something like Gaetan suggested in another mail or so.
> Konstantin
> 
> >
> > Thanks,
> > Matan
> > > Konstantin
> > >
> > > > > The only way to attach/detach port with it - invoke testpmd CLI
> > > > > "attach/detach" port.
> > > > >
> > > > > Konstantin
> > > > >
> > > > > >
> > > > > > Testpmd does not handle detection of new port. If it did,
> > > > > > testing fail-safe with it would be wrong.
> > > > > >
> > > > > > At startup, RTE_ETH_FOREACH_DEV already fixed the issue of
> > > > > > registering DEFERRED ports. There are still remaining issues
> > > > > > regarding this, but I think they should be fixed. The
> > > > > > architecture does not need to be completely moved to port
> ownership.
> > > > > >
> > > > > > If anything, this should serve as a test for your API with
> > > > > > common applications. I think you'd prefer to know and debug
> > > > > > with testpmd instead of firing up VPP or something like that
> > > > > > to determine what went wrong with using the fail-safe.
> > > > > >
> > > > > > >
> > > > > > > In addition
> > > > > > > As a good example for well-done application (free from
> > > > > > > ownership
> > > > > > > bugs) I tried here to adjust Tespmd to the new rules and BTW
> > > > > > > to fix a
> > > > > > lot of bugs.
> > > > > >
> > > > > > Testpmd has too much cruft, it won't ever be a good example of
> > > > > > a well-done application.
> > > > > >
> > > > > > If you want to demonstrate ownership, I think you should start
> > > > > > an example application demonstrating race conditions and their
> mitigation.
> > > > > >
> > > > > > I think that would be interesting for many DPDK users.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > So actually applications which are not aware to the port
> > > > > > > ownership still are exposed to races, but if there use the
> > > > > > > old iterator(with the new
> > > > > > change) the amount of races decreases.
> > > > > > >
> > > > > > > Thanks, Matan.
> > > > > > > > Konstantin
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I added to Testpmd ability to take an ownership of ports
> > > > > > > > > as the new ownership and synchronization rules
> > > > > > > > > suggested, Since Tespmd is a DPDK entity which wants
> > > > > > > > > that no one will touch its owned ports, It must allocate
> > > > > > > > an unique ID, set owner for its ports (see in main
> > > > > > > > function) and recognizes them by its owner ID.
> > > > > > > > >
> > > > > > > > > > Konstantin
> > > > > > > > > >
> > > > > >
> > > > > > Regards,
> > > > > > --
> > > > > > Gaëtan Rivet
> > > > > > 6WIND

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 14:18                           ` Thomas Monjalon
@ 2018-01-23 15:12                             ` Ananyev, Konstantin
  2018-01-23 15:18                               ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-23 15:12 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, January 23, 2018 2:19 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 23/01/2018 14:34, Ananyev, Konstantin:
> > If that' s the use case, then I think you need to set device ownership at creation time -
> > inside dev_allocate().
> > Again that would avoid such racing conditions inside testpmd.
> 
> The devices must be allocated at a low level layer.

No one is arguing about that.
But we can provide owner id information to the low level.

> When a new device appears (hotplug), an ethdev port should be allocated
> automatically if it passes the whitelist/blacklist policy test.
> Then we must decide who will manage this device.
> I suggest notifying the DPDK libs first.
> So a DPDK lib or PMD like failsafe can have the priority to take the
> ownership in its notification callback.

Possible, but it seems a bit overcomplicated.
Why not just:

Have a global variable process_default_owner_id that is initialized once at startup.
Have a TLS variable default_owner_id.
rte_eth_dev_allocate() will use it, so the caller can set dev->owner_id at creation time,
and port allocation and setting ownership become one atomic operation.
On exit, rte_eth_dev_allocate() will always reset default_owner_id to 0:

rte_eth_dev_allocate(...)
{
    lock(owner_lock);
    <allocate_port>
    owner = RTE_PER_LCORE(default_owner_id);
    if (owner == 0)
        owner = process_default_owner_id;
    set_owner(port, ..., owner);
    unlock(owner_lock);
    RTE_PER_LCORE(default_owner_id) = 0;
}

So callers who don't need any special ownership don't need to do anything.
Special callers (like failsafe) can set default_owner_id just before calling the hotplug
handling routine.
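
For illustration only, the pseudocode above could be fleshed out roughly as follows. This is a sketch under assumptions: it uses plain C11 _Thread_local and a pthread mutex in place of RTE_PER_LCORE and the ethdev owner lock, and port_allocate(), port_owner_get() and the port arrays are hypothetical names rather than the real API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; none of this is the ethdev API. */
#define MAX_PORTS 8

static uint64_t process_default_owner_id;       /* set once at startup */
static _Thread_local uint64_t default_owner_id; /* per-thread override, 0 = unset */
static pthread_mutex_t owner_lock = PTHREAD_MUTEX_INITIALIZER;

static int port_used[MAX_PORTS];
static uint64_t port_owner[MAX_PORTS];

/*
 * Allocate a port and assign its owner while holding owner_lock, so no
 * other thread can observe the port without an owner. The per-thread
 * override is cleared on exit. Returns the port id, or -1 if full.
 */
static int port_allocate(void)
{
    int pid = -1;

    pthread_mutex_lock(&owner_lock);
    for (int i = 0; i < MAX_PORTS; i++) {
        if (!port_used[i]) {
            uint64_t owner = default_owner_id;

            if (owner == 0)
                owner = process_default_owner_id;
            port_used[i] = 1;
            port_owner[i] = owner;
            pid = i;
            break;
        }
    }
    pthread_mutex_unlock(&owner_lock);
    default_owner_id = 0;
    return pid;
}

/* Read back the owner of an allocated port. */
static uint64_t port_owner_get(int pid)
{
    assert(pid >= 0 && pid < MAX_PORTS && port_used[pid]);
    return port_owner[pid];
}
```

A special caller would assign default_owner_id on its own thread right before calling port_allocate(); every other caller gets process_default_owner_id without doing anything.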

Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 15:12                             ` Ananyev, Konstantin
@ 2018-01-23 15:18                               ` Ananyev, Konstantin
  2018-01-23 17:33                                 ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-23 15:18 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Tuesday, January 23, 2018 3:12 PM
> To: Thomas Monjalon <thomas@monjalon.net>
> Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 
> 
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > Sent: Tuesday, January 23, 2018 2:19 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> > dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> >
> > 23/01/2018 14:34, Ananyev, Konstantin:
> > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > inside dev_allocate().
> > > Again that would avoid such racing conditions inside testpmd.
> >
> > The devices must be allocated at a low level layer.
> 
> No one arguing about that.
> But we can provide owner id information to the low level.
> 
> > When a new device appears (hotplug), an ethdev port should be allocated
> > automatically if it passes the whitelist/blacklist policy test.
> > Then we must decide who will manage this device.
> > I suggest notifying the DPDK libs first.
> > So a DPDK lib or PMD like failsafe can have the priority to take the
> > ownership in its notification callback.
> 
> Possible, but seems a bit overcomplicated.
> Why not just:
> 
> Have a global variable process_default_owner_id, that would be init once at startup.
> Have an LTS variable default_owner_id.
> It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> so port allocation and setting ownership - will be an atomic operation.
> At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> 
> rte_eth_dev_allocate(...)
> {
>    lock(owner_lock);
>    <allocate_port>
>    owner = RTE_PER_LCORE(default_owner_id);
>    if (owner == 0)
>        owner = process_default_owner_id;
>   set_owner(port, ..., owner);
>  unlock(owner_lock);
>  RTE_PER_LCORE(default_owner_id) = 0;

Or it is probably better to leave the default_owner_id reset to the caller.
Another thing - we can use the same TLS variable in all control ops to
allow/disallow changing the port configuration based on ownership.
Konstantin

> }
> 
> So callers who don't need any special ownership - don't need to do anything.
> Special callers (like failsafe) can set default_owenr_id just before calling hotplug
> handling routine.
> 
> Konstantin
> 
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 15:18                               ` Ananyev, Konstantin
@ 2018-01-23 17:33                                 ` Thomas Monjalon
  2018-01-23 21:18                                   ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-23 17:33 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

23/01/2018 16:18, Ananyev, Konstantin:
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > inside dev_allocate().
> > > > Again that would avoid such racing conditions inside testpmd.
> > >
> > > The devices must be allocated at a low level layer.
> > 
> > No one arguing about that.
> > But we can provide owner id information to the low level.

Sorry, you did not get it.
We cannot provide owner id at the low level
because it is not yet decided who will be the owner
before the port is allocated.

> > > When a new device appears (hotplug), an ethdev port should be allocated
> > > automatically if it passes the whitelist/blacklist policy test.
> > > Then we must decide who will manage this device.
> > > I suggest notifying the DPDK libs first.
> > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > ownership in its notification callback.
> > 
> > Possible, but seems a bit overcomplicated.
> > Why not just:
> > 
> > Have a global variable process_default_owner_id, that would be init once at startup.
> > Have an LTS variable default_owner_id.
> > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> > so port allocation and setting ownership - will be an atomic operation.
> > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > 
> > rte_eth_dev_allocate(...)
> > {
> >    lock(owner_lock);
> >    <allocate_port>
> >    owner = RTE_PER_LCORE(default_owner_id);
> >    if (owner == 0)
> >        owner = process_default_owner_id;
> >   set_owner(port, ..., owner);
> >  unlock(owner_lock);
> >  RTE_PER_LCORE(default_owner_id) = 0;
> 
> Or probably better to leave default_owner_id reset to the caller.
> Another thing - we can use same LTS variable in all control ops to
> allow/disallow changing of port configuration based on ownership.
> Konstantin
> 
> > }
> > 
> > So callers who don't need any special ownership - don't need to do anything.
> > Special callers (like failsafe) can set default_owenr_id just before calling hotplug
> > handling routine.

No, hotplug will not be a routine.
I am talking about real hotplug, like a device which appears in the VM.
This new device must be handled by EAL and probed automatically if it
complies with the whitelist/blacklist policy given by the application or user.
Real hotplug is asynchronous.
We will just receive notifications that it appeared.

Note: there is some temporary code in failsafe to manage some hotplug.
This code must be removed once hotplug is properly handled in EAL.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 17:33                                 ` Thomas Monjalon
@ 2018-01-23 21:18                                   ` Ananyev, Konstantin
  2018-01-24  8:10                                     ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-23 21:18 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

> 
> 23/01/2018 16:18, Ananyev, Konstantin:
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > inside dev_allocate().
> > > > > Again that would avoid such racing conditions inside testpmd.
> > > >
> > > > The devices must be allocated at a low level layer.
> > >
> > > No one arguing about that.
> > > But we can provide owner id information to the low level.
> 
> Sorry, you did not get it.

Might be.

> We cannot provide owner id at the low level
> because it is not yet decided who will be the owner
> before the port is allocated.

Why is that?
What prevents us from deciding who will manage that device *before* allocating a port for it?
IMO we do have all the needed information at that stage.

> 
> > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > automatically if it passes the whitelist/blacklist policy test.
> > > > Then we must decide who will manage this device.
> > > > I suggest notifying the DPDK libs first.
> > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > ownership in its notification callback.
> > >
> > > Possible, but seems a bit overcomplicated.
> > > Why not just:
> > >
> > > Have a global variable process_default_owner_id, that would be init once at startup.
> > > Have an LTS variable default_owner_id.
> > > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> > > so port allocation and setting ownership - will be an atomic operation.
> > > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > >
> > > rte_eth_dev_allocate(...)
> > > {
> > >    lock(owner_lock);
> > >    <allocate_port>
> > >    owner = RTE_PER_LCORE(default_owner_id);
> > >    if (owner == 0)
> > >        owner = process_default_owner_id;
> > >   set_owner(port, ..., owner);
> > >  unlock(owner_lock);
> > >  RTE_PER_LCORE(default_owner_id) = 0;
> >
> > Or probably better to leave default_owner_id reset to the caller.
> > Another thing - we can use same LTS variable in all control ops to
> > allow/disallow changing of port configuration based on ownership.
> > Konstantin
> >
> > > }
> > >
> > > So callers who don't need any special ownership - don't need to do anything.
> > > Special callers (like failsafe) can set default_owenr_id just before calling hotplug
> > > handling routine.
> 
> No, hotplug will not be a routine.
> I am talking about real hotplug, like a device which appears in the VM.
> This new device must be handled by EAL and probed automatically if
> comply with whitelist/blacklist policy given by the application or user.
> Real hotplug is asynchronous.

By 'asynchronous' here, do you mean it would be handled in the EAL interrupt thread,
or something different?
Anyway, I suppose you do need a function inside DPDK that will do the actual work in response
to a hotplug event, right?
That's what I refer to as the 'hotplug routine' above.

> We will just receive notifications that it appeared.
> 
> Note: there is some temporary code in failsafe to manage some hotplug.
> This code must be removed when it will be properly handled in EAL.

Ok, if it is just temporary code that will be removed soon,
then it definitely seems wrong to modify testpmd (or any other user app)
to comply with that temporary solution.

Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 21:18                                   ` Ananyev, Konstantin
@ 2018-01-24  8:10                                     ` Thomas Monjalon
  2018-01-24 18:30                                       ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-24  8:10 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

23/01/2018 22:18, Ananyev, Konstantin:
> > 
> > 23/01/2018 16:18, Ananyev, Konstantin:
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > > inside dev_allocate().
> > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > >
> > > > > The devices must be allocated at a low level layer.
> > > >
> > > > No one arguing about that.
> > > > But we can provide owner id information to the low level.
> > 
> > Sorry, you did not get it.
> 
> Might be.
> 
> > We cannot provide owner id at the low level
> > because it is not yet decided who will be the owner
> > before the port is allocated.
> 
> Why is that?
> What prevents us decide who will manage that device *before* allocating port of it?
> IMO we do have all needed information at that stage.

We don't have the information.
It is a new device, it is matched by a driver which allocates a port.
I don't see where the higher level can interact here.
And even if you manage a trick, the higher level needs to read the port
information to decide the ownership.

> > > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > Then we must decide who will manage this device.
> > > > > I suggest notifying the DPDK libs first.
> > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > ownership in its notification callback.
> > > >
> > > > Possible, but seems a bit overcomplicated.
> > > > Why not just:
> > > >
> > > > Have a global variable process_default_owner_id, that would be init once at startup.
> > > > Have a TLS variable default_owner_id.
> > > > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> > > > so port allocation and setting ownership - will be an atomic operation.
> > > > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > >
> > > > rte_eth_dev_allocate(...)
> > > > {
> > > >    lock(owner_lock);
> > > >    <allocate_port>
> > > >    owner = RTE_PER_LCORE(default_owner_id);
> > > >    if (owner == 0)
> > > >        owner = process_default_owner_id;
> > > >   set_owner(port, ..., owner);
> > > >  unlock(owner_lock);
> > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > >
> > > Or probably better to leave default_owner_id reset to the caller.
> > > Another thing - we can use same LTS variable in all control ops to
> > > allow/disallow changing of port configuration based on ownership.
> > > Konstantin
> > >
> > > > }
> > > >
> > > > So callers who don't need any special ownership - don't need to do anything.
> > > > Special callers (like failsafe) can set default_owner_id just before calling hotplug
> > > > handling routine.
> > 
> > No, hotplug will not be a routine.
> > I am talking about real hotplug, like a device which appears in the VM.
> > This new device must be handled by EAL and probed automatically if
> > comply with whitelist/blacklist policy given by the application or user.
> > Real hotplug is asynchronous.
> 
> By 'asynchronous' here you mean it would be handled in the EAL interrupt thread
> or something different?

Yes, we receive a hotplug event which is processed in the event thread.

> Anyway, I suppose  you do need a function inside DPDK that will do the actual work in response
> on hotplug event, right?

Yes

> That's what I refer to as 'hotplug routine' above.
> 
> > We will just receive notifications that it appeared.
> > 
> > Note: there is some temporary code in failsafe to manage some hotplug.
> > This code must be removed when it will be properly handled in EAL.
> 
> Ok, if it is just a temporary code, that would be removed soon -
> then it definitely seems wrong to modify testpmd (or any other user app)
> to comply with that temporary solution.

It will be modified when EAL hotplug is implemented.

However, the ownership issue will be the same:
we don't know the owner when allocating a port.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-24  8:10                                     ` Thomas Monjalon
@ 2018-01-24 18:30                                       ` Ananyev, Konstantin
  2018-01-25 10:55                                         ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-24 18:30 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Wednesday, January 24, 2018 8:10 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 23/01/2018 22:18, Ananyev, Konstantin:
> > >
> > > 23/01/2018 16:18, Ananyev, Konstantin:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > > > inside dev_allocate().
> > > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > > >
> > > > > > The devices must be allocated at a low level layer.
> > > > >
> > > > > No one arguing about that.
> > > > > But we can provide owner id information to the low level.
> > >
> > > Sorry, you did not get it.
> >
> > Might be.
> >
> > > We cannot provide owner id at the low level
> > > because it is not yet decided who will be the owner
> > > before the port is allocated.
> >
> > Why is that?
> > What prevents us decide who will manage that device *before* allocating port of it?
> > IMO we do have all needed information at that stage.
> 
> We don't have the information.

At that point we do have dev name and all parameters, right?
Plus we do have blacklist/whitelist, etc.
So what else are we missing to make the decision at that point? 

> It is a new device, it is matched by a driver which allocates a port.
> I don't see where the higher level can interact here.
> And even if you manage a trick, the higher level needs to read the port
> informations to decide the ownership.

Could you specify what particular port information it needs?

> 
> > > > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > > Then we must decide who will manage this device.
> > > > > > I suggest notifying the DPDK libs first.
> > > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > > ownership in its notification callback.
> > > > >
> > > > > Possible, but seems a bit overcomplicated.
> > > > > Why not just:
> > > > >
> > > > > Have a global variable process_default_owner_id, that would be init once at startup.
> > > > > Have a TLS variable default_owner_id.
> > > > > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> > > > > so port allocation and setting ownership - will be an atomic operation.
> > > > > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > > >
> > > > > rte_eth_dev_allocate(...)
> > > > > {
> > > > >    lock(owner_lock);
> > > > >    <allocate_port>
> > > > >    owner = RTE_PER_LCORE(default_owner_id);
> > > > >    if (owner == 0)
> > > > >        owner = process_default_owner_id;
> > > > >   set_owner(port, ..., owner);
> > > > >  unlock(owner_lock);
> > > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > > >
> > > > Or probably better to leave default_owner_id reset to the caller.
> > > > Another thing - we can use same LTS variable in all control ops to
> > > > allow/disallow changing of port configuration based on ownership.
> > > > Konstantin
> > > >
> > > > > }
> > > > >
> > > > > So callers who don't need any special ownership - don't need to do anything.
> > > > > Special callers (like failsafe) can set default_owner_id just before calling hotplug
> > > > > handling routine.
> > >
> > > No, hotplug will not be a routine.
> > > I am talking about real hotplug, like a device which appears in the VM.
> > > This new device must be handled by EAL and probed automatically if
> > > comply with whitelist/blacklist policy given by the application or user.
> > > Real hotplug is asynchronous.
> >
> > By 'asynchronous' here you mean it would be handled in the EAL interrupt thread
> > or something different?
> 
> Yes, we receive an hotplug event which is processed in the event thread.
> 
> > Anyway, I suppose  you do need a function inside DPDK that will do the actual work in response
> > on hotplug event, right?
> 
> Yes

OK, by the way, why does that function always have to be called from the interrupt thread?
Why not allow the user to decide?
Some apps (like testpmd) have their own thread dedicated to control ops.
For them it might be plausible to call that function from their own control thread context.
Konstantin

> 
> > That's what I refer to as 'hotplug routine' above.
> >
> > > We will just receive notifications that it appeared.
> > >
> > > Note: there is some temporary code in failsafe to manage some hotplug.
> > > This code must be removed when it will be properly handled in EAL.
> >
> > Ok, if it is just a temporary code, that would be removed soon -
> > then it definitely seems wrong to modify testpmd (or any other user app)
> > to comply with that temporary solution.
> 
> It will be modified when EAL hotplug will be implemented.
> 
> However, the ownership issue will be the same:
> we don't know the owner when allocating a port.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-25  1:47           ` Lu, Wenzhuo
  2018-01-25  8:30             ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Lu, Wenzhuo @ 2018-01-25  1:47 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Matan,


> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Tuesday, January 23, 2018 12:38 AM
> To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port
> ownership
> 
> Testpmd should not use ethdev ports which are managed by other DPDK
> entities.
> 
> Set Testpmd ownership to each port which is not used by other entity and
> prevent any usage of ethdev ports which are not owned by Testpmd.
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++--------------------------
>  app/test-pmd/cmdline_flow.c |  2 +-
>  app/test-pmd/config.c       | 37 ++++++++++---------
>  app/test-pmd/parameters.c   |  4 +-
>  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
>  app/test-pmd/testpmd.h      |  3 ++
>  6 files changed, 103 insertions(+), 95 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> 9f12c0f..36dba6c 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
>  			&link_speed) < 0)
>  		return;
> 
> -	RTE_ETH_FOREACH_DEV(pid) {
> +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
I see my_owner is a global variable, so I don't know why we need the parameter 'my_owner.id' here. I think we can still use RTE_ETH_FOREACH_DEV and check 'my_owner' in it. If there is some reason you don't want to change RTE_ETH_FOREACH_DEV, I think 'RTE_ETH_FOREACH_DEV_OWNED(pid) {' is better.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25  1:47           ` Lu, Wenzhuo
@ 2018-01-25  8:30             ` Matan Azrad
  2018-01-26  0:50               ` Lu, Wenzhuo
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-25  8:30 UTC (permalink / raw)
  To: Lu, Wenzhuo, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

Hi Lu

From: Lu, Wenzhuo [mailto:wenzhuo.lu@intel.com]
> Hi Matan,
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > Sent: Tuesday, January 23, 2018 12:38 AM
> > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > <konstantin.ananyev@intel.com>
> > Subject: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port
> > ownership
> >
> > Testpmd should not use ethdev ports which are managed by other DPDK
> > entities.
> >
> > Set Testpmd ownership to each port which is not used by other entity
> > and prevent any usage of ethdev ports which are not owned by Testpmd.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++---------------------------
> >  app/test-pmd/cmdline_flow.c |  2 +-
> >  app/test-pmd/config.c       | 37 ++++++++++---------
> >  app/test-pmd/parameters.c   |  4 +-
> >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> >  app/test-pmd/testpmd.h      |  3 ++
> >  6 files changed, 103 insertions(+), 95 deletions(-)
> >
> > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > 9f12c0f..36dba6c 100644
> > --- a/app/test-pmd/cmdline.c
> > +++ b/app/test-pmd/cmdline.c
> > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> >  			&link_speed) < 0)
> >  		return;
> >
> > -	RTE_ETH_FOREACH_DEV(pid) {
> > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> I see my_owner is a global variable, so, don't know why we need the
> parameter 'my_owner.id' here.

Yes, it is a testpmd global variable (initialized in the testpmd main function, as you can see in this patch), like a lot of variables in testpmd.
The RTE_ETH_FOREACH_DEV_OWNED_BY iterator is an ethdev iterator, so it is not only for testpmd or other applications.
Each DPDK entity (application, PMD, or any other lib) should use this iterator with its specific owner id to get its owned ports.
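
To illustrate why the owner id is a parameter, here is a rough, self-contained
mock of an owner-filtered iterator. The port table and the names
find_next_owned_by(), FOREACH_PORT_OWNED_BY, port_set() and count_owned_by()
are made up for this example; only the idea (skip every port whose owner
differs from the given id) comes from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8

/* Hypothetical port table, for illustration only. */
struct port {
	int used;
	uint64_t owner_id;
};

static struct port ports[MAX_PORTS];

/* Mark a port as used and record its owner. */
void
port_set(unsigned int id, uint64_t owner_id)
{
	assert(id < MAX_PORTS);
	ports[id].used = 1;
	ports[id].owner_id = owner_id;
}

/*
 * Return the first used port at or after port_id whose owner matches
 * owner_id, or MAX_PORTS when there is none.
 */
unsigned int
find_next_owned_by(unsigned int port_id, uint64_t owner_id)
{
	while (port_id < MAX_PORTS &&
	       (!ports[port_id].used || ports[port_id].owner_id != owner_id))
		port_id++;
	return port_id;
}

/* Iterate over the ports owned by owner id 'o'. */
#define FOREACH_PORT_OWNED_BY(p, o) \
	for ((p) = find_next_owned_by(0, (o)); \
	     (p) < MAX_PORTS; \
	     (p) = find_next_owned_by((p) + 1, (o)))

/* Each entity passes its own id, so one iterator serves all of them. */
unsigned int
count_owned_by(uint64_t owner_id)
{
	unsigned int p, n = 0;

	FOREACH_PORT_OWNED_BY(p, owner_id)
		n++;
	return n;
}
```

An application, a PMD and a library can each call the same iterator with
their own id and will see disjoint sets of ports.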

> I think we can still use
> RTE_ETH_FOREACH_DEV and check 'my_owner' in it. If there's some reason
> and you don't want change RTE_ETH_FOREACH_DEV, I think '
> RTE_ETH_FOREACH_DEV_OWNED(pid) {' is better.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-23 14:30                           ` Matan Azrad
@ 2018-01-25  9:36                             ` Matan Azrad
  2018-01-25 10:05                               ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-25  9:36 UTC (permalink / raw)
  To: Matan Azrad, Gaëtan Rivet, Ananyev, Konstantin, Thomas Monjalon
  Cc: Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Gaetan, Konstantin, Thomas

Any response to my suggestion below?

From: Matan Azrad
> Hi
> 
> From: Gaëtan Rivet [mailto:gaetan.rivet@6wind.com] <snip>
> > > > > > > Look,
> > > > > > > > Testpmd initiates some of its internal databases depends
> > > > > > > > on specific port iteration, In some time someone may take
> > > > > > > > ownership of Testpmd ports and testpmd will continue to
> > > > > > > > touch them.
> > > > > >
> > > > > > But if someone will take the ownership (assign new owner_id)
> > > > > > that port will not appear in RTE_ETH_FOREACH_DEV() any more.
> > > > > >
> > > > >
> > > > > Yes, but testpmd sometimes depends on previous iteration using
> > > > > internal database.
> > > > > So it uses internal database that was updated by old iteration.
> > > >
> > > > That sounds like just a bug in testpmd that need to be fixed, no?
> > >
> > > If Testpmd already took ownership for these ports(like I did), it is ok.
> > >
> >
> > Have you tested using the default iterator (NO_OWNER)?
> > It worked until now with the bare minimal device tagging using
> > DEV_DEFERRED. Testpmd did not seem to mind having to skip this port.
> >
> > I'm sure there were places where this was overlooked, but overall, I'd
> > think everything should be fixable using only the NO_OWNER iteration.
> 
> I don't think so.
> 
> > Can you point to a specific scenario (command line, chain of event)
> > that would lead to a problem?
> >
> 
> I didn't construct a race test to catch a testpmd issue, but I think that
> without this patch there are a lot of issues.
> Go to the testpmd code (before ownership) and find usages of the old
> iterator (after the first iteration in main). Ask yourself what should happen
> if, exactly at that time, a new port is created by fail-safe (plug-in event).
> 
> > > > Any particular places where outdated device info is used?
> > >
> > > For example, look for the stream management in testpmd(I think I saw
> > > it there).
> > >
> >
> > The stream management is certainly shaky, but it happens after the EAL
> > initial port creation, and is not able to update itself for new
> > hotplugged ports (unless something changed).
> >
> 
> Yes, but conceptually someone in the future may take the port (because it
> is ownerless).
> 
> > > > > > > If I look back on the fail-safe, its sole purpose is to have
> > > > > > > seamless hotplug with existing applications.
> > > > > > >
> > > > > > > Port ownership is a genericization of some functions
> > > > > > > introduced by the fail-safe, that could structure DPDK
> > > > > > > further. It should allow applications to have a seamless
> > > > > > > integration with subsystems using port ownership. Without
> > > > > > > this, port ownership cannot be used.
> > > > > > >
> > > > > > > Testpmd should be fixed, but follow the most common design
> > > > > > > patterns of DPDK applications. Going with port ownership
> > > > > > > seems like a paradigm shift.
> > > > > > >
> > > > > > > > In addition
> > > > > > > > Using the old iterator in some places in testpmd will
> > > > > > > > cause a race for run-time new ports(can be created by
> > > > > > > > failsafe or any hotplug code):
> > > > > > > > - testpmd finds an ownerless port(just now created) by the
> > > > > > > > old iterator and start traffic there,
> >
> > How does testpmd start traffic there? Testpmd has only a callback for
> > displaying that it received an event for a new port. It has no concept
> > of hotplugging beyond that.
> >
> 
> Yes, so no traffic just some control command.
> 
> > Testpmd will not start using any new port probed using the hotplug API
> > on its own, again, unless something has drastically changed.
> >
> 
> Every iterator used in testpmd is exposed to the race.
> 
> > > > > > > > - failsafe takes ownership of this new port and start traffic there.
> > > > > > > > Problem!
> > > > > >
> > > > > > Could you shed a bit more light here - it would be race
> > > > > > condition between whom and whom?
> > > > >
> > > > > Sure.
> > > > >
> > > > > > As I remember in testpmd all control ops are done within one
> > > > > > thread (main lcore).
> > > > >
> > > > > But other dpdk entity can use another thread, for example:
> > > > > Failsafe uses the host thread(using alarm callback) to create a
> > > > > new port and to take ownership of a port.
> > > >
> > > > Hm, and you create new ports inside failsafe PMD, right and then
> > > > set new owner_id for it?
> > >
> > > Yes.
> > >
> > > > And all this in alarm in interrupt thread?
> > >
> > > Yes.
> > >
> > > > If so I wonder how you can guarantee that no-one else will set
> > > > different owner_id between
> > > > rte_eth_dev_allocate() and rte_eth_dev_owner_set()?
> > >
> > > I check it (see failsafe patch to this series - V5).
> > > Function: fs_bus_init.
> > >
> > > > Could you point me to that place (I am not really familiar with
> > > > failsafe code)?
> > > >
> > > > >
> > > > > The race:
> > > > > Testpmd iterates over all ports by the master thread.
> > > > > Failsafe takes ownership of a port by the host thread and start
> > > > > using it.
> > > > > => The two dpdk entities may use the device at same time!
> > > >
> >
> > When can this happen? Fail-safe creates its initial pool of ports
> > during EAL init, before testpmd scans eth_dev ports and configure its
> > streams.
> > At that point, it has taken ownership, from the master lcore context.
> >
> > After this point, new ports could be detected and hotplugged by fail-safe.
> > However, even if testpmd had a callback to capture those new ports and
> > reconfigure its streams, it would be executed from within the
> > intr-thread, same as failsafe. If the thread was interrupted, by a
> > dataplane-lcore for example, streams would not have been reconfigured.
> > The fail-safe would execute its callback and set the owner-id before
> > the callback chains goes to the application.
> >
> 
> Some iterator may be invoked by another thread in testpmd during the
> plug-out process and cause a control command to be issued.
> 
> > And that would only be if testpmd had any callback for hotplugging
> > ports and reconfiguring its streams, which it hasn't, as far as I know.
> >
> 
> We don't need to implement it in testpmd.
> 
> > > > Ok, if failsafe really assigns its owner_id(s) to ports that are
> > > > already in use by the app, then how such scheme supposed to work
> > > > at all?
> > >
> > > If the app works well (with the new rules) it already took ownership
> > > and failsafe will see it and will wait until the application release it.
> > > Every dpdk entity should know which port it wants to manage, If 2
> > > entities want to manage the same device -  it can be ok and port
> > > ownership can synchronize the usage.
> > >
> > > Probably, application which will run fail-safe wants to manage only
> > > the fail-safe port and therefore to take ownership only for it.
> > >
> > > > I.E. application has a port - it assigns some owner_id != 0 to it,
> > > > then PMD tries to set its owner_id tot the same port.
> > > > Obviously failsafe's set_owner() will always fail in such case.
> > > >
> > > Yes, and will try again after some time.
> > >
> > > > From what I hear we need to introduce a concept of 'default owner id'.
> > > > I.E. when failsafe PMD is created - user assigns some owner_id to
> > > > it (default).
> > > > Then failsafe PMD generates it's own owner_id and assigns it only
> > > > to the ports whose current owner_id is equal either 0 or 'default'
> > > > owner_id.
> > > >
> > >
> > > It is a suggestion and we need to think about it more (I'm talking
> > > about it with Gaetan in another thread).
> > > Actually I think, if we want a generic solution to the generic
> > > problem the current solution is ok.
> > >
> >
> > We could as well conclude this other thread there.
> >
> > The only solution would be to have a default relationship between
> > owners, something that goes beyond the scope assigned by Thomas to
> > your evolution, but would be necessary for this API to be properly
> > used by existing applications.
> >
> > I think it's the only way to have a sane default behavior with your
> > API, but I also think this goes beyond the scope of the DPDK altogether.
> >
> > But even with those considerations that could be ironed out later (API
> > is still experimental anyway), in the meantime, I think we should
> > strive not to break "userland" as much as possible. Meaning that
> > unless you have a specific situation creating a bug, you shouldn't
> > have to modify testpmd, and if an issues arises, you need to try to
> > improve your API before resorting to changing the resource management
> > model of all existing applications.
> >
> 
> I understand it.
> Suggestion:
> 
> 2 system owners.
> APP_OWNER - 1.
> NO_OWNER - 0.
> 
> And allowing for more owners as now.
> 
> 1. Every port creation sets the owner to NO_OWNER (as now).
> 2. Any DPDK entity can take ownership of a NO_OWNER port at any time
> (as now).
> 3. At some point at the end of EAL init, set all NO_OWNER ports to
> APP_OWNER (for V6).
> 4. Change the old iterator to iterate over APP_OWNER ports (for V6).
> 
> What do you think?
> 
> <snip>

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25  9:36                             ` Matan Azrad
@ 2018-01-25 10:05                               ` Thomas Monjalon
  2018-01-25 11:15                                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-25 10:05 UTC (permalink / raw)
  To: Matan Azrad
  Cc: Gaëtan Rivet, Ananyev, Konstantin, Wu, Jingjing, dev,
	Neil Horman, Richardson, Bruce

25/01/2018 10:36, Matan Azrad:
> Gaetan, Konstantin, Thomas
> 
> Any response to my suggestion below?
> 
> From: Matan Azrad
> > Suggestion:
> > 
> > 2 system owners.
> > APP_OWNER - 1.
> > NO_OWNER - 0.
> > 
> > And allowing for more owners as now.
> > 
> > 1. Every port creation sets the owner to NO_OWNER (as now).
> > 2. Any DPDK entity can take ownership of a NO_OWNER port at any time
> > (as now).
> > 3. At some point at the end of EAL init, set all NO_OWNER ports to
> > APP_OWNER (for V6).
> > 4. Change the old iterator to iterate over APP_OWNER ports (for V6).
> > 
> > What do you think?

Reminder for everybody: there is no issue without hotplug.
There is a race condition with hotplug.
Hotplug is not managed by EAL yet, but there is temporary hotplug
management in failsafe.
So until now, the issue is seen only with hotplug in failsafe.

Your suggestion makes no change for applications,
and fixes the ownership issue for failsafe.
And later, if an application wants to support generic hotplug properly
(when it is generally available in DPDK),
the application should use the ownership API.
Right?

I think it is a good compromise.
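
To make the four steps of the suggestion concrete, here is a rough sketch.
Everything in it is hypothetical (the port table, port_create(),
port_claim(), init_done() and count_app_ports() are names invented for the
example), and it ignores locking, which a real implementation would need.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 8
#define NO_OWNER  0
#define APP_OWNER 1

/* Hypothetical port table, for illustration only. */
struct port {
	int used;
	uint64_t owner_id;
};

static struct port ports[MAX_PORTS];

/* Step 1: a newly created port always starts as NO_OWNER. */
void
port_create(unsigned int id)
{
	assert(id < MAX_PORTS);
	ports[id].used = 1;
	ports[id].owner_id = NO_OWNER;
}

/* Step 2: any entity may claim a port, but only while it is NO_OWNER. */
int
port_claim(unsigned int id, uint64_t owner_id)
{
	assert(id < MAX_PORTS);
	if (!ports[id].used || ports[id].owner_id != NO_OWNER)
		return -1;
	ports[id].owner_id = owner_id;
	return 0;
}

/* Step 3: at the end of init, unclaimed ports go to the application. */
void
init_done(void)
{
	for (unsigned int i = 0; i < MAX_PORTS; i++)
		if (ports[i].used && ports[i].owner_id == NO_OWNER)
			ports[i].owner_id = APP_OWNER;
}

/* Step 4: the legacy iteration only sees APP_OWNER ports. */
unsigned int
count_app_ports(void)
{
	unsigned int n = 0;

	for (unsigned int i = 0; i < MAX_PORTS; i++)
		if (ports[i].used && ports[i].owner_id == APP_OWNER)
			n++;
	return n;
}
```

With this shape, a port created after init stays NO_OWNER and therefore
invisible to the legacy iteration until some entity claims it.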

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-24 18:30                                       ` Ananyev, Konstantin
@ 2018-01-25 10:55                                         ` Thomas Monjalon
  2018-01-25 11:09                                           ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-25 10:55 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

24/01/2018 19:30, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 23/01/2018 22:18, Ananyev, Konstantin:
> > > >
> > > > 23/01/2018 16:18, Ananyev, Konstantin:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > > > > inside dev_allocate().
> > > > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > > > >
> > > > > > > The devices must be allocated at a low level layer.
> > > > > >
> > > > > > No one arguing about that.
> > > > > > But we can provide owner id information to the low level.
> > > >
> > > > Sorry, you did not get it.
> > >
> > > Might be.
> > >
> > > > We cannot provide owner id at the low level
> > > > because it is not yet decided who will be the owner
> > > > before the port is allocated.
> > >
> > > Why is that?
> > > What prevents us decide who will manage that device *before* allocating port of it?
> > > IMO we do have all needed information at that stage.
> > 
> > We don't have the information.
> 
> At that point we do have dev name and all parameters, right?

We just have the PCI id.

> Plus we do have blacklist/whitelist, etc.
> So what else are we missing to make the decision at that point? 

It depends on the ownership policy.
Example: we can decide to take ownership based on a MAC address.
Another example: it can be decided to take ownership of ports of a given driver.
We don't have this information with the PCI id.
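
As a rough illustration of why the port has to exist before the decision,
here is a sketch of such a policy check. The structures and policy_matches()
are hypothetical, not an existing API; the point is only that both inputs
(driver name, MAC address) are read from an already allocated port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Information that is only known once the port has been probed. */
struct port_info {
	const char *driver_name;
	uint8_t mac[MAC_LEN];
};

/* Hypothetical ownership policy; a NULL field means "match anything". */
struct owner_policy {
	const char *driver_name;
	const uint8_t *mac;
};

/* Return 1 if the policy wants to own the port described by info. */
int
policy_matches(const struct owner_policy *pol, const struct port_info *info)
{
	assert(pol != NULL && info != NULL);
	if (pol->driver_name != NULL &&
	    strcmp(pol->driver_name, info->driver_name) != 0)
		return 0;
	if (pol->mac != NULL && memcmp(pol->mac, info->mac, MAC_LEN) != 0)
		return 0;
	return 1;
}
```

A notification callback could run this check on the new port and take
ownership only when it returns 1.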

> > It is a new device, it is matched by a driver which allocates a port.
> > I don't see where the higher level can interact here.
> > And even if you manage a trick, the higher level needs to read the port
> > informations to decide the ownership.
> 
> Could you specify what particular port information it needs?

Replied to the same question above :)


> > > > > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > > > Then we must decide who will manage this device.
> > > > > > > I suggest notifying the DPDK libs first.
> > > > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > > > ownership in its notification callback.
> > > > > >
> > > > > > Possible, but seems a bit overcomplicated.
> > > > > > Why not just:
> > > > > >
> > > > > > Have a global variable process_default_owner_id, that would be init once at startup.
> > > > > > > Have a TLS variable default_owner_id.
> > > > > > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id at creation time,
> > > > > > so port allocation and setting ownership - will be an atomic operation.
> > > > > > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > > > >
> > > > > > rte_eth_dev_allocate(...)
> > > > > > {
> > > > > >    lock(owner_lock);
> > > > > >    <allocate_port>
> > > > > >    owner = RTE_PER_LCORE(default_owner_id);
> > > > > >    if (owner == 0)
> > > > > >        owner = process_default_owner_id;
> > > > > >   set_owner(port, ..., owner);
> > > > > >  unlock(owner_lock);
> > > > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > > > >
> > > > > Or probably better to leave default_owner_id reset to the caller.
> > > > > Another thing - we can use same LTS variable in all control ops to
> > > > > allow/disallow changing of port configuration based on ownership.
> > > > > Konstantin
> > > > >
> > > > > > }
> > > > > >
> > > > > > So callers who don't need any special ownership - don't need to do anything.
> > > > > > > Special callers (like failsafe) can set default_owner_id just before calling hotplug
> > > > > > handling routine.
> > > >
> > > > No, hotplug will not be a routine.
> > > > I am talking about real hotplug, like a device which appears in the VM.
> > > > This new device must be handled by EAL and probed automatically if
> > > > comply with whitelist/blacklist policy given by the application or user.
> > > > Real hotplug is asynchronous.
> > >
> > > By 'asynchronous' here you mean it would be handled in the EAL interrupt thread
> > > or something different?
> > 
> > Yes, we receive an hotplug event which is processed in the event thread.
> > 
> > > Anyway, I suppose  you do need a function inside DPDK that will do the actual work in response
> > > on hotplug event, right?
> > 
> > Yes
> 
> Ok, btw why that function has to be always called from interrupt thread?
> Why not to allow user to decide?

Absolutely, the user must decide.
In the example of failsafe, the user provides a policy that decides
which devices will be owned, so failsafe makes the decision based
on user input.

> Some apps have their own thread dedicated for control ops (like testpmd)
> For them it might be plausible to call that function from their own control thread context.
> Konstantin
> 
> > 
> > > That's what I refer to as 'hotplug routine' above.
> > >
> > > > We will just receive notifications that it appeared.
> > > >
> > > > Note: there is some temporary code in failsafe to manage some hotplug.
> > > > This code must be removed when it will be properly handled in EAL.
> > >
> > > Ok, if it is just a temporary code, that would be removed soon -
> > > then it definitely seems wrong to modify testpmd (or any other user app)
> > > to comply with that temporary solution.
> > 
> > It will be modified when EAL hotplug will be implemented.
> > 
> > However, the ownership issue will be the same:
> > we don't know the owner when allocating a port.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25 10:55                                         ` Thomas Monjalon
@ 2018-01-25 11:09                                           ` Ananyev, Konstantin
  2018-01-25 11:27                                             ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-25 11:09 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, January 25, 2018 10:55 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 24/01/2018 19:30, Ananyev, Konstantin:
> > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > 23/01/2018 22:18, Ananyev, Konstantin:
> > > > >
> > > > > 23/01/2018 16:18, Ananyev, Konstantin:
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > > > > > inside dev_allocate().
> > > > > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > > > > >
> > > > > > > > The devices must be allocated at a low level layer.
> > > > > > >
> > > > > > > No one arguing about that.
> > > > > > > But we can provide owner id information to the low level.
> > > > >
> > > > > Sorry, you did not get it.
> > > >
> > > > Might be.
> > > >
> > > > > We cannot provide owner id at the low level
> > > > > because it is not yet decided who will be the owner
> > > > > before the port is allocated.
> > > >
> > > > Why is that?
> > > > What prevents us decide who will manage that device *before* allocating port of it?
> > > > IMO we do have all needed information at that stage.
> > >
> > > We don't have the information.
> >
> > At that point we do have dev name and all parameters, right?
> 
> We just have the PCI id.
> 
> > Plus we do have blacklist/whitelist, etc.
> > So what else are we missing to make the decision at that point?
> 
> It depends on the ownership policy.
> Example: we can decide to take ownership based on a MAC address.

That sounds a bit artificial (a MAC address can be changed on the fly), but ok -
for such devices the user can decide to use the default id first and change
it later, after the port is allocated and dev_init() has completed.
Though as I understand it, there are situations (like in the failsafe PMD) when we do
know the owner_id for sure before calling dev_allocate().

> Another example: it can be decided to take ownership of a given driver.
> We don't have these informations with the PCI id.
> 
> > > It is a new device, it is matched by a driver which allocates a port.
> > > I don't see where the higher level can interact here.
> > > And even if you manage a trick, the higher level needs to read the port
> > > informations to decide the ownership.
> >
> > Could you specify what particular port information it needs?
> 
> Replied to the same question above :)
> 
> 
> > > > > > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > > > > Then we must decide who will manage this device.
> > > > > > > > I suggest notifying the DPDK libs first.
> > > > > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > > > > ownership in its notification callback.
> > > > > > >
> > > > > > > Possible, but seems a bit overcomplicated.
> > > > > > > Why not just:
> > > > > > >
> > > > > > > Have a global variable process_default_owner_id, that would be initialized once at startup.
> > > > > > > Have a TLS variable default_owner_id.
> > > > > > > rte_eth_dev_allocate() will use it, so a caller can set dev->owner_id at creation time,
> > > > > > > and port allocation and setting ownership will be one atomic operation.
> > > > > > > On exit, rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > > > > >
> > > > > > > rte_eth_dev_allocate(...)
> > > > > > > {
> > > > > > >    lock(owner_lock);
> > > > > > >    <allocate_port>
> > > > > > >    owner = RTE_PER_LCORE(default_owner_id);
> > > > > > >    if (owner == 0)
> > > > > > >        owner = process_default_owner_id;
> > > > > > >   set_owner(port, ..., owner);
> > > > > > >  unlock(owner_lock);
> > > > > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > > > > >
> > > > > > Or probably better to leave default_owner_id reset to the caller.
> > > > > > Another thing - we can use the same TLS variable in all control ops to
> > > > > > allow/disallow changing of port configuration based on ownership.
> > > > > > Konstantin
> > > > > >
> > > > > > > }
> > > > > > >
> > > > > > > So callers who don't need any special ownership - don't need to do anything.
> > > > > > > Special callers (like failsafe) can set default_owner_id just before calling hotplug
> > > > > > > handling routine.
> > > > >
> > > > > No, hotplug will not be a routine.
> > > > > I am talking about real hotplug, like a device which appears in the VM.
> > > > > This new device must be handled by EAL and probed automatically if
> > > > > comply with whitelist/blacklist policy given by the application or user.
> > > > > Real hotplug is asynchronous.
> > > >
> > > > By 'asynchronous' here you mean it would be handled in the EAL interrupt thread
> > > > or something different?
> > >
> > > Yes, we receive an hotplug event which is processed in the event thread.
> > >
> > > > Anyway, I suppose  you do need a function inside DPDK that will do the actual work in response
> > > > on hotplug event, right?
> > >
> > > Yes
> >
> > Ok, btw why that function has to be always called from interrupt thread?
> > Why not to allow user to decide?
> 
> Absolutely, the user must decide.
> In the example of failsafe, the user instructs a policy to decide
> which devices will be owned, so failsafe takes the decision based
> on user inputs.
> 
> > Some apps have their own thread dedicated for control ops (like testpmd)
> > For them it might be plausible to call that function from their own control thread context.
> > Konstantin
> >
> > >
> > > > That's what I refer to as 'hotplug routine' above.
> > > >
> > > > > We will just receive notifications that it appeared.
> > > > >
> > > > > Note: there is some temporary code in failsafe to manage some hotplug.
> > > > > This code must be removed when it will be properly handled in EAL.
> > > >
> > > > Ok, if it is just a temporary code, that would be removed soon -
> > > > then it definitely seems wrong to modify testpmd (or any other user app)
> > > > to comply with that temporary solution.
> > >
> > > It will be modified when EAL hotplug will be implemented.
> > >
> > > However, the ownership issue will be the same:
> > > we don't know the owner when allocating a port.
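
For readers following the thread, the allocation scheme quoted above (a per-thread default owner with a per-process fallback, applied while the ownership lock is held) can be sketched in standalone C. This is only an illustration under stated assumptions: the port table, sketch_port_allocate() and every sketch_ name are hypothetical stand-ins and not ethdev code, _Thread_local stands in for RTE_PER_LCORE, and the reset of default_owner_id is left to the caller, as the follow-up comment suggests.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 8
#define SKETCH_NO_OWNER  0

/* Hypothetical port record; real ethdev data is much richer. */
struct sketch_port {
	int      in_use;
	uint64_t owner_id;
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];
static pthread_mutex_t sketch_owner_lock = PTHREAD_MUTEX_INITIALIZER;

/* Set once at startup; the value 1 is an arbitrary assumption. */
static uint64_t process_default_owner_id = 1;

/* Per-thread override, 0 means "no special owner requested". */
static _Thread_local uint64_t default_owner_id = SKETCH_NO_OWNER;

/* Allocate a free port and set its owner in one locked step. */
static int
sketch_port_allocate(void)
{
	int port_id = -1;
	uint64_t owner;

	pthread_mutex_lock(&sketch_owner_lock);
	for (int i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (!sketch_ports[i].in_use) {
			port_id = i;
			break;
		}
	}
	if (port_id >= 0) {
		/* Prefer the caller's per-thread owner, else the process default. */
		owner = default_owner_id;
		if (owner == SKETCH_NO_OWNER)
			owner = process_default_owner_id;
		assert(owner != SKETCH_NO_OWNER);
		sketch_ports[port_id].in_use = 1;
		sketch_ports[port_id].owner_id = owner;
	}
	pthread_mutex_unlock(&sketch_owner_lock);

	/* default_owner_id is intentionally not reset here; the caller does it. */
	return port_id;
}
```

In this sketch a special caller would set default_owner_id before triggering the allocation and clear it afterwards, while any other caller does nothing and gets the process default. It does not address the objection raised in the thread that, with asynchronous hotplug, the owner may not be known at allocation time.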

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25 10:05                               ` Thomas Monjalon
@ 2018-01-25 11:15                                 ` Ananyev, Konstantin
  2018-01-25 11:33                                   ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-25 11:15 UTC (permalink / raw)
  To: Thomas Monjalon, Matan Azrad
  Cc: Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman, Richardson, Bruce

Hi everyone,

> 
> 25/01/2018 10:36, Matan Azrad:
> > Gaetan, Konstantin, Thomas
> >
> > Any response to my suggestion below?
> >
> > From: Matan Azrad
> > > Suggestion:
> > >
> > > 2 system owners.
> > > APP_OWNER - 1.
> > > NO_OWNER - 0.
> > >
> > > And allowing for more owners as now.
> > >
> > > 1. Every port creation will set the owner for NO_OWNER (as now).
> > > 2. There is option for all dpdk entities to take owner of  NO_OWNER ports all
> > > the time(as now).
> > > 3. In some point in the end of EAL init: set all the NO_OWNER to
> > > APP_OWNER(for V6).

What will happen if we have 2 (or more) processes sharing the same device?
How will we distinguish which APP_OWNER we are talking about?
Shouldn't default_owner be unique per process?

> > > 4. Change the old iterator to iterate over APP_OWNER ports(for V6).

If I get it right, it means no changes in testpmd, correct?

> > >
> > > What do you think?
> 
> Reminder for everybody: there is no issue if no hotplug.
> There is a race condition with hotplug.
> Hotplug is not managed by EAL yet, but there is a temporary hotplug
> management in failsafe.
> So until now, the issue is seen only with hotplug in failsafe.
> 
> Your suggestion makes no change for applications,
> and fix the ownership issue for failsafe.
> And later, if an application wants to support generic hotplug properly
> (when it will be generally available in DPDK),
> the application should use the ownership API.
> Right?
> 
> I think it is a good compromise.

I still think it would be good if future hotplug support were transparent
to existing apps (no/minimal changes).
But I suppose we can discuss it later, when we have the hotplug patches.
Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25 11:09                                           ` Ananyev, Konstantin
@ 2018-01-25 11:27                                             ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-25 11:27 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

25/01/2018 12:09, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 24/01/2018 19:30, Ananyev, Konstantin:
> > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > 23/01/2018 22:18, Ananyev, Konstantin:
> > > > > >
> > > > > > 23/01/2018 16:18, Ananyev, Konstantin:
> > > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > > > > > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > > > > > If that' s the use case, then I think you need to set device ownership at creation time -
> > > > > > > > > > inside dev_allocate().
> > > > > > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > > > > > >
> > > > > > > > > The devices must be allocated at a low level layer.
> > > > > > > >
> > > > > > > > No one arguing about that.
> > > > > > > > But we can provide owner id information to the low level.
> > > > > >
> > > > > > Sorry, you did not get it.
> > > > >
> > > > > Might be.
> > > > >
> > > > > > We cannot provide owner id at the low level
> > > > > > because it is not yet decided who will be the owner
> > > > > > before the port is allocated.
> > > > >
> > > > > Why is that?
> > > > > What prevents us decide who will manage that device *before* allocating port of it?
> > > > > IMO we do have all needed information at that stage.
> > > >
> > > > We don't have the information.
> > >
> > > At that point we do have dev name and all parameters, right?
> > 
> > We just have the PCI id.
> > 
> > > Plus we do have blacklist/whitelist, etc.
> > > So what else are we missing to make the decision at that point?
> > 
> > It depends on the ownership policy.
> > Example: we can decide to take ownership based on a MAC address.
> 
> That sounds a bit artificial (a MAC address can be changed on the fly), but ok -
> for such devices the user can decide to use the default id first and change
> it later, after the port is allocated and dev_init() has completed.
> Though as I understand it, there are situations (like in the failsafe PMD) when we do
> know the owner_id for sure before calling dev_allocate().

In the general case, when hotplug is managed by EAL in an
asynchronous way, the port allocation will be done without any knowledge
of the port owner.

> > Another example: it can be decided to take ownership of a given driver.
> > We don't have these informations with the PCI id.
> > 
> > > > It is a new device, it is matched by a driver which allocates a port.
> > > > I don't see where the higher level can interact here.
> > > > And even if you manage a trick, the higher level needs to read the port
> > > > informations to decide the ownership.
> > >
> > > Could you specify what particular port information it needs?
> > 
> > Replied to the same question above :)
> > 
> > 
> > > > > > > > > When a new device appears (hotplug), an ethdev port should be allocated
> > > > > > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > > > > > Then we must decide who will manage this device.
> > > > > > > > > I suggest notifying the DPDK libs first.
> > > > > > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > > > > > ownership in its notification callback.
> > > > > > > >
> > > > > > > > Possible, but seems a bit overcomplicated.
> > > > > > > > Why not just:
> > > > > > > >
> > > > > > > > Have a global variable process_default_owner_id, that would be initialized once at startup.
> > > > > > > > Have a TLS variable default_owner_id.
> > > > > > > > rte_eth_dev_allocate() will use it, so a caller can set dev->owner_id at creation time,
> > > > > > > > and port allocation and setting ownership will be one atomic operation.
> > > > > > > > On exit, rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > > > > > >
> > > > > > > > rte_eth_dev_allocate(...)
> > > > > > > > {
> > > > > > > >    lock(owner_lock);
> > > > > > > >    <allocate_port>
> > > > > > > >    owner = RTE_PER_LCORE(default_owner_id);
> > > > > > > >    if (owner == 0)
> > > > > > > >        owner = process_default_owner_id;
> > > > > > > >   set_owner(port, ..., owner);
> > > > > > > >  unlock(owner_lock);
> > > > > > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > > > > > >
> > > > > > > Or probably better to leave default_owner_id reset to the caller.
> > > > > > > Another thing - we can use the same TLS variable in all control ops to
> > > > > > > allow/disallow changing of port configuration based on ownership.
> > > > > > > Konstantin
> > > > > > >
> > > > > > > > }
> > > > > > > >
> > > > > > > > So callers who don't need any special ownership - don't need to do anything.
> > > > > > > > Special callers (like failsafe) can set default_owner_id just before calling hotplug
> > > > > > > > handling routine.
> > > > > >
> > > > > > No, hotplug will not be a routine.
> > > > > > I am talking about real hotplug, like a device which appears in the VM.
> > > > > > This new device must be handled by EAL and probed automatically if
> > > > > > comply with whitelist/blacklist policy given by the application or user.
> > > > > > Real hotplug is asynchronous.
> > > > >
> > > > > By 'asynchronous' here you mean it would be handled in the EAL interrupt thread
> > > > > or something different?
> > > >
> > > > Yes, we receive an hotplug event which is processed in the event thread.
> > > >
> > > > > Anyway, I suppose  you do need a function inside DPDK that will do the actual work in response
> > > > > on hotplug event, right?
> > > >
> > > > Yes
> > >
> > > Ok, btw why that function has to be always called from interrupt thread?
> > > Why not to allow user to decide?
> > 
> > Absolutely, the user must decide.
> > In the example of failsafe, the user instructs a policy to decide
> > which devices will be owned, so failsafe takes the decision based
> > on user inputs.
> > 
> > > Some apps have their own thread dedicated for control ops (like testpmd)
> > > For them it might be plausible to call that function from their own control thread context.
> > > Konstantin
> > >
> > > >
> > > > > That's what I refer to as 'hotplug routine' above.
> > > > >
> > > > > > We will just receive notifications that it appeared.
> > > > > >
> > > > > > Note: there is some temporary code in failsafe to manage some hotplug.
> > > > > > This code must be removed when it will be properly handled in EAL.
> > > > >
> > > > > Ok, if it is just a temporary code, that would be removed soon -
> > > > > then it definitely seems wrong to modify testpmd (or any other user app)
> > > > > to comply with that temporary solution.
> > > >
> > > > It will be modified when EAL hotplug will be implemented.
> > > >
> > > > However, the ownership issue will be the same:
> > > > we don't know the owner when allocating a port.
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25 11:15                                 ` Ananyev, Konstantin
@ 2018-01-25 11:33                                   ` Thomas Monjalon
  2018-01-25 11:55                                     ` Ananyev, Konstantin
  0 siblings, 1 reply; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-25 11:33 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce

25/01/2018 12:15, Ananyev, Konstantin:
> Hi everyone,
> 
> > 
> > 25/01/2018 10:36, Matan Azrad:
> > > Gaetan, Konstantin, Thomas
> > >
> > > Any response to my suggestion below?
> > >
> > > From: Matan Azrad
> > > > Suggestion:
> > > >
> > > > 2 system owners.
> > > > APP_OWNER - 1.
> > > > NO_OWNER - 0.
> > > >
> > > > And allowing for more owners as now.
> > > >
> > > > 1. Every port creation will set the owner for NO_OWNER (as now).
> > > > 2. There is option for all dpdk entities to take owner of  NO_OWNER ports all
> > > > the time(as now).
> > > > 3. In some point in the end of EAL init: set all the NO_OWNER to
> > > > APP_OWNER(for V6).
> 
> What will happen if we have 2 (or more) processes sharing the same device?
> How will we distinguish which APP_OWNER we are talking about?
> Shouldn't default_owner be unique per process?

Yes, good point!
Each application process should be considered a different owner
by default.

> > > > 4. Change the old iterator to iterate over APP_OWNER ports(for V6).
> 
> If I get it right, it means no changes in testpmd, correct?

That is my understanding, yes.

> > > > What do you think?
> > 
> > Reminder for everybody: there is no issue if no hotplug.
> > There is a race condition with hotplug.
> > Hotplug is not managed by EAL yet, but there is a temporary hotplug
> > management in failsafe.
> > So until now, the issue is seen only with hotplug in failsafe.
> > 
> > Your suggestion makes no change for applications,
> > and fix the ownership issue for failsafe.
> > And later, if an application wants to support generic hotplug properly
> > (when it will be generally available in DPDK),
> > the application should use the ownership API.
> > Right?
> > 
> > I think it is a good compromise.
> 
> I still think it would be good if future hotplug support were transparent
> to existing apps (no/minimal changes).
> But I suppose we can discuss it later, when we have the hotplug patches.

It cannot be really transparent.
If an application is based on port detection at init, it has to be changed
to decide how to handle new ports.
That's why I introduced the new events for ethdev notification callback.
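
To make the four steps of the suggestion quoted above concrete, here is a minimal standalone C sketch. It is an illustration only, under stated assumptions: the port table and all sketch_ functions are hypothetical and are not the ethdev API, locking is omitted for brevity, and the application owner id is passed in as a parameter so that each process can use its own id, as discussed above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NO_OWNER  0

/* Hypothetical port record, not the ethdev structure. */
struct sketch_port {
	int      attached;
	uint64_t owner_id;
};

static struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/* Step 1: every newly created port starts with NO_OWNER. */
static void
sketch_port_attach(int port_id)
{
	assert(port_id >= 0 && port_id < SKETCH_MAX_PORTS);
	sketch_ports[port_id].attached = 1;
	sketch_ports[port_id].owner_id = SKETCH_NO_OWNER;
}

/* Step 2: any entity may take an attached port that has no owner yet. */
static int
sketch_owner_set(int port_id, uint64_t owner_id)
{
	if (port_id < 0 || port_id >= SKETCH_MAX_PORTS ||
	    owner_id == SKETCH_NO_OWNER ||
	    !sketch_ports[port_id].attached ||
	    sketch_ports[port_id].owner_id != SKETCH_NO_OWNER)
		return -1;
	sketch_ports[port_id].owner_id = owner_id;
	return 0;
}

/* Step 3: at the end of init, hand all still-unowned ports to the app. */
static int
sketch_claim_unowned(uint64_t app_owner_id)
{
	int claimed = 0;

	for (int i = 0; i < SKETCH_MAX_PORTS; i++)
		if (sketch_owner_set(i, app_owner_id) == 0)
			claimed++;
	return claimed;
}

/* Step 4: the default iteration only sees ports of the given owner. */
static int
sketch_count_owned_by(uint64_t owner_id)
{
	int n = 0;

	for (int i = 0; i < SKETCH_MAX_PORTS; i++)
		if (sketch_ports[i].attached &&
		    sketch_ports[i].owner_id == owner_id)
			n++;
	return n;
}
```

With three attached ports, one of which was taken earlier by another entity, sketch_claim_unowned() gives the application the remaining two, and the application's iteration never sees the port it does not own. Ports that appear after step 3 stay unowned until someone claims them, which is the hotplug case still under discussion here.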

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25 11:33                                   ` Thomas Monjalon
@ 2018-01-25 11:55                                     ` Ananyev, Konstantin
  0 siblings, 0 replies; 214+ messages in thread
From: Ananyev, Konstantin @ 2018-01-25 11:55 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Matan Azrad, Gaëtan Rivet, Wu, Jingjing, dev, Neil Horman,
	Richardson, Bruce



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Thursday, January 25, 2018 11:33 AM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>;
> dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership
> 
> 25/01/2018 12:15, Ananyev, Konstantin:
> > Hi everyone,
> >
> > >
> > > 25/01/2018 10:36, Matan Azrad:
> > > > Gaetan, Konstantin, Thomas
> > > >
> > > > Any response to my suggestion below?
> > > >
> > > > From: Matan Azrad
> > > > > Suggestion:
> > > > >
> > > > > 2 system owners.
> > > > > APP_OWNER - 1.
> > > > > NO_OWNER - 0.
> > > > >
> > > > > And allowing for more owners as now.
> > > > >
> > > > > 1. Every port creation will set the owner for NO_OWNER (as now).
> > > > > 2. There is option for all dpdk entities to take owner of  NO_OWNER ports all
> > > > > the time(as now).
> > > > > 3. In some point in the end of EAL init: set all the NO_OWNER to
> > > > > APP_OWNER(for V6).
> >
> > What will happen if we have 2 (or more) processes sharing the same device?
> > How will we distinguish which APP_OWNER we are talking about?
> > Shouldn't default_owner be unique per process?
> 
> Yes, good point!
> Each application process should be considered a different owner
> by default.
> 
> > > > > 4. Change the old iterator to iterate over APP_OWNER ports(for V6).
> >
> > If I get it right, it means no changes in testpmd, correct?
> 
> It is my understanding yes.

Sounds reasonable to me then.

> 
> > > > > What do you think?
> > >
> > > Reminder for everybody: there is no issue if no hotplug.
> > > There is a race condition with hotplug.
> > > Hotplug is not managed by EAL yet, but there is a temporary hotplug
> > > management in failsafe.
> > > So until now, the issue is seen only with hotplug in failsafe.
> > >
> > > Your suggestion makes no change for applications,
> > > and fix the ownership issue for failsafe.
> > > And later, if an application wants to support generic hotplug properly
> > > (when it will be generally available in DPDK),
> > > the application should use the ownership API.
> > > Right?
> > >
> > > I think it is a good compromise.
> >
> > I still think it would be good if future hotplug support were transparent
> > to existing apps (no/minimal changes).
> > But I suppose we can discuss it later, when we have the hotplug patches.
> 
> It cannot be really transparent.
> If an application is based on port detection at init, it has to be changed
> to decide how to handle new ports.
> That's why I introduced the new events for ethdev notification callback.

Ok, but I think most processes would just assign default_owner to a newly plugged device.
In that case I don't see why it can't be transparent.
Anyway, that's probably a topic for new mail thread.
Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization
  2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
                       ` (7 preceding siblings ...)
  2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
@ 2018-01-25 14:35     ` Ferruh Yigit
  8 siblings, 0 replies; 214+ messages in thread
From: Ferruh Yigit @ 2018-01-25 14:35 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev

On 1/18/2018 4:35 PM, Matan Azrad wrote:
> Add ownership mechanism to DPDK Ethernet devices to avoid multiple management of a device by different DPDK entities.
> The port ownership mechanism is a good point to redefine the synchronization rules in ethdev:
> 
> 	1. The port allocation and port release synchronization will be managed by ethdev.
> 	2. The port usage synchronization will be managed by the port owner.
> 	3. The port ownership synchronization will be managed by ethdev.
> 	4. DPDK entity which want to use a port safely must take ownership before.
> 
> 
> V2:  
> Synchronize ethdev port creation.
> Synchronize port ownership mechanism.
> Rename owner remove API to rte_eth_dev_owner_unset.
> Remove "ethdev: free a port by a dedicated API" patch - passed to another series.
> Add "ethdev: fix port data reset timing" patch.
> Change the owner get API to return an int value and to pass a copy of the owner structure.
> Adjust testpmd to the improved owner get API.
> Adjust documentations.
> 
> V3:
> Change RTE_ETH_FOREACH_DEV iterator to skip owned ports(Gaetan suggestion).
> Prevent goto in set\unset APIs by adding internal API - this also adds reuse of code(Konstantin suggestion).
> Group all the shared processes variables in one struct to allow easy allocation of it(Konstantin suggestion).
> Take owner name truncation as warning and not as error(Konstantin suggestion).
> Mark the new APIs as EXPERIMENTAL.
> Rebase on top of master_net_mlx.
> Rebase on top of "[PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack" series.
> Rebase on top of "[PATCH v5 0/8] Introduce virtual driver for Hyper-V/Azure platforms" series.
> Add "ethdev: fix used portid allocation" patch suggested by Konstantin.
> 
> 
> 
> Matan Azrad (7):
>   ethdev: fix port data reset timing
>   ethdev: fix used portid allocation
>   ethdev: add port ownership
>   ethdev: synchronize port allocation
>   net/failsafe: free an eth port by a dedicated API
>   net/failsafe: use ownership mechanism to own ports
>   app/testpmd: adjust ethdev port ownership

Hi Thomas,

Technically this should be in the scope of next-net, but it has become a long
discussion that I have not been able to follow. Would you mind handling it in the
master tree?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership
  2018-01-25  8:30             ` Matan Azrad
@ 2018-01-26  0:50               ` Lu, Wenzhuo
  0 siblings, 0 replies; 214+ messages in thread
From: Lu, Wenzhuo @ 2018-01-26  0:50 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Wu, Jingjing
  Cc: dev, Neil Horman, Richardson, Bruce, Ananyev, Konstantin

> -----Original Message-----
> From: Matan Azrad [mailto:matan@mellanox.com]
> Sent: Thursday, January 25, 2018 4:30 PM
> To: Lu, Wenzhuo <wenzhuo.lu@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Wu,
> Jingjing <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> <konstantin.ananyev@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port
> ownership
> 
> Hi Lu
> 
> From: Lu, Wenzhuo [mailto:wenzhuo.lu@intel.com]
> > Hi Matan,
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> > > Sent: Tuesday, January 23, 2018 12:38 AM
> > > To: Thomas Monjalon <thomas@monjalon.net>; Gaetan Rivet
> > > <gaetan.rivet@6wind.com>; Wu, Jingjing <jingjing.wu@intel.com>
> > > Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Richardson,
> > > Bruce <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > <konstantin.ananyev@intel.com>
> > > Subject: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port
> > > ownership
> > >
> > > Testpmd should not use ethdev ports which are managed by other DPDK
> > > entities.
> > >
> > > Set Testpmd ownership to each port which is not used by other entity
> > > and prevent any usage of ethdev ports which are not owned by Testpmd.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > ---
> > >  app/test-pmd/cmdline.c      | 89 +++++++++++++++++++---------------------
> > -----
> > >  app/test-pmd/cmdline_flow.c |  2 +-
> > >  app/test-pmd/config.c       | 37 ++++++++++---------
> > >  app/test-pmd/parameters.c   |  4 +-
> > >  app/test-pmd/testpmd.c      | 63 ++++++++++++++++++++------------
> > >  app/test-pmd/testpmd.h      |  3 ++
> > >  6 files changed, 103 insertions(+), 95 deletions(-)
> > >
> > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> > > 9f12c0f..36dba6c 100644
> > > --- a/app/test-pmd/cmdline.c
> > > +++ b/app/test-pmd/cmdline.c
> > > @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
> > >  			&link_speed) < 0)
> > >  		return;
> > >
> > > -	RTE_ETH_FOREACH_DEV(pid) {
> > > +	RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
> > I see my_owner is a global variable, so, don't know why we need the
> > parameter 'my_owner.id' here.
> 
> Yes it is a testpmd global variable (which was initiated in testpmd main
> function - you can see it in this patch) as a lot of variables in testpmd.
> RTE_ETH_FOREACH_DEV_OWNED_BY iterator is an ethdev iterator -> not
> only for testpmd\applications.
> So, each dpdk entity(application, PMDs, any other libs) should use this
> iterator with its specific owner id to get its owned ports.
Sorry, my comment may have been too terse to be clear.
I was thinking that RTE could use a callback to get the owner ID, and even other necessary info, from the APP. RTE could then call that callback inside RTE_ETH_FOREACH_DEV, so the APP would hardly need to change anything; it would only need to register a callback. I think that is friendlier to users.
Alternatively, the APP could define its own macro 'FOREACH_DEV_OWNED(pid)' wrapping 'RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id)', which would look better.

> 
> > I think we can still use
> > RTE_ETH_FOREACH_DEV and check 'my_owner' in it. If there's some reason
> > and you don't want change RTE_ETH_FOREACH_DEV, I think '
> > RTE_ETH_FOREACH_DEV_OWNED(pid) {' is better.
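
The wrapper-macro alternative mentioned above can be sketched in standalone C as follows. This is a hedged illustration, not testpmd or ethdev code: SKETCH_FOREACH_DEV_OWNED_BY, sketch_find_next_owned_by() and the owner table are hypothetical stand-ins for the ethdev iterator and its data, and only the names FOREACH_DEV_OWNED and my_owner.id come from this discussion.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4

/* Hypothetical stand-in for the per-port owner ids kept by ethdev. */
static uint64_t sketch_port_owner[SKETCH_MAX_PORTS] = { 3, 0, 3, 9 };

/* Hypothetical stand-in for the application's global owner record. */
static struct { uint64_t id; } my_owner = { 3 };

/* Return the first port >= port_id owned by owner_id, or SKETCH_MAX_PORTS. */
static unsigned int
sketch_find_next_owned_by(unsigned int port_id, uint64_t owner_id)
{
	assert(owner_id != 0); /* 0 means "no owner" in this sketch */
	while (port_id < SKETCH_MAX_PORTS &&
	       sketch_port_owner[port_id] != owner_id)
		port_id++;
	return port_id;
}

/* Generic iterator: usable by any entity with its own owner id. */
#define SKETCH_FOREACH_DEV_OWNED_BY(p, o) \
	for ((p) = sketch_find_next_owned_by(0, (o)); \
	     (p) < SKETCH_MAX_PORTS; \
	     (p) = sketch_find_next_owned_by((p) + 1, (o)))

/* Application-local wrapper that hides the owner id at the call sites. */
#define FOREACH_DEV_OWNED(p) SKETCH_FOREACH_DEV_OWNED_BY(p, my_owner.id)

/* Example call site: count the ports owned by this application. */
static unsigned int
count_my_ports(void)
{
	unsigned int pid, n = 0;

	FOREACH_DEV_OWNED(pid)
		n++;
	return n;
}
```

The generic iterator keeps the owner id as an explicit parameter so that PMDs and libraries can use it too, while the application-side wrapper keeps call sites as short as the old iterator.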

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization
  2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
                           ` (6 preceding siblings ...)
  2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
@ 2018-01-29 11:21         ` Matan Azrad
  2018-01-31 19:53           ` Thomas Monjalon
  7 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-01-29 11:21 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Konstantin Ananyev
  Cc: dev, Neil Horman, Bruce Richardson, Jingjing Wu


Hi all

Since there is no agreement on making testpmd ownership aware via the new ownership mechanism,
I think we can drop the testpmd patch for now (app/testpmd: adjust ethdev port ownership).

Maybe we can add an example application that uses this API in the future.

Thanks!

From: Matan Azrad
> Add ownership mechanism to DPDK Ethernet devices to avoid multiple
> management of a device by different DPDK entities.
> The port ownership mechanism is a good point to redefine the
> synchronization rules in ethdev:
> 
> 	1. The port allocation and port release synchronization will be
> managed by ethdev.
> 	2. The port usage synchronization will be managed by the port
> owner.
> 	3. The port ownership synchronization will be managed by ethdev.
> 	4. DPDK entity which want to use a port safely must take ownership
> before.
> 
> 
> V2:
> Synchronize ethdev port creation.
> Synchronize port ownership mechanism.
> Rename owner remove API to rte_eth_dev_owner_unset.
> Remove "ethdev: free a port by a dedicated API" patch - passed to another
> series.
> Add "ethdev: fix port data reset timing" patch.
> Change owner get API to return int value and to pass copy of the owner
> structure.
> Adjust testpmd to the improved owner get API.
> Adjust documentations.
> 
> V3:
> Change RTE_ETH_FOREACH_DEV iterator to skip owned ports(Gaetan
> suggestion).
> Prevent goto in set\unset APIs by adding internal API - this also adds reuse of
> code(Konstantin suggestion).
> Group all the shared processes variables in one struct to allow easy allocation
> of it(Konstantin suggestion).
> Take owner name truncation as warning and not as error(Konstantin
> suggestion).
> Mark the new APIs as EXPERIMENTAL.
> Rebase on top of master_net_mlx.
> Rebase on top of "[PATCH v6 0/6] Fail-safe\ethdev: fix removal handling lack"
> series.
> Rebase on top of "[PATCH v5 0/8] Introduce virtual driver for Hyper-V/Azure
> platforms" .
> Add "ethdev: fix used portid allocation" patch suggested by Konstantin.
> 
> v4:
> Share => shared in ethdev patches(Thomas suggestion).
> Rephrase some code comments(Thomas suggestion).
> Fix compilation issue caused by wrong rebase with "fix used portid allocation"
> patch.
> Add assert check for the correct port state to above fix patch.
> 
> V5:
> Use different print message type as Ferruh suggested.
> Fix the name parameter description in set\unset APIs(Ferruh suggestion).
> Rebase on top of 18.02-rc1.
> Fix issue: ownership API must check that the shared data was allocated
> before using the shared ownership lock(relevant when no port was created).
> 
> Matan Azrad (7):
>   ethdev: fix port data reset timing
>   ethdev: fix used portid allocation
>   ethdev: add port ownership
>   ethdev: synchronize port allocation
>   net/failsafe: free an eth port by a dedicated API
>   net/failsafe: use ownership mechanism to own ports
>   app/testpmd: adjust ethdev port ownership
> 
>  app/test-pmd/cmdline.c                  |  89 +++++------
>  app/test-pmd/cmdline_flow.c             |   2 +-
>  app/test-pmd/config.c                   |  37 ++---
>  app/test-pmd/parameters.c               |   4 +-
>  app/test-pmd/testpmd.c                  |  63 +++++---
>  app/test-pmd/testpmd.h                  |   3 +
>  doc/guides/prog_guide/poll_mode_drv.rst |  14 +-
>  drivers/net/failsafe/failsafe.c         |   7 +
>  drivers/net/failsafe/failsafe_eal.c     |  16 ++
>  drivers/net/failsafe/failsafe_ether.c   |   2 +-
>  drivers/net/failsafe/failsafe_private.h |   2 +
>  lib/librte_ether/rte_ethdev.c           | 267
> +++++++++++++++++++++++++++-----
>  lib/librte_ether/rte_ethdev.h           | 115 +++++++++++++-
>  lib/librte_ether/rte_ethdev_core.h      |   2 +
>  lib/librte_ether/rte_ethdev_version.map |   6 +
>  15 files changed, 486 insertions(+), 143 deletions(-)
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization
  2018-01-29 11:21         ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
@ 2018-01-31 19:53           ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2018-01-31 19:53 UTC (permalink / raw)
  To: Matan Azrad
  Cc: dev, Gaetan Rivet, Konstantin Ananyev, Neil Horman,
	Bruce Richardson, Jingjing Wu, techboard

29/01/2018 12:21, Matan Azrad:
> 
> Hi all
> 
> Since there is no agreement on making testpmd ownership aware via the new ownership mechanism,
> I think we can drop the testpmd patch for now (app/testpmd: adjust ethdev port ownership).

It has been agreed today by the Technical Board.

First 6 patches applied, thanks.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
  2018-01-18 17:00       ` Thomas Monjalon
  2018-01-19 12:38       ` Ananyev, Konstantin
@ 2018-03-05 11:24       ` Ferruh Yigit
  2018-03-05 14:52         ` Matan Azrad
  2 siblings, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-03-05 11:24 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

On 1/18/2018 4:35 PM, Matan Azrad wrote:
> rte_eth_dev_data structure is allocated per ethdev port and can be
> used to get a data of the port internally.
> 
> rte_eth_dev_attach_secondary tries to find the port identifier using
> rte_eth_dev_data name field comparison and may get an identifier of
> invalid port in case of this port was released by the primary process
> because the port release API doesn't reset the port data.
> 
> So, it will be better to reset the port data in release time instead of
> allocation time.
> 
> Move the port data reset to the port release API.
> 
> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple process model")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> ---
>  lib/librte_ether/rte_ethdev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 7044159..156231c 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>  		return NULL;
>  	}
>  
> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct rte_eth_dev_data));
>  	eth_dev = eth_dev_get(port_id);
>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name), "%s", name);
>  	eth_dev->data->port_id = port_id;
> @@ -252,6 +251,7 @@ struct rte_eth_dev *
>  	if (eth_dev == NULL)
>  		return -EINVAL;
>  
> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));

Hi Matan,

What most of the vdev release paths do is:

eth_dev = rte_eth_dev_allocated(...)
rte_free(eth_dev->data->dev_private);
rte_free(eth_dev->data);
rte_eth_dev_release_port(eth_dev);

Since eth_dev->data has already been freed at that point, the memset() on it in
rte_eth_dev_release_port() writes to freed memory.

We don't normally run the remove path, which is why we haven't hit the issue, but
this looks like a problem for all virtual PMDs.
Also, rte_eth_dev_pci_release() looks problematic now.

Can you please check the issue?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-05 11:24       ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
@ 2018-03-05 14:52         ` Matan Azrad
  2018-03-05 15:06           ` Ferruh Yigit
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-03-05 14:52 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

Hi

From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
> On 1/18/2018 4:35 PM, Matan Azrad wrote:
> > rte_eth_dev_data structure is allocated per ethdev port and can be
> > used to get a data of the port internally.
> >
> > rte_eth_dev_attach_secondary tries to find the port identifier using
> > rte_eth_dev_data name field comparison and may get an identifier of
> > invalid port in case of this port was released by the primary process
> > because the port release API doesn't reset the port data.
> >
> > So, it will be better to reset the port data in release time instead
> > of allocation time.
> >
> > Move the port data reset to the port release API.
> >
> > Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
> > process model")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > ---
> >  lib/librte_ether/rte_ethdev.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c
> > b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -204,7 +204,6 @@ struct rte_eth_dev *
> >  		return NULL;
> >  	}
> >
> > -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
> rte_eth_dev_data));
> >  	eth_dev = eth_dev_get(port_id);
> >  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
> "%s", name);
> >  	eth_dev->data->port_id = port_id;
> > @@ -252,6 +251,7 @@ struct rte_eth_dev *
> >  	if (eth_dev == NULL)
> >  		return -EINVAL;
> >
> > +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> 
> Hi Matan,
> 
> What most of the vdev release path does is:
> 
> eth_dev = rte_eth_dev_allocated(...)
> rte_free(eth_dev->data->dev_private);
> rte_free(eth_dev->data);
> rte_eth_dev_release_port(eth_dev);
> 
> Since eth_dev->data freed, memset() it in rte_eth_dev_release_port() will
> be problem.
> 
> We don't run remove path that is why we didn't hit the issue but this seems
> problem for all virtual PMDs.

Yes, it is a problem and should be fixed.
For vdevs that use a private rte_eth_dev_data, the remove order can be:
	private_data = eth_dev->data;
	rte_free(eth_dev->data->dev_private);
	rte_eth_dev_release_port(eth_dev); /* The last operation working on ethdev structure. */
	rte_free(private_data);
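
The ordering above can be sketched in isolation as follows. This is only an
illustration under simplified assumptions: the demo_* structures and functions
are hypothetical stand-ins (plain malloc/free instead of rte_malloc/rte_free,
and a release function that does nothing but reset the data), not the ethdev
implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ethdev structures. */
struct demo_dev_data {
	char  name[32];
	void *dev_private;
};

struct demo_eth_dev {
	struct demo_dev_data *data;
	int attached;
};

/* A driver that allocates its own (private) data structure. */
static int
demo_vdev_create(struct demo_eth_dev *dev, const char *name)
{
	dev->data = calloc(1, sizeof(*dev->data));
	if (dev->data == NULL)
		return -1;
	dev->data->dev_private = malloc(16);
	if (dev->data->dev_private == NULL) {
		free(dev->data);
		dev->data = NULL;
		return -1;
	}
	snprintf(dev->data->name, sizeof(dev->data->name), "%s", name);
	dev->attached = 1;
	return 0;
}

/* Mimics a release that resets the port data: 'data' must still be valid. */
static int
demo_release_port(struct demo_eth_dev *dev)
{
	if (dev == NULL)
		return -1;
	memset(dev->data, 0, sizeof(*dev->data));
	dev->attached = 0;
	return 0;
}

/* Remove path: release first, free the private data structure last. */
static int
demo_vdev_remove(struct demo_eth_dev *dev)
{
	struct demo_dev_data *private_data = dev->data; /* kept for the final free */
	int ret;

	free(dev->data->dev_private);
	ret = demo_release_port(dev); /* last operation on the ethdev structure */
	assert(private_data->name[0] == '\0'); /* reset happened on live memory */
	free(private_data);
	return ret;
}
```

Freeing the data before demo_release_port() would make the memset() inside it
write to freed memory, which is the problem reported above.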


> Also rte_eth_dev_pci_release() looks problematic now.

Yes. Again, the last operation that touches the ethdev structure should be rte_eth_dev_release_port().

So we need to fix all vdevs and the rte_eth_dev_pci_release() function.

Any comments?

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-05 14:52         ` Matan Azrad
@ 2018-03-05 15:06           ` Ferruh Yigit
  2018-03-05 15:12             ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-03-05 15:06 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

On 3/5/2018 2:52 PM, Matan Azrad wrote:
> HI
> 
> From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
>> On 1/18/2018 4:35 PM, Matan Azrad wrote:
>>> rte_eth_dev_data structure is allocated per ethdev port and can be
>>> used to get a data of the port internally.
>>>
>>> rte_eth_dev_attach_secondary tries to find the port identifier using
>>> rte_eth_dev_data name field comparison and may get an identifier of
>>> invalid port in case of this port was released by the primary process
>>> because the port release API doesn't reset the port data.
>>>
>>> So, it will be better to reset the port data in release time instead
>>> of allocation time.
>>>
>>> Move the port data reset to the port release API.
>>>
>>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
>>> process model")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>> ---
>>>  lib/librte_ether/rte_ethdev.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.c
>>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
>>> --- a/lib/librte_ether/rte_ethdev.c
>>> +++ b/lib/librte_ether/rte_ethdev.c
>>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>>>  		return NULL;
>>>  	}
>>>
>>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
>> rte_eth_dev_data));
>>>  	eth_dev = eth_dev_get(port_id);
>>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
>> "%s", name);
>>>  	eth_dev->data->port_id = port_id;
>>> @@ -252,6 +251,7 @@ struct rte_eth_dev *
>>>  	if (eth_dev == NULL)
>>>  		return -EINVAL;
>>>
>>> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
>>
>> Hi Matan,
>>
>> What most of the vdev release path does is:
>>
>> eth_dev = rte_eth_dev_allocated(...)
>> rte_free(eth_dev->data->dev_private);
>> rte_free(eth_dev->data);
>> rte_eth_dev_release_port(eth_dev);
>>
>> Since eth_dev->data freed, memset() it in rte_eth_dev_release_port() will
>> be problem.
>>
>> We don't run remove path that is why we didn't hit the issue but this seems
>> problem for all virtual PMDs.
> 
> Yes, it is a problem and should be fixed:
> For vdevs which use private rte_eth_dev_data the remove order can be:
> 	private_data = eth_dev->data;
> 	rte_free(eth_dev->data->dev_private);
> 	rte_eth_dev_release_port(eth_dev); /* The last operation working on ethdev structure. */
> 	rte_free(private_data);

Do we need to save "private_data"?

> 
> 
>> Also rte_eth_dev_pci_release() looks problematic now.
> 
> Yes, again, the last operation working on ethdev structure should be rte_eth_dev_release_port().
> 
> So need to fix all vdevs and the rte_eth_dev_pci_release() function.
> 
> Any comments?
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-05 15:06           ` Ferruh Yigit
@ 2018-03-05 15:12             ` Matan Azrad
  2018-03-27 22:37               ` Ferruh Yigit
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-03-05 15:12 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

Hi Ferruh

From: Ferruh Yigit, Sent: Monday, March 5, 2018 5:07 PM
> On 3/5/2018 2:52 PM, Matan Azrad wrote:
> > HI
> >
> > From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
> >> On 1/18/2018 4:35 PM, Matan Azrad wrote:
> >>> rte_eth_dev_data structure is allocated per ethdev port and can be
> >>> used to get a data of the port internally.
> >>>
> >>> rte_eth_dev_attach_secondary tries to find the port identifier using
> >>> rte_eth_dev_data name field comparison and may get an identifier of
> >>> invalid port in case of this port was released by the primary
> >>> process because the port release API doesn't reset the port data.
> >>>
> >>> So, it will be better to reset the port data in release time instead
> >>> of allocation time.
> >>>
> >>> Move the port data reset to the port release API.
> >>>
> >>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
> >>> process model")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>> ---
> >>>  lib/librte_ether/rte_ethdev.c | 2 +-
> >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/librte_ether/rte_ethdev.c
> >>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
> >>> --- a/lib/librte_ether/rte_ethdev.c
> >>> +++ b/lib/librte_ether/rte_ethdev.c
> >>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
> >>>  		return NULL;
> >>>  	}
> >>>
> >>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
> >> rte_eth_dev_data));
> >>>  	eth_dev = eth_dev_get(port_id);
> >>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
> >> "%s", name);
> >>>  	eth_dev->data->port_id = port_id;
> >>> @@ -252,6 +251,7 @@ struct rte_eth_dev *
> >>>  	if (eth_dev == NULL)
> >>>  		return -EINVAL;
> >>>
> >>> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
> >>
> >> Hi Matan,
> >>
> >> What most of the vdev release path does is:
> >>
> >> eth_dev = rte_eth_dev_allocated(...)
> >> rte_free(eth_dev->data->dev_private);
> >> rte_free(eth_dev->data);
> >> rte_eth_dev_release_port(eth_dev);
> >>
> >> Since eth_dev->data freed, memset() it in rte_eth_dev_release_port()
> >> will be problem.
> >>
> >> We don't run remove path that is why we didn't hit the issue but this
> >> seems problem for all virtual PMDs.
> >
> > Yes, it is a problem and should be fixed:
> > For vdevs which use private rte_eth_dev_data the remove order can be:
> > 	private_data = eth_dev->data;
> > 	rte_free(eth_dev->data->dev_private);
> > 	rte_eth_dev_release_port(eth_dev); /* The last operation working
> on ethdev structure. */
> > 	rte_free(private_data);
> 
> Do we need to save "private_data"?

Just to emphasize that the eth_dev structure should no longer be used after rte_eth_dev_release_port().
Maybe in the future rte_eth_dev_release_port() will zero the eth_dev structure too :)

> >
> >
> >> Also rte_eth_dev_pci_release() looks problematic now.
> >
> > Yes, again, the last operation working on ethdev structure should be
> rte_eth_dev_release_port().
> >
> > So need to fix all vdevs and the rte_eth_dev_pci_release() function.
> >
> > Any comments?
> >


^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-05 15:12             ` Matan Azrad
@ 2018-03-27 22:37               ` Ferruh Yigit
  2018-03-28 12:07                 ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-03-27 22:37 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

On 3/5/2018 3:12 PM, Matan Azrad wrote:
> Hi Ferruh
> 
> From: Ferruh Yigit, Sent: Monday, March 5, 2018 5:07 PM
>> On 3/5/2018 2:52 PM, Matan Azrad wrote:
>>> HI
>>>
>>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
>>>> On 1/18/2018 4:35 PM, Matan Azrad wrote:
>>>>> rte_eth_dev_data structure is allocated per ethdev port and can be
>>>>> used to get a data of the port internally.
>>>>>
>>>>> rte_eth_dev_attach_secondary tries to find the port identifier using
>>>>> rte_eth_dev_data name field comparison and may get an identifier of
>>>>> invalid port in case of this port was released by the primary
>>>>> process because the port release API doesn't reset the port data.
>>>>>
>>>>> So, it will be better to reset the port data in release time instead
>>>>> of allocation time.
>>>>>
>>>>> Move the port data reset to the port release API.
>>>>>
>>>>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
>>>>> process model")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>>>> ---
>>>>>  lib/librte_ether/rte_ethdev.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/librte_ether/rte_ethdev.c
>>>>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
>>>>> --- a/lib/librte_ether/rte_ethdev.c
>>>>> +++ b/lib/librte_ether/rte_ethdev.c
>>>>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>>>>>  		return NULL;
>>>>>  	}
>>>>>
>>>>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
>>>> rte_eth_dev_data));
>>>>>  	eth_dev = eth_dev_get(port_id);
>>>>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
>>>> "%s", name);
>>>>>  	eth_dev->data->port_id = port_id;
>>>>> @@ -252,6 +251,7 @@ struct rte_eth_dev *
>>>>>  	if (eth_dev == NULL)
>>>>>  		return -EINVAL;
>>>>>
>>>>> +	memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
>>>>
>>>> Hi Matan,
>>>>
>>>> What most of the vdev release path does is:
>>>>
>>>> eth_dev = rte_eth_dev_allocated(...)
>>>> rte_free(eth_dev->data->dev_private);
>>>> rte_free(eth_dev->data);
>>>> rte_eth_dev_release_port(eth_dev);
>>>>
>>>> Since eth_dev->data freed, memset() it in rte_eth_dev_release_port()
>>>> will be problem.
>>>>
>>>> We don't run remove path that is why we didn't hit the issue but this
>>>> seems problem for all virtual PMDs.
>>>
>>> Yes, it is a problem and should be fixed:
>>> For vdevs which use private rte_eth_dev_data the remove order can be:
>>> 	private_data = eth_dev->data;
>>> 	rte_free(eth_dev->data->dev_private);
>>> 	rte_eth_dev_release_port(eth_dev); /* The last operation working
>> on ethdev structure. */
>>> 	rte_free(private_data);
>>
>> Do we need to save "private_data"?
> 
> Just to emphasis that eth_dev structure should not more be available after rte_eth_dev_release_port().
> Maybe in the future rte_eth_dev_release_port() will zero eth_dev structure too :)

Hi Matan,

A reminder about this issue; it would be nice to fix it in this release.

> 
>>>
>>>
>>>> Also rte_eth_dev_pci_release() looks problematic now.
>>>
>>> Yes, again, the last operation working on ethdev structure should be
>> rte_eth_dev_release_port().
>>>
>>> So need to fix all vdevs and the rte_eth_dev_pci_release() function.
>>>
>>> Any comments?
>>>
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-27 22:37               ` Ferruh Yigit
@ 2018-03-28 12:07                 ` Matan Azrad
  2018-03-30 10:39                   ` Ferruh Yigit
  0 siblings, 1 reply; 214+ messages in thread
From: Matan Azrad @ 2018-03-28 12:07 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

Hi Ferruh

> From: Ferruh Yigit, Wednesday, March 28, 2018 1:38 AM
> On 3/5/2018 3:12 PM, Matan Azrad wrote:
> > Hi Ferruh
> >
> > From: Ferruh Yigit, Sent: Monday, March 5, 2018 5:07 PM
> >> On 3/5/2018 2:52 PM, Matan Azrad wrote:
> >>> HI
> >>>
> >>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
> >>>> On 1/18/2018 4:35 PM, Matan Azrad wrote:
> >>>>> rte_eth_dev_data structure is allocated per ethdev port and can be
> >>>>> used to get a data of the port internally.
> >>>>>
> >>>>> rte_eth_dev_attach_secondary tries to find the port identifier
> >>>>> using rte_eth_dev_data name field comparison and may get an
> >>>>> identifier of invalid port in case of this port was released by
> >>>>> the primary process because the port release API doesn't reset the
> port data.
> >>>>>
> >>>>> So, it will be better to reset the port data in release time
> >>>>> instead of allocation time.
> >>>>>
> >>>>> Move the port data reset to the port release API.
> >>>>>
> >>>>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
> >>>>> process model")
> >>>>> Cc: stable@dpdk.org
> >>>>>
> >>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>>>> ---
> >>>>>  lib/librte_ether/rte_ethdev.c | 2 +-
> >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/lib/librte_ether/rte_ethdev.c
> >>>>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
> >>>>> --- a/lib/librte_ether/rte_ethdev.c
> >>>>> +++ b/lib/librte_ether/rte_ethdev.c
> >>>>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
> >>>>>  		return NULL;
> >>>>>  	}
> >>>>>
> >>>>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
> >>>> rte_eth_dev_data));
> >>>>>  	eth_dev = eth_dev_get(port_id);
> >>>>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
> >>>> "%s", name);
> >>>>>  	eth_dev->data->port_id = port_id; @@ -252,6 +251,7 @@ struct
> >>>>> rte_eth_dev *
> >>>>>  	if (eth_dev == NULL)
> >>>>>  		return -EINVAL;
> >>>>>
> >>>>> +	memset(eth_dev->data, 0, sizeof(struct
> rte_eth_dev_data));
> >>>>
> >>>> Hi Matan,
> >>>>
> >>>> What most of the vdev release path does is:
> >>>>
> >>>> eth_dev = rte_eth_dev_allocated(...)
> >>>> rte_free(eth_dev->data->dev_private);
> >>>> rte_free(eth_dev->data);
> >>>> rte_eth_dev_release_port(eth_dev);
> >>>>
> >>>> Since eth_dev->data freed, memset() it in
> >>>> rte_eth_dev_release_port() will be problem.
> >>>>
> >>>> We don't run remove path that is why we didn't hit the issue but
> >>>> this seems problem for all virtual PMDs.
> >>>
> >>> Yes, it is a problem and should be fixed:
> >>> For vdevs which use private rte_eth_dev_data the remove order can
> be:
> >>> 	private_data = eth_dev->data;
> >>> 	rte_free(eth_dev->data->dev_private);
> >>> 	rte_eth_dev_release_port(eth_dev); /* The last operation working
> >> on ethdev structure. */
> >>> 	rte_free(private_data);
> >>
> >> Do we need to save "private_data"?
> >
> > Just to emphasis that eth_dev structure should not more be available after
> rte_eth_dev_release_port().
> > Maybe in the future rte_eth_dev_release_port() will zero eth_dev
> > structure too :)
> 
> Hi Matan,
> 
> Reminder of this issue, it would be nice to fix in this release.
> 

Regarding the private rte_eth_dev_data, it should be fixed in the following thread:
https://dpdk.org/dev/patchwork/patch/35632/

Regarding the rte_eth_dev_pci_release() function: I'm going to send a fix.

> >
> >>>
> >>>
> >>>> Also rte_eth_dev_pci_release() looks problematic now.
> >>>
> >>> Yes, again, the last operation working on ethdev structure should be
> >> rte_eth_dev_release_port().
> >>>
> >>> So need to fix all vdevs and the rte_eth_dev_pci_release() function.
> >>>
> >>> Any comments?
> >>>
> >


^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-28 12:07                 ` Matan Azrad
@ 2018-03-30 10:39                   ` Ferruh Yigit
  2018-04-19 11:07                     ` Ferruh Yigit
  0 siblings, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-03-30 10:39 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable

On 3/28/2018 1:07 PM, Matan Azrad wrote:
> Hi Ferruh
> 
>> From: Ferruh Yigit, Wednesday, March 28, 2018 1:38 AM
>> On 3/5/2018 3:12 PM, Matan Azrad wrote:
>>> Hi Ferruh
>>>
>>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 5:07 PM
>>>> On 3/5/2018 2:52 PM, Matan Azrad wrote:
>>>>> HI
>>>>>
>>>>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
>>>>>> On 1/18/2018 4:35 PM, Matan Azrad wrote:
>>>>>>> rte_eth_dev_data structure is allocated per ethdev port and can be
>>>>>>> used to get a data of the port internally.
>>>>>>>
>>>>>>> rte_eth_dev_attach_secondary tries to find the port identifier
>>>>>>> using rte_eth_dev_data name field comparison and may get an
>>>>>>> identifier of invalid port in case of this port was released by
>>>>>>> the primary process because the port release API doesn't reset the
>> port data.
>>>>>>>
>>>>>>> So, it will be better to reset the port data in release time
>>>>>>> instead of allocation time.
>>>>>>>
>>>>>>> Move the port data reset to the port release API.
>>>>>>>
>>>>>>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
>>>>>>> process model")
>>>>>>> Cc: stable@dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>>>>>> ---
>>>>>>>  lib/librte_ether/rte_ethdev.c | 2 +-
>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_ether/rte_ethdev.c
>>>>>>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
>>>>>>> --- a/lib/librte_ether/rte_ethdev.c
>>>>>>> +++ b/lib/librte_ether/rte_ethdev.c
>>>>>>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>>>>>>>  		return NULL;
>>>>>>>  	}
>>>>>>>
>>>>>>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
>>>>>> rte_eth_dev_data));
>>>>>>>  	eth_dev = eth_dev_get(port_id);
>>>>>>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
>>>>>> "%s", name);
>>>>>>>  	eth_dev->data->port_id = port_id; @@ -252,6 +251,7 @@ struct
>>>>>>> rte_eth_dev *
>>>>>>>  	if (eth_dev == NULL)
>>>>>>>  		return -EINVAL;
>>>>>>>
>>>>>>> +	memset(eth_dev->data, 0, sizeof(struct
>> rte_eth_dev_data));
>>>>>>
>>>>>> Hi Matan,
>>>>>>
>>>>>> What most of the vdev release path does is:
>>>>>>
>>>>>> eth_dev = rte_eth_dev_allocated(...)
>>>>>> rte_free(eth_dev->data->dev_private);
>>>>>> rte_free(eth_dev->data);
>>>>>> rte_eth_dev_release_port(eth_dev);
>>>>>>
>>>>>> Since eth_dev->data freed, memset() it in
>>>>>> rte_eth_dev_release_port() will be problem.
>>>>>>
>>>>>> We don't run remove path that is why we didn't hit the issue but
>>>>>> this seems problem for all virtual PMDs.
>>>>>
>>>>> Yes, it is a problem and should be fixed:
>>>>> For vdevs which use private rte_eth_dev_data the remove order can
>> be:
>>>>> 	private_data = eth_dev->data;
>>>>> 	rte_free(eth_dev->data->dev_private);
>>>>> 	rte_eth_dev_release_port(eth_dev); /* The last operation working
>>>> on ethdev structure. */
>>>>> 	rte_free(private_data);
>>>>
>>>> Do we need to save "private_data"?
>>>
>>> Just to emphasis that eth_dev structure should not more be available after
>> rte_eth_dev_release_port().
>>> Maybe in the future rte_eth_dev_release_port() will zero eth_dev
>>> structure too :)
>>
>> Hi Matan,
>>
>> Reminder of this issue, it would be nice to fix in this release.
>>
> 
> Regarding the private rte_eth_dev_data, it should be fixed in the next thread:
> https://dpdk.org/dev/patchwork/patch/35632/
> 
> Regarding the rte_eth_dev_pci_release() function: I'm going to send a fix.

Thanks for the patch, Matan.

However, rte_eth_dev_release_port() is still broken because of this change; please
check _rte_eth_dev_callback_process(), which uses dev->data->port_id.
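
One way to read this concern, as a sketch only: if the release function resets
the data and only afterwards runs a notification that reads data->port_id, the
notification sees 0 instead of the real port. The exact order inside
rte_eth_dev_release_port() is an assumption here, and every demo_* name below is
hypothetical rather than taken from ethdev.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the ethdev structures. */
struct demo_dev_data {
	uint16_t port_id;
	char     name[32];
};

struct demo_eth_dev {
	struct demo_dev_data *data;
};

/* Records which port the last "destroy" notification reported. */
static uint16_t demo_last_notified_port = UINT16_MAX;

/* Hypothetical event hook: like the callback path named above,
 * it reads the port identifier from dev->data. */
static void
demo_notify_destroy(const struct demo_eth_dev *dev)
{
	assert(dev != NULL && dev->data != NULL);
	demo_last_notified_port = dev->data->port_id;
}

/* Assumed problematic order: reset first, notify second. */
static void
demo_release_reset_then_notify(struct demo_eth_dev *dev)
{
	memset(dev->data, 0, sizeof(*dev->data));
	demo_notify_destroy(dev); /* reads port_id after it was zeroed */
}

/* Alternative order: notify while the data is intact, then reset. */
static void
demo_release_notify_then_reset(struct demo_eth_dev *dev)
{
	demo_notify_destroy(dev);
	memset(dev->data, 0, sizeof(*dev->data));
}
```

In the first variant a listener for port 5 would be told about port 0; in the
second it gets the right identifier and the data is still reset on release.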

> 
>>>
>>>>>
>>>>>
>>>>>> Also rte_eth_dev_pci_release() looks problematic now.
>>>>>
>>>>> Yes, again, the last operation working on ethdev structure should be
>>>> rte_eth_dev_release_port().
>>>>>
>>>>> So need to fix all vdevs and the rte_eth_dev_pci_release() function.
>>>>>
>>>>> Any comments?
>>>>>
>>>
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-03-30 10:39                   ` Ferruh Yigit
@ 2018-04-19 11:07                     ` Ferruh Yigit
  2018-04-25 12:16                       ` Matan Azrad
  0 siblings, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-04-19 11:07 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable,
	Olga Shern

On 3/30/2018 11:39 AM, Ferruh Yigit wrote:
> On 3/28/2018 1:07 PM, Matan Azrad wrote:
>> Hi Ferruh
>>
>>> From: Ferruh Yigit, Wednesday, March 28, 2018 1:38 AM
>>> On 3/5/2018 3:12 PM, Matan Azrad wrote:
>>>> Hi Ferruh
>>>>
>>>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 5:07 PM
>>>>> On 3/5/2018 2:52 PM, Matan Azrad wrote:
>>>>>> HI
>>>>>>
>>>>>> From: Ferruh Yigit, Sent: Monday, March 5, 2018 1:24 PM
>>>>>>> On 1/18/2018 4:35 PM, Matan Azrad wrote:
>>>>>>>> rte_eth_dev_data structure is allocated per ethdev port and can be
>>>>>>>> used to get a data of the port internally.
>>>>>>>>
>>>>>>>> rte_eth_dev_attach_secondary tries to find the port identifier
>>>>>>>> using rte_eth_dev_data name field comparison and may get an
>>>>>>>> identifier of invalid port in case of this port was released by
>>>>>>>> the primary process because the port release API doesn't reset the
>>> port data.
>>>>>>>>
>>>>>>>> So, it will be better to reset the port data in release time
>>>>>>>> instead of allocation time.
>>>>>>>>
>>>>>>>> Move the port data reset to the port release API.
>>>>>>>>
>>>>>>>> Fixes: d948f596fee2 ("ethdev: fix port data mismatched in multiple
>>>>>>>> process model")
>>>>>>>> Cc: stable@dpdk.org
>>>>>>>>
>>>>>>>> Signed-off-by: Matan Azrad <matan@mellanox.com>
>>>>>>>> ---
>>>>>>>>  lib/librte_ether/rte_ethdev.c | 2 +-
>>>>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_ether/rte_ethdev.c
>>>>>>>> b/lib/librte_ether/rte_ethdev.c index 7044159..156231c 100644
>>>>>>>> --- a/lib/librte_ether/rte_ethdev.c
>>>>>>>> +++ b/lib/librte_ether/rte_ethdev.c
>>>>>>>> @@ -204,7 +204,6 @@ struct rte_eth_dev *
>>>>>>>>  		return NULL;
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> -	memset(&rte_eth_dev_data[port_id], 0, sizeof(struct
>>>>>>> rte_eth_dev_data));
>>>>>>>>  	eth_dev = eth_dev_get(port_id);
>>>>>>>>  	snprintf(eth_dev->data->name, sizeof(eth_dev->data->name),
>>>>>>> "%s", name);
>>>>>>>>  	eth_dev->data->port_id = port_id; @@ -252,6 +251,7 @@ struct
>>>>>>>> rte_eth_dev *
>>>>>>>>  	if (eth_dev == NULL)
>>>>>>>>  		return -EINVAL;
>>>>>>>>
>>>>>>>> +	memset(eth_dev->data, 0, sizeof(struct
>>> rte_eth_dev_data));
>>>>>>>
>>>>>>> Hi Matan,
>>>>>>>
>>>>>>> What most of the vdev release path does is:
>>>>>>>
>>>>>>> eth_dev = rte_eth_dev_allocated(...)
>>>>>>> rte_free(eth_dev->data->dev_private);
>>>>>>> rte_free(eth_dev->data);
>>>>>>> rte_eth_dev_release_port(eth_dev);
>>>>>>>
>>>>>>> Since eth_dev->data freed, memset() it in
>>>>>>> rte_eth_dev_release_port() will be problem.
>>>>>>>
>>>>>>> We don't run remove path that is why we didn't hit the issue but
>>>>>>> this seems problem for all virtual PMDs.
>>>>>>
>>>>>> Yes, it is a problem and should be fixed:
>>>>>> For vdevs which use private rte_eth_dev_data the remove order can
>>> be:
>>>>>> 	private_data = eth_dev->data;
>>>>>> 	rte_free(eth_dev->data->dev_private);
>>>>>> 	rte_eth_dev_release_port(eth_dev); /* The last operation working
>>>>> on ethdev structure. */
>>>>>> 	rte_free(private_data);
>>>>>
>>>>> Do we need to save "private_data"?
>>>>
>>>> Just to emphasis that eth_dev structure should not more be available after
>>> rte_eth_dev_release_port().
>>>> Maybe in the future rte_eth_dev_release_port() will zero eth_dev
>>>> structure too :)
>>>
>>> Hi Matan,
>>>
>>> Reminder of this issue, it would be nice to fix in this release.
>>>
>>
>> Regarding the private rte_eth_dev_data, it should be fixed in the next thread:
>> https://dpdk.org/dev/patchwork/patch/35632/
>>
>> Regarding the rte_eth_dev_pci_release() function: I'm going to send a fix.
> 
> Thanks Matan for the patch,
> 
> But rte_eth_dev_release_port() is still broken because of this change, please
> check _rte_eth_dev_callback_process() which uses dev->data->port_id.

Hi Matan,

Any update on this?
As mentioned above rte_eth_dev_release_port() is still broken.

Thanks,
ferruh

> 
>>
>>>>
>>>>>>
>>>>>>
>>>>>>> Also rte_eth_dev_pci_release() looks problematic now.
>>>>>>
>>>>>> Yes, again, the last operation working on ethdev structure should be
>>>>> rte_eth_dev_release_port().
>>>>>>
>>>>>> So need to fix all vdevs and the rte_eth_dev_pci_release() function.
>>>>>>
>>>>>> Any comments?
>>>>>>
>>>>
>>
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-04-19 11:07                     ` Ferruh Yigit
@ 2018-04-25 12:16                       ` Matan Azrad
  2018-04-25 12:30                         ` Ori Kam
  2018-04-25 12:54                         ` Ferruh Yigit
  0 siblings, 2 replies; 214+ messages in thread
From: Matan Azrad @ 2018-04-25 12:16 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable,
	Olga Shern

Hi all

From: Ferruh Yigit, Thursday, April 19, 2018 2:08 PM
> > But rte_eth_dev_release_port() is still broken because of this change,
> > please check _rte_eth_dev_callback_process() which uses
> > dev->data->port_id.

The issue is that a DESTROY callback always gets port_id=0, regardless of the destroyed port id
(the port data is reset before the callback is processed).

Let's discuss the fix:

There are 2 options for the meaning of the DESTROY event:

1. The device is going to be destroyed in the future (shortly after the callbacks are called).
	The user may assume that the device structure still holds valid data at callback time,
	and thus may use it.
	The fix here is to move the callback to the start of the function,
	when the data field is still valid.

2. The device was already destroyed in the past (shortly before the callbacks are called).
	The user should assume that the device structure holds no valid data at callback time,
	and thus should not use it.
	The issue here:
	_rte_eth_dev_callback_process() assumes that the data field is valid all the time,
	but in this case it is not valid because the device was already destroyed.
	Optional fixes:
	1. Always keep the data->port_id valid.
	2. Keep the data->port_id valid only for the _rte_eth_dev_callback_process() call.
	3. Change _rte_eth_dev_callback_process() arg from "struct rte_eth_dev *dev" to "uint16_t port_id"
		a. Need to change all the calls for this internal API.
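
To make the difference concrete, here is a minimal toy sketch (not the real ethdev code). All toy_* names are hypothetical stand-ins. It contrasts a release path that reads the port id back from already-reset data with one that captures the id first and passes it by value, as in the last sub-option above:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins; these are NOT the real ethdev structures. */
struct toy_dev_data {
	uint16_t port_id;
	char name[32];
};

struct toy_dev {
	struct toy_dev_data *data;
};

/* Port id seen by the most recent (toy) DESTROY callback. */
static uint16_t last_destroy_port_id;

static void toy_destroy_cb(uint16_t port_id)
{
	last_destroy_port_id = port_id;
}

/* Data is reset first, then the callback path reads the port id
 * back out of the already-zeroed data. */
static uint16_t toy_release_reads_data(struct toy_dev *dev)
{
	assert(dev != NULL && dev->data != NULL);
	memset(dev->data, 0, sizeof(*dev->data));
	toy_destroy_cb(dev->data->port_id); /* reads the zeroed field */
	return last_destroy_port_id;
}

/* Port id is captured before the reset and passed by value. */
static uint16_t toy_release_passes_id(struct toy_dev *dev)
{
	uint16_t port_id;

	assert(dev != NULL && dev->data != NULL);
	port_id = dev->data->port_id;
	memset(dev->data, 0, sizeof(*dev->data));
	toy_destroy_cb(port_id);
	return last_destroy_port_id;
}
```

In this toy model the first variant reports port 0 whatever port was released, while the second reports the released port. The cost in the real code would be the one noted above: every caller of the internal API has to change.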

I vote for 2.1.


What do you think?

Matan.
	




^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-04-25 12:16                       ` Matan Azrad
@ 2018-04-25 12:30                         ` Ori Kam
  2018-04-25 12:54                         ` Ferruh Yigit
  1 sibling, 0 replies; 214+ messages in thread
From: Ori Kam @ 2018-04-25 12:30 UTC (permalink / raw)
  To: Matan Azrad, Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable,
	Olga Shern

Hi

I vote for 2.3.

Ori

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Matan Azrad
> Sent: Wednesday, April 25, 2018 3:16 PM
> To: Ferruh Yigit <ferruh.yigit@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Gaetan Rivet <gaetan.rivet@6wind.com>; Jingjing
> Wu <jingjing.wu@intel.com>
> Cc: dev@dpdk.org; Neil Horman <nhorman@tuxdriver.com>; Bruce
> Richardson <bruce.richardson@intel.com>; Konstantin Ananyev
> <konstantin.ananyev@intel.com>; stable@dpdk.org; Olga Shern
> <olgas@mellanox.com>
> Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data
> reset timing
> 
> Hi all
> 
> From: Ferruh Yigit, Thursday, April 19, 2018 2:08 PM
> > > > But rte_eth_dev_release_port() is still broken because of this
> > > > change, please check _rte_eth_dev_callback_process() which uses
> > > > dev->data->port_id.
> 
> The issue is that a DESTROY callback gets port_id=0 all the time, regardless
> the destroyed port id.
> 
> Let's discuss about the fix:
> 
> There are 2 options for the DESTROY event meaning:
> 
> 1. The device is going to be destroyed in the future (a bit after the callbacks
> calling).
> 	The user may think that there is a valid data in the device structure in
> the callback time,
> 	Thus, he may use it.
> 	The fix here is to move the callback to the start of the function,
> 	In this time the data field is still valid.
> 
> 2. The device was already destroyed in the past (a bit before the callbacks
> calling).
> 	The user should think that there is no any valid data in the device
> structure in the callback time,
> 	Thus, he doesn't use it.
> 	The issue here:
> 	_rte_eth_dev_callback_process() assumes there is a valid data in the
> data field  all the time,
> 	But in this case the data field is not valid because the device was
> already destroyed.
> 	Optional fixes:
> 	1. Always keep the data->port_id valid.
> 	2. keep the data->port_id valid only for the
> _rte_eth_dev_callback_process() call.
> 	3. Change _rte_eth_dev_callback_process() arg from "struct
> rte_eth_dev *dev" to "uint16_t port_id"
> 		a. Need to change all the calls for this internal API.
> 
> I vote to 2.1.
> 
> 
> What do you think?
> 
> Matan.
> 
> 
> 


^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-04-25 12:16                       ` Matan Azrad
  2018-04-25 12:30                         ` Ori Kam
@ 2018-04-25 12:54                         ` Ferruh Yigit
  2018-04-25 14:01                           ` Matan Azrad
  1 sibling, 1 reply; 214+ messages in thread
From: Ferruh Yigit @ 2018-04-25 12:54 UTC (permalink / raw)
  To: Matan Azrad, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable,
	Olga Shern

On 4/25/2018 1:16 PM, Matan Azrad wrote:
> Hi all
> 
> From: Ferruh Yigit, Thursday, April 19, 2018 2:08 PM
>>> But rte_eth_dev_release_port() is still broken because of this change,
>>> please check _rte_eth_dev_callback_process() which uses
>>> dev->data->port_id.
> 
> The issue is that a DESTROY callback gets port_id=0 all the time, regardless the destroyed port id.
> 
> Let's discuss about the fix:
> 
> There are 2 options for the DESTROY event meaning:
> 
> 1. The device is going to be destroyed in the future (a bit after the callbacks calling).
> 	The user may think that there is a valid data in the device structure in the callback time,
> 	Thus, he may use it.
> 	The fix here is to move the callback to the start of the function,
> 	In this time the data field is still valid.
> 
> 2. The device was already destroyed in the past (a bit before the callbacks calling).
> 	The user should think that there is no any valid data in the device structure in the callback time,
> 	Thus, he doesn't use it.
> 	The issue here:
> 	_rte_eth_dev_callback_process() assumes there is a valid data in the data field  all the time,
> 	But in this case the data field is not valid because the device was already destroyed.
> 	Optional fixes:
> 	1. Always keep the data->port_id valid.
> 	2. keep the data->port_id valid only for the _rte_eth_dev_callback_process() call.
> 	3. Change _rte_eth_dev_callback_process() arg from "struct rte_eth_dev *dev" to "uint16_t port_id"
> 		a. Need to change all the calls for this internal API.
> 
> I vote to 2.1.
> 
> 
> What do you think?

What is the concern with 1? It is easy to implement.

And it may be better: if the callback is called after the device is destroyed,
there is no guarantee/locking that the same port won't be re-used, so
rte_eth_dev_data can be updated in the middle of the callback function, no?

> 
> Matan.
> 	
> 
> 
> 

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/7] ethdev: fix port data reset timing
  2018-04-25 12:54                         ` Ferruh Yigit
@ 2018-04-25 14:01                           ` Matan Azrad
  0 siblings, 0 replies; 214+ messages in thread
From: Matan Azrad @ 2018-04-25 14:01 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Gaetan Rivet, Jingjing Wu
  Cc: dev, Neil Horman, Bruce Richardson, Konstantin Ananyev, stable,
	Olga Shern


Hi Ferruh

 From: Ferruh Yigit, Wednesday, April 25, 2018 3:54 PM
> On 4/25/2018 1:16 PM, Matan Azrad wrote:
> > Hi all
> >
> > From: Ferruh Yigit, Thursday, April 19, 2018 2:08 PM
> >>> But rte_eth_dev_release_port() is still broken because of this
> >>> change, please check _rte_eth_dev_callback_process() which uses
> >>> dev->data->port_id.
> >
> > The issue is that a DESTROY callback gets port_id=0 all the time, regardless
> the destroyed port id.
> >
> > Let's discuss about the fix:
> >
> > There are 2 options for the DESTROY event meaning:
> >
> > 1. The device is going to be destroyed in the future (a bit after the callbacks
> calling).
> > 	The user may think that there is a valid data in the device structure in
> the callback time,
> > 	Thus, he may use it.
> > 	The fix here is to move the callback to the start of the function,
> > 	In this time the data field is still valid.
> >
> > 2. The device was already destroyed in the past (a bit before the callbacks
> calling).
> > 	The user should think that there is no any valid data in the device
> structure in the callback time,
> > 	Thus, he doesn't use it.
> > 	The issue here:
> > 	_rte_eth_dev_callback_process() assumes there is a valid data in the
> data field  all the time,
> > 	But in this case the data field is not valid because the device was
> already destroyed.
> > 	Optional fixes:
> > 	1. Always keep the data->port_id valid.
> > 	2. keep the data->port_id valid only for the
> _rte_eth_dev_callback_process() call.
> > 	3. Change _rte_eth_dev_callback_process() arg from "struct
> rte_eth_dev *dev" to "uint16_t port_id"
> > 		a. Need to change all the calls for this internal API.
> >
> > I vote to 2.1.
> >
> >
> > What do you think?
> 
> What is the concern with 1? It is easy to implement.
> 
Yes, but 2.1 and 2.2 are easy too.

> And it may be better because if callback called after device destroyed, there
> is no guarantee/locking that same port won't be re-used, in the middle of the
> callback function rte_eth_dev_data can be updated, no?
> 

Good point!

I think we must guarantee that the same port id is not allocated again during callback time.
I also prefer not to call the callbacks inside the critical section.

So maybe it is better to call them before taking the lock.
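
A rough sketch of that ordering, using hypothetical toy_* names and a plain flag standing in for the real lock (so this only shows the order of operations, it is not a thread-safe implementation):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy port slot; NOT the real rte_eth_dev_data. */
struct toy_port {
	uint16_t port_id;
	bool in_use;
};

/* Plain flag standing in for the shared-data lock (no real locking). */
static bool toy_lock_held;

/* What the toy DESTROY callback observed. */
static uint16_t seen_port_id;
static bool seen_in_use;
static bool seen_lock_held;

static void toy_destroy_cb(const struct toy_port *port)
{
	seen_port_id = port->port_id;
	seen_in_use = port->in_use;
	seen_lock_held = toy_lock_held;
}

static void toy_release_port(struct toy_port *port)
{
	assert(port != NULL);

	/* 1. Notify first: the slot is still marked in use, so its id
	 *    cannot be handed out again yet, and no lock is held. */
	toy_destroy_cb(port);

	/* 2. Only then enter the critical section and reset the slot. */
	toy_lock_held = true;
	memset(port, 0, sizeof(*port));
	toy_lock_held = false;
}
```

With this order the callback sees the real port id, runs outside the critical section, and the slot only becomes reusable after the callbacks have returned.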


^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
  2017-12-06  0:40 [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Ananyev, Konstantin
@ 2017-12-06  9:22 ` Thomas Monjalon
  0 siblings, 0 replies; 214+ messages in thread
From: Thomas Monjalon @ 2017-12-06  9:22 UTC (permalink / raw)
  To: Ananyev, Konstantin
  Cc: Matan Azrad, Neil Horman (nhorman@tuxdriver.com),
	Wu, Jingjing, Gaëtan Rivet, dev

06/12/2017 01:40, Ananyev, Konstantin:
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> > 05/12/2017 16:13, Ananyev, Konstantin:

> > > > Keep in mind that the owner can be an application thread.
> > > > If you prefer using a single function pointer (may help for
> > > > atomic implementation), we can allocate an owner structure containing
> > > > a name as a string to identify the owner in human readable format.
> > > > Then we just have to set the pointer of this struct to rte_eth_dev_data.
> > >
> > > Basically you'd like to have an ability to set something different then
> > > pointer to rte_eth_dev_data as an owner, right?
> > 
> > No, it can be a pointer, or an id, I don't care.
> 
> My question was about a pointer to a specific struct: rte_eth_dev_data.
> As I understand you want it to be a pointer to something else?
> Probably to some new struct rte_eth_dev_owner or so...

I have no strong opinion.
The requirement is to identify owners with something unique,
and to be able to print a pretty name.
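
As an illustration only, one way to meet both requirements is a small owner record with a unique id plus a name. Everything below (toy_owner, TOY_OWNER_NAME_LEN, the id counter) is a hypothetical sketch, not the API proposed in the patch:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_OWNER_NAME_LEN 64
#define TOY_NO_OWNER 0

struct toy_owner {
	uint64_t id;                   /* unique; 0 means "no owner" */
	char name[TOY_OWNER_NAME_LEN]; /* human-readable label */
};

/* Next id to hand out; a real version would need synchronization. */
static uint64_t toy_next_owner_id = 1;

static void toy_owner_init(struct toy_owner *owner, const char *name)
{
	assert(owner != NULL && name != NULL);
	owner->id = toy_next_owner_id++;
	snprintf(owner->name, sizeof(owner->name), "%s", name);
}

/* Format a printable description; returns what snprintf() returns. */
static int toy_owner_describe(const struct toy_owner *owner,
			      char *buf, size_t len)
{
	if (owner == NULL || owner->id == TOY_NO_OWNER)
		return snprintf(buf, len, "no owner");
	return snprintf(buf, len, "%s (owner id %" PRIu64 ")",
			owner->name, owner->id);
}
```

The id is what gets compared, the name is only for logs and debugging, so an application thread can be an owner just as well as a PMD.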


> > > I think this is possible too, just not sure it will useful.
> > >
> > > > > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > > > > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > > > > to own/manage several low-level devices.
> > > > > So in normal situation user wouldn't need to use that API directly at all.
> > >
> > > > Again, the application may use this API to declare its ownership.
> > >
> > > Could you explain that a bit: what would mean 'application declares an ownership on device'?
> > > Does it mean that no other application will be allowed to do any control op on that device
> > > till application will clear its ownership?
> > > I.E. make sure that at each moment only one particular thread can modify device configuration?
> > > Or would it be totally informal and second application will be free to ignore it?
> > 
> > It is an information.
> > It will avoid a library taking ownership on a port controlled by the app.
> 
> Ok, let's step back a bit.
> As I can see there are 2 separate issues/design choices inside rte_ethdev :)  :
> 1) control API is not MT safe - at any given moment 2 (or more) threads (/processes)
>     can try to change config of a given device.
>    Right now we leave that sync effort to the upper layer.
> 2) Even with the same thread 2 (or more) DPDK entities (high level PMD/third party lib, etc.)
>     can try to manage the same device.
>    I.E. the device can be managed by bonding device, but application mistakenly
>     can try to manage it on its own instead of relying on the bonding device to do so.
>    Again right now we leave it up to the upper layer to keep track which device is managed
>    by what DPDK entity.
> 
> So what problem are you guys trying to solve with your patch?
> Is it 1) or 2) below or might be something else?

We try to solve 2)

By solving 2), the issue 1) is slightly different:
Thread safety with control API will become the responsibility
of the DPDK entity (app or lib) which owns the port.

Hope it is clear now, thanks for the brainstorming.

^ permalink raw reply	[flat|nested] 214+ messages in thread

* Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
@ 2017-12-06  0:40 Ananyev, Konstantin
  2017-12-06  9:22 ` Thomas Monjalon
  0 siblings, 1 reply; 214+ messages in thread
From: Ananyev, Konstantin @ 2017-12-06  0:40 UTC (permalink / raw)
  To: Thomas Monjalon, Matan Azrad, Neil Horman (nhorman@tuxdriver.com),
	Wu, Jingjing, Gaëtan Rivet
  Cc: dev




> 
> 
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Tuesday, December 5, 2017 3:50 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Cc: Matan Azrad <matan@mellanox.com>; Neil Horman <nhorman@tuxdriver.com>; Gaëtan Rivet <gaetan.rivet@6wind.com>; Wu,
> Jingjing <jingjing.wu@intel.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/5] ethdev: add port ownership
> 
> 05/12/2017 16:13, Ananyev, Konstantin:
> >
> > Hi Thomas,
> >
> > > Hi,
> >
> > > I will give my view on locking and synchronization in a different email.
> > > Let's discuss about the API here.
> >
> > > 05/12/2017 12:12, Ananyev, Konstantin:
> > > >> From: Matan Azrad [mailto:matan@mellanox.com]
> > >> > From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> >
> > > > > > If the goal is just to have an ability to recognize is that device is managed by
> > > > > > another device (failsafe, bonding, etc.), then I think all we need is a pointer
> > > > > > to rte_eth_dev_data of the owner (NULL would mean no owner).
> > > > >
> > > > > I think string is better than a pointer from the next reasons:
> > > > > 1. It is more human friendly than pointers for debug and printing.
> > > >
> > > > We can have a function that would take an owner pointer and produce nice
> > > > pretty formatted text explanation: "owned by fail-safe device at port X" or so.
> >
> > > I don't think it is possible or convenient to have such function.
> >
> > Why do you think it is not possible?
> 
> Because of applications being the owner (discussion below).
> 
> > > Keep in mind that the owner can be an application thread.
> > > If you prefer using a single function pointer (may help for
> > > atomic implementation), we can allocate an owner structure containing
> > > a name as a string to identify the owner in human readable format.
> > > Then we just have to set the pointer of this struct to rte_eth_dev_data.
> >
> > Basically you'd like to have an ability to set something different then
> > pointer to rte_eth_dev_data as an owner, right?
> 
> No, it can be a pointer, or an id, I don't care.

My question was about a pointer to a specific struct: rte_eth_dev_data.
As I understand you want it to be a pointer to something else?
Probably to some new struct rte_eth_dev_owner or so...

> 
> > I think this is possible too, just not sure it will useful.
> >
> > > > What I meant - this api to set/get ownership should be sort of internal to ethdev layer.
> > > > Let say it would be used for failsafe/bonding (any other compound) device that needs
> > > > to own/manage several low-level devices.
> > > > So in normal situation user wouldn't need to use that API directly at all.
> >
> > > Again, the application may use this API to declare its ownership.
> >
> > Could you explain that a bit: what would mean 'application declares an ownership on device'?
> > Does it mean that no other application will be allowed to do any control op on that device
> > till application will clear its ownership?
> > I.E. make sure that at each moment only one particular thread can modify device configuration?
> > Or would it be totally informal and second application will be free to ignore it?
> 
> It is an information.
> It will avoid a library taking ownership on a port controlled by the app.

Ok, let's step back a bit.
As I can see there are 2 separate issues/design choices inside rte_ethdev :)  :
1) control API is not MT safe - at any given moment 2 (or more) threads (/processes)
    can try to change config of a given device.
   Right now we leave that sync effort to the upper layer.
2) Even with the same thread 2 (or more) DPDK entities (high level PMD/third party lib, etc.)
    can try to manage the same device.
   I.E. the device can be managed by bonding device, but application mistakenly
    can try to manage it on its own instead of relying on the bonding device to do so.
   Again right now we leave it up to the upper layer to keep track which device is managed
   by what DPDK entity.

So what problem are you guys trying to solve with your patch?
Is it 1) or 2) above, or maybe something else?

Konstantin

> 
> > If it will be the second one - I personally don't see much point in it.
> > If it the first one - then simplest and most straightforward way would be -
> > introduce a mutex (either per device or just per whole rte_eth_dev[]) and force
> > each control op to grab it at entrance release at exit.
> 
> No, a mutex does not provide any information.

> 
> > > And anwyway, it may be interesting from an application point of view
> > > to be able to list every devices and their internal owners.
> >
> > Yes sure application is free to call 'get' to retrieve information etc.
> > What I am saying for normal operation - application don't have to call that API.
> > I.E. - we don't need to change testpmd, etc. apps because that API was introduced.
> 
> Yes the ports can stay without any owner if the application does not fill it.

Konstantin

^ permalink raw reply	[flat|nested] 214+ messages in thread

end of thread, other threads:[~2018-04-25 14:01 UTC | newest]

Thread overview: 214+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-28 11:57 [dpdk-dev] [PATCH 0/5] ethdev: Port ownership Matan Azrad
2017-11-28 11:57 ` [dpdk-dev] [PATCH 1/5] ethdev: free a port by a dedicated API Matan Azrad
2017-11-28 11:57 ` [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Matan Azrad
2017-11-30 12:36   ` Neil Horman
2017-11-30 13:24     ` Gaëtan Rivet
2017-11-30 14:30       ` Matan Azrad
2017-11-30 15:09         ` Gaëtan Rivet
2017-11-30 15:43           ` Matan Azrad
2017-12-01 12:09       ` Neil Horman
2017-12-03  8:04         ` Matan Azrad
2017-12-03 11:10           ` Ananyev, Konstantin
2017-12-03 13:46             ` Matan Azrad
2017-12-04 16:01               ` Neil Horman
2017-12-04 18:10                 ` Matan Azrad
2017-12-04 22:30                   ` Neil Horman
2017-12-05  6:08                     ` Matan Azrad
2017-12-05 10:05                       ` Bruce Richardson
2017-12-08 11:35                         ` Thomas Monjalon
2017-12-08 12:31                           ` Neil Horman
2017-12-21 17:06                             ` Thomas Monjalon
2017-12-21 17:43                               ` Neil Horman
2017-12-21 19:37                                 ` Matan Azrad
2017-12-21 20:14                                   ` Neil Horman
2017-12-21 21:57                                     ` Matan Azrad
2017-12-22 14:26                                       ` Neil Horman
2017-12-23 22:36                                         ` Matan Azrad
2017-12-29 16:56                                           ` Neil Horman
2017-12-05 19:26                       ` Neil Horman
2017-12-08 11:06                         ` Thomas Monjalon
2017-12-05 11:12               ` Ananyev, Konstantin
2017-12-05 11:44                 ` Ananyev, Konstantin
2017-12-05 11:53                   ` Thomas Monjalon
2017-12-05 14:56                     ` Bruce Richardson
2017-12-05 14:57                     ` Ananyev, Konstantin
2017-12-05 11:47                 ` Thomas Monjalon
2017-12-05 15:13                   ` Ananyev, Konstantin
2017-12-05 15:49                     ` Thomas Monjalon
2017-11-28 11:57 ` [dpdk-dev] [PATCH 3/5] net/failsafe: free an eth port by a dedicated API Matan Azrad
2017-11-28 11:58 ` [dpdk-dev] [PATCH 4/5] net/failsafe: use ownership mechanism to own ports Matan Azrad
2017-11-28 11:58 ` [dpdk-dev] [PATCH 5/5] app/testpmd: adjust ethdev port ownership Matan Azrad
2018-01-07  9:45 ` [dpdk-dev] [PATCH v2 0/6] ethdev: " Matan Azrad
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 1/6] ethdev: fix port data reset timing Matan Azrad
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership Matan Azrad
2018-01-10 13:36     ` Ananyev, Konstantin
2018-01-10 16:58       ` Matan Azrad
2018-01-11 12:40         ` Ananyev, Konstantin
2018-01-11 14:51           ` Matan Azrad
2018-01-12  0:02             ` Ananyev, Konstantin
2018-01-12  7:24               ` Matan Azrad
2018-01-15 11:45                 ` Ananyev, Konstantin
2018-01-15 13:09                   ` Matan Azrad
2018-01-15 18:43                     ` Ananyev, Konstantin
2018-01-16  8:04                       ` Matan Azrad
2018-01-16 19:11                         ` Ananyev, Konstantin
2018-01-16 20:32                           ` Matan Azrad
2018-01-17 11:24                             ` Ananyev, Konstantin
2018-01-17 12:05                               ` Matan Azrad
2018-01-17 12:54                                 ` Ananyev, Konstantin
2018-01-17 13:10                                   ` Matan Azrad
2018-01-17 16:52                                     ` Ananyev, Konstantin
2018-01-17 18:02                                       ` Matan Azrad
2018-01-17 20:34                                       ` Matan Azrad
2018-01-18 14:17                                         ` Ananyev, Konstantin
2018-01-18 14:26                                           ` Matan Azrad
2018-01-18 14:41                                             ` Ananyev, Konstantin
2018-01-18 14:45                                               ` Matan Azrad
2018-01-18 14:51                                                 ` Ananyev, Konstantin
2018-01-18 15:00                                                   ` Matan Azrad
2018-01-17 14:00                                 ` Neil Horman
2018-01-17 17:01                                   ` Ananyev, Konstantin
2018-01-18 13:10                                     ` Neil Horman
2018-01-18 14:00                                       ` Matan Azrad
2018-01-18 16:54                                         ` Neil Horman
2018-01-18 17:20                                           ` Matan Azrad
2018-01-18 18:41                                             ` Neil Horman
2018-01-18 20:21                                               ` Matan Azrad
2018-01-19  1:41                                                 ` Neil Horman
2018-01-19  7:14                                                   ` Matan Azrad
2018-01-19  9:30                                                     ` Bruce Richardson
2018-01-19 10:44                                                       ` Matan Azrad
2018-01-19 13:30                                                         ` Neil Horman
2018-01-19 13:57                                                           ` Matan Azrad
2018-01-19 14:13                                                           ` Thomas Monjalon
2018-01-19 15:27                                                             ` Neil Horman
2018-01-19 17:17                                                               ` Thomas Monjalon
2018-01-19 17:43                                                                 ` Neil Horman
2018-01-19 18:12                                                                   ` Thomas Monjalon
2018-01-19 19:47                                                                     ` Neil Horman
2018-01-19 20:19                                                                       ` Thomas Monjalon
2018-01-19 22:52                                                                         ` Neil Horman
2018-01-20  3:38                                                                         ` Tuxdriver
2018-01-20 12:54                                                                       ` Ananyev, Konstantin
2018-01-20 14:02                                                                         ` Thomas Monjalon
2018-01-19 12:55                                                       ` Neil Horman
2018-01-19 13:52                                                     ` Neil Horman
2018-01-18 16:27                                     ` Neil Horman
2018-01-17 17:58                                   ` Matan Azrad
2018-01-18 13:20                                     ` Neil Horman
2018-01-18 14:52                                       ` Matan Azrad
2018-01-19 13:57                                         ` Neil Horman
2018-01-19 14:07                                           ` Thomas Monjalon
2018-01-19 14:32                                             ` Neil Horman
2018-01-19 17:09                                               ` Thomas Monjalon
2018-01-19 17:37                                                 ` Neil Horman
2018-01-19 18:10                                                   ` Thomas Monjalon
2018-01-21 22:12                                                     ` Ferruh Yigit
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 3/6] ethdev: synchronize port allocation Matan Azrad
2018-01-07  9:58     ` Matan Azrad
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 4/6] net/failsafe: free an eth port by a dedicated API Matan Azrad
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 5/6] net/failsafe: use ownership mechanism to own ports Matan Azrad
2018-01-08 10:32     ` Gaëtan Rivet
2018-01-08 11:16       ` Matan Azrad
2018-01-08 11:35         ` Gaëtan Rivet
2018-01-07  9:45   ` [dpdk-dev] [PATCH v2 6/6] app/testpmd: adjust ethdev port ownership Matan Azrad
2018-01-08 11:39     ` Gaëtan Rivet
2018-01-08 12:30       ` Matan Azrad
2018-01-08 13:30         ` Gaëtan Rivet
2018-01-08 13:55           ` Matan Azrad
2018-01-08 14:21             ` Gaëtan Rivet
2018-01-08 14:42               ` Matan Azrad
2018-01-16  5:53     ` Lu, Wenzhuo
2018-01-16  8:15       ` Matan Azrad
2018-01-17  0:46         ` Lu, Wenzhuo
2018-01-17  8:51           ` Matan Azrad
2018-01-18  0:53             ` Lu, Wenzhuo
2018-01-18 16:35   ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Matan Azrad
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 1/7] ethdev: fix port data reset timing Matan Azrad
2018-01-18 17:00       ` Thomas Monjalon
2018-01-19 12:38       ` Ananyev, Konstantin
2018-03-05 11:24       ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
2018-03-05 14:52         ` Matan Azrad
2018-03-05 15:06           ` Ferruh Yigit
2018-03-05 15:12             ` Matan Azrad
2018-03-27 22:37               ` Ferruh Yigit
2018-03-28 12:07                 ` Matan Azrad
2018-03-30 10:39                   ` Ferruh Yigit
2018-04-19 11:07                     ` Ferruh Yigit
2018-04-25 12:16                       ` Matan Azrad
2018-04-25 12:30                         ` Ori Kam
2018-04-25 12:54                         ` Ferruh Yigit
2018-04-25 14:01                           ` Matan Azrad
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 2/7] ethdev: fix used portid allocation Matan Azrad
2018-01-18 17:00       ` Thomas Monjalon
2018-01-19 12:40       ` Ananyev, Konstantin
2018-01-20 16:48         ` Matan Azrad
2018-01-20 17:26           ` Ananyev, Konstantin
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 3/7] ethdev: add port ownership Matan Azrad
2018-01-18 21:11       ` Thomas Monjalon
2018-01-19 12:41       ` Ananyev, Konstantin
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 4/7] ethdev: synchronize port allocation Matan Azrad
2018-01-18 20:43       ` Thomas Monjalon
2018-01-18 20:52         ` Matan Azrad
2018-01-18 21:17           ` Thomas Monjalon
2018-01-19 12:47       ` Ananyev, Konstantin
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
2018-01-18 16:35     ` [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
2018-01-19 12:37       ` Ananyev, Konstantin
2018-01-19 12:51         ` Matan Azrad
2018-01-19 13:08           ` Ananyev, Konstantin
2018-01-19 13:35             ` Matan Azrad
2018-01-19 15:00               ` Gaëtan Rivet
2018-01-20 18:14                 ` Matan Azrad
2018-01-22 10:17                   ` Gaëtan Rivet
2018-01-22 11:22                     ` Matan Azrad
2018-01-22 12:28                 ` Ananyev, Konstantin
2018-01-22 13:22                   ` Matan Azrad
2018-01-22 20:48                     ` Ananyev, Konstantin
2018-01-23  8:54                       ` Matan Azrad
2018-01-23 12:56                         ` Gaëtan Rivet
2018-01-23 14:30                           ` Matan Azrad
2018-01-25  9:36                             ` Matan Azrad
2018-01-25 10:05                               ` Thomas Monjalon
2018-01-25 11:15                                 ` Ananyev, Konstantin
2018-01-25 11:33                                   ` Thomas Monjalon
2018-01-25 11:55                                     ` Ananyev, Konstantin
2018-01-23 13:34                         ` Ananyev, Konstantin
2018-01-23 14:18                           ` Thomas Monjalon
2018-01-23 15:12                             ` Ananyev, Konstantin
2018-01-23 15:18                               ` Ananyev, Konstantin
2018-01-23 17:33                                 ` Thomas Monjalon
2018-01-23 21:18                                   ` Ananyev, Konstantin
2018-01-24  8:10                                     ` Thomas Monjalon
2018-01-24 18:30                                       ` Ananyev, Konstantin
2018-01-25 10:55                                         ` Thomas Monjalon
2018-01-25 11:09                                           ` Ananyev, Konstantin
2018-01-25 11:27                                             ` Thomas Monjalon
2018-01-23 14:43                           ` Matan Azrad
2018-01-20 21:24     ` [dpdk-dev] [PATCH v4 0/7] Port ownership and syncronization Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 1/7] ethdev: fix port data reset timing Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 2/7] ethdev: fix used portid allocation Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 3/7] ethdev: add port ownership Matan Azrad
2018-01-21 20:43         ` Ferruh Yigit
2018-01-21 20:46         ` Ferruh Yigit
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 4/7] ethdev: synchronize port allocation Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
2018-01-20 21:24       ` [dpdk-dev] [PATCH v4 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
2018-01-22 16:38       ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 1/7] ethdev: fix port data reset timing Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 2/7] ethdev: fix used portid allocation Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 3/7] ethdev: add port ownership Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 4/7] ethdev: synchronize port allocation Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 5/7] net/failsafe: free an eth port by a dedicated API Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 6/7] net/failsafe: use ownership mechanism to own ports Matan Azrad
2018-01-22 16:38         ` [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership Matan Azrad
2018-01-25  1:47           ` Lu, Wenzhuo
2018-01-25  8:30             ` Matan Azrad
2018-01-26  0:50               ` Lu, Wenzhuo
2018-01-29 11:21         ` [dpdk-dev] [PATCH v5 0/7] Port ownership and synchronization Matan Azrad
2018-01-31 19:53           ` Thomas Monjalon
2018-01-25 14:35     ` [dpdk-dev] [PATCH v3 0/7] Port ownership and syncronization Ferruh Yigit
2017-12-06  0:40 [dpdk-dev] [PATCH 2/5] ethdev: add port ownership Ananyev, Konstantin
2017-12-06  9:22 ` Thomas Monjalon
