DPDK patches and discussions
* [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
@ 2020-09-03  4:53 Suanming Mou
  2020-09-03 17:37 ` Stephen Hemminger
  0 siblings, 1 reply; 9+ messages in thread
From: Suanming Mou @ 2020-09-03  4:53 UTC (permalink / raw)
  To: Ori Kam, John McNamara, Marko Kovacevic, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko
  Cc: dev

Currently, the rte flow functions are not defined as thread safe.
DPDK applications either call the functions from a single thread or
add locks around them for the critical sections.

For PMDs that natively support thread-safe flow operations, the
redundant protection in the application hurts the performance of the
rte flow operation functions.

Moreover, the lack of a thread safety guarantee for the rte flow
functions limits what applications can expect from them.

This feature changes the rte flow functions to be thread safe. As
different PMDs have different flow operation implementations, some may
already support thread safety while others may not. For PMDs that do
not support thread-safe flow operations, a new lock is defined in
ethdev to protect them at the rte flow level.

A new RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE device flag is added to
indicate whether the PMD supports thread-safe flow operations. PMDs
that support thread-safe flow operations set the
RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag, and the rte flow level
functions then skip the helper lock for them. The rte flow level
lock is only taken when the PMD operation functions are not thread
safe themselves.

The change has no effect on current DPDK applications, and no
application change is required.
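
For illustration only, a minimal sketch of what this enables on the
application side; the drop-all rule and the insert_rule() helper are
placeholders, not part of this patch:

#include <stdint.h>
#include <stdio.h>
#include <rte_flow.h>

/* A trivial drop-all rule; any real rule behaves the same way. */
static const struct rte_flow_attr attr = { .ingress = 1 };
static const struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};
static const struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_DROP },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};

/*
 * May be called from several threads concurrently on the same port;
 * the rte flow level (or the PMD, if it sets the new flag) provides
 * the mutual exclusion, so no application-level lock is needed.
 */
static void *
insert_rule(void *arg)
{
	uint16_t port_id = *(uint16_t *)arg;
	struct rte_flow_error error;

	if (rte_flow_create(port_id, &attr, pattern, actions, &error) == NULL)
		printf("flow create failed: %s\n",
		       error.message ? error.message : "(unspecified)");
	return NULL;
}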

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
---
 doc/guides/prog_guide/rte_flow.rst  |  7 ++--
 drivers/net/mlx5/linux/mlx5_os.c    |  5 +++
 lib/librte_ethdev/rte_ethdev.c      |  2 +
 lib/librte_ethdev/rte_ethdev.h      |  2 +
 lib/librte_ethdev/rte_ethdev_core.h |  1 +
 lib/librte_ethdev/rte_flow.c        | 84 ++++++++++++++++++++++++++++---------
 6 files changed, 78 insertions(+), 23 deletions(-)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 3e5cd1e..54b592c 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -3033,10 +3033,6 @@ Caveats
 - API operations are synchronous and blocking (``EAGAIN`` cannot be
   returned).
 
-- There is no provision for re-entrancy/multi-thread safety, although nothing
-  should prevent different devices from being configured at the same
-  time. PMDs may protect their control path functions accordingly.
-
 - Stopping the data path (TX/RX) should not be necessary when managing flow
   rules. If this cannot be achieved naturally or with workarounds (such as
   temporarily replacing the burst function pointers), an appropriate error
@@ -3088,6 +3084,9 @@ This interface additionally defines the following helper function:
 - ``rte_flow_ops_get()``: get generic flow operations structure from a
   port.
 
+If the PMD interfaces do not support re-entrancy/multi-thread safety,
+the rte flow level functions provide it with a mutex.
+
 More will be added over time.
 
 Device compatibility
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 69123e1..872e868 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1152,6 +1152,11 @@
 	eth_dev->data->dev_private = priv;
 	priv->dev_data = eth_dev->data;
 	eth_dev->data->mac_addrs = priv->mac;
+	/*
+	 * This is an example of how a PMD advertises thread-safe flow
+	 * operations. Currently, the mlx5 PMD does not support it yet.
+	 */
+	eth_dev->data->dev_flags |= RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE;
 	eth_dev->device = dpdk_dev;
 	/* Configure the first MAC address by default. */
 	if (mlx5_get_mac(eth_dev, &mac.addr_bytes)) {
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 7858ad5..8256657 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -500,6 +500,7 @@ struct rte_eth_dev *
 	strlcpy(eth_dev->data->name, name, sizeof(eth_dev->data->name));
 	eth_dev->data->port_id = port_id;
 	eth_dev->data->mtu = RTE_ETHER_MTU;
+	pthread_mutex_init(&eth_dev->data->fts_mutex, NULL);
 
 unlock:
 	rte_spinlock_unlock(&rte_eth_dev_shared_data->ownership_lock);
@@ -562,6 +563,7 @@ struct rte_eth_dev *
 		rte_free(eth_dev->data->mac_addrs);
 		rte_free(eth_dev->data->hash_mac_addrs);
 		rte_free(eth_dev->data->dev_private);
+		pthread_mutex_destroy(&eth_dev->data->fts_mutex);
 		memset(eth_dev->data, 0, sizeof(struct rte_eth_dev_data));
 	}
 
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index 70295d7..6ec2450 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1636,6 +1636,8 @@ struct rte_eth_dev_owner {
 #define RTE_ETH_DEV_REPRESENTOR  0x0010
 /** Device does not support MAC change after started */
 #define RTE_ETH_DEV_NOLIVE_MAC_ADDR  0x0020
+/** Device PMD supports thread-safe flow operations */
+#define RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE  0x0040
 
 /**
  * Iterates over valid ethdev ports owned by a specific owner.
diff --git a/lib/librte_ethdev/rte_ethdev_core.h b/lib/librte_ethdev/rte_ethdev_core.h
index 32407dd..4d37124 100644
--- a/lib/librte_ethdev/rte_ethdev_core.h
+++ b/lib/librte_ethdev/rte_ethdev_core.h
@@ -876,6 +876,7 @@ struct rte_eth_dev_data {
 			 *   Valid if RTE_ETH_DEV_REPRESENTOR in dev_flags.
 			 */
 
+	pthread_mutex_t fts_mutex; /**< rte flow ops thread-safety mutex. */
 	uint64_t reserved_64s[4]; /**< Reserved for future fields */
 	void *reserved_ptrs[4];   /**< Reserved for future fields */
 } __rte_cache_aligned;
diff --git a/lib/librte_ethdev/rte_flow.c b/lib/librte_ethdev/rte_flow.c
index f8fdd68..ca80b12 100644
--- a/lib/librte_ethdev/rte_flow.c
+++ b/lib/librte_ethdev/rte_flow.c
@@ -207,6 +207,20 @@ struct rte_flow_desc_data {
 	return -rte_errno;
 }
 
+static void
+flow_lock(struct rte_eth_dev *dev)
+{
+	if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
+		pthread_mutex_lock(&dev->data->fts_mutex);
+}
+
+static void
+flow_unlock(struct rte_eth_dev *dev)
+{
+	if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
+		pthread_mutex_unlock(&dev->data->fts_mutex);
+}
+
 static int
 flow_err(uint16_t port_id, int ret, struct rte_flow_error *error)
 {
@@ -346,12 +360,16 @@ struct rte_flow_desc_data {
 {
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+	int ret;
 
 	if (unlikely(!ops))
 		return -rte_errno;
-	if (likely(!!ops->validate))
-		return flow_err(port_id, ops->validate(dev, attr, pattern,
-						       actions, error), error);
+	if (likely(!!ops->validate)) {
+		flow_lock(dev);
+		ret = ops->validate(dev, attr, pattern, actions, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -372,7 +390,9 @@ struct rte_flow *
 	if (unlikely(!ops))
 		return NULL;
 	if (likely(!!ops->create)) {
+		flow_lock(dev);
 		flow = ops->create(dev, attr, pattern, actions, error);
+		flow_unlock(dev);
 		if (flow == NULL)
 			flow_err(port_id, -rte_errno, error);
 		return flow;
@@ -390,12 +410,16 @@ struct rte_flow *
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (unlikely(!ops))
 		return -rte_errno;
-	if (likely(!!ops->destroy))
-		return flow_err(port_id, ops->destroy(dev, flow, error),
-				error);
+	if (likely(!!ops->destroy)) {
+		flow_lock(dev);
+		ret = ops->destroy(dev, flow, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -408,11 +432,16 @@ struct rte_flow *
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (unlikely(!ops))
 		return -rte_errno;
-	if (likely(!!ops->flush))
-		return flow_err(port_id, ops->flush(dev, error), error);
+	if (likely(!!ops->flush)) {
+		flow_lock(dev);
+		ret = ops->flush(dev, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -428,12 +457,16 @@ struct rte_flow *
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (!ops)
 		return -rte_errno;
-	if (likely(!!ops->query))
-		return flow_err(port_id, ops->query(dev, flow, action, data,
-						    error), error);
+	if (likely(!!ops->query)) {
+		flow_lock(dev);
+		ret = ops->query(dev, flow, action, data, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -447,11 +480,16 @@ struct rte_flow *
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (!ops)
 		return -rte_errno;
-	if (likely(!!ops->isolate))
-		return flow_err(port_id, ops->isolate(dev, set, error), error);
+	if (likely(!!ops->isolate)) {
+		flow_lock(dev);
+		ret = ops->isolate(dev, set, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -1224,12 +1262,16 @@ enum rte_flow_conv_item_spec_type {
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (unlikely(!ops))
 		return -rte_errno;
-	if (likely(!!ops->dev_dump))
-		return flow_err(port_id, ops->dev_dump(dev, file, error),
-				error);
+	if (likely(!!ops->dev_dump)) {
+		flow_lock(dev);
+		ret = ops->dev_dump(dev, file, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOSYS,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOSYS));
@@ -1241,12 +1283,16 @@ enum rte_flow_conv_item_spec_type {
 {
 	struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 	const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);
+	int ret;
 
 	if (unlikely(!ops))
 		return -rte_errno;
-	if (likely(!!ops->get_aged_flows))
-		return flow_err(port_id, ops->get_aged_flows(dev, contexts,
-				nb_contexts, error), error);
+	if (likely(!!ops->get_aged_flows)) {
+		flow_lock(dev);
+		ret = ops->get_aged_flows(dev, contexts, nb_contexts, error);
+		flow_unlock(dev);
+		return flow_err(port_id, ret, error);
+	}
 	return rte_flow_error_set(error, ENOTSUP,
 				  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
 				  NULL, rte_strerror(ENOTSUP));
-- 
1.8.3.1



* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-03  4:53 [dpdk-dev] [RFC] ethdev: make rte flow API thread safe Suanming Mou
@ 2020-09-03 17:37 ` Stephen Hemminger
  2020-09-07  2:36   ` Suanming Mou
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2020-09-03 17:37 UTC (permalink / raw)
  To: Suanming Mou
  Cc: Ori Kam, John McNamara, Marko Kovacevic, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, dev

On Thu,  3 Sep 2020 12:53:02 +0800
Suanming Mou <suanmingm@nvidia.com> wrote:

> Currently, the rte flow functions are not defined as thread safe.
> DPDK applications either call the functions from a single thread or
> add locks around them for the critical sections.
> 
> For PMDs that natively support thread-safe flow operations, the
> redundant protection in the application hurts the performance of the
> rte flow operation functions.
> 
> Moreover, the lack of a thread safety guarantee for the rte flow
> functions limits what applications can expect from them.
> 
> This feature changes the rte flow functions to be thread safe. As
> different PMDs have different flow operation implementations, some may
> already support thread safety while others may not. For PMDs that do
> not support thread-safe flow operations, a new lock is defined in
> ethdev to protect them at the rte flow level.
> 
> A new RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE device flag is added to
> indicate whether the PMD supports thread-safe flow operations. PMDs
> that support thread-safe flow operations set the
> RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag, and the rte flow level
> functions then skip the helper lock for them. The rte flow level
> lock is only taken when the PMD operation functions are not thread
> safe themselves.
> 
> The change has no effect on current DPDK applications, and no
> application change is required.
> 
> Signed-off-by: Suanming Mou <suanmingm@nvidia.com>

What is the performance impact of this for currently working applications
that use a single thread to program flow rules?  You are adding a couple of
system calls to what was formerly a totally user-mode operation.


* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-03 17:37 ` Stephen Hemminger
@ 2020-09-07  2:36   ` Suanming Mou
  2020-09-08 14:52     ` Stephen Hemminger
  0 siblings, 1 reply; 9+ messages in thread
From: Suanming Mou @ 2020-09-07  2:36 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Ori Kam, John McNamara, Marko Kovacevic, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, NBU-Contact-Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, dev

Hi,

Sorry for my late reply; I was on vacation.

> What is the performance impact of this for currently working applications that
> use a single thread to program flow rules?  You are adding a couple of system
> calls to what was formerly a totally user-mode operation.

If I understand correctly, in the non-contended single-thread case, a pthread mutex lock should not go to kernel space.
I also wrote a small application with a pthread mutex, and strace shows that no system call was introduced.

The simple test code below checks the per-round cycle cost difference between a pthread mutex and a spinlock.

#include <stdio.h>
#include <stdint.h>
#include <pthread.h>

#include <rte_cycles.h>
#include <rte_spinlock.h>

//#define FT_USE_SPIN 1
#define NLOOP 1000000

#ifdef FT_USE_SPIN
static rte_spinlock_t sp_lock;
#else
static pthread_mutex_t ml_lock;
#endif

static inline void ft_init_lock(void)
{
#ifdef FT_USE_SPIN
	rte_spinlock_init(&sp_lock);
#else
	pthread_mutex_init(&ml_lock, NULL);
#endif
}

static inline void ft_unlock(void)
{
#ifdef FT_USE_SPIN
	rte_spinlock_unlock(&sp_lock);
#else
	pthread_mutex_unlock(&ml_lock);
#endif
}

static inline void ft_lock(void)
{
#ifdef FT_USE_SPIN
	rte_spinlock_lock(&sp_lock);
#else
	pthread_mutex_lock(&ml_lock);
#endif
}

static void ft_check_cycles(void)
{
	static int init = 0;
	uint64_t start, end;
	int i, n;

	if (!init) {
		init = 1;
		ft_init_lock();
	}

	/* Make code hot. */
	ft_lock();
	n = 0;
	ft_unlock();

	start = rte_rdtsc();
	for (i = 0; i < NLOOP; i++) {
		ft_lock();
		n++;
		ft_unlock();
	}
	end = rte_rdtsc();
	printf("loop:%d, cycles per loop:%f\n", n, (end - start) / (float)n);
}

Both showed a similar cost of around 50 cycles per loop.

The pthread mutex lock was chosen here because most DPDK applications, such as OVS-DPDK, already use it.

Thanks,
Suanming Mou



* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-07  2:36   ` Suanming Mou
@ 2020-09-08 14:52     ` Stephen Hemminger
  2020-09-08 15:03       ` Thomas Monjalon
  2020-09-09  1:26       ` Suanming Mou
  0 siblings, 2 replies; 9+ messages in thread
From: Stephen Hemminger @ 2020-09-08 14:52 UTC (permalink / raw)
  To: Suanming Mou
  Cc: Ori Kam, John McNamara, Marko Kovacevic, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, NBU-Contact-Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, dev

On Mon, 7 Sep 2020 02:36:48 +0000
Suanming Mou <suanmingm@nvidia.com> wrote:

> Hi,
> 
> Sorry for my late reply due to the vacation.
> 
> > What is the performance impact of this for currently working applications that
> > use a single thread to program flow rules?  You are adding a couple of system
> > calls to what was formerly a totally user-mode operation.
> 

Read the source for glibc and see what pthread_mutex does

> If I understand correctly, in the non-contended single-thread case, a pthread mutex lock should not go to kernel space.
> I also wrote a small application with a pthread mutex, and strace shows that no system call was introduced.
> 
> The simple test code below checks the per-round cycle cost difference between a pthread mutex and a spinlock.
> 

Microbenchmarks of locking are hard to draw conclusions from.



* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-08 14:52     ` Stephen Hemminger
@ 2020-09-08 15:03       ` Thomas Monjalon
  2020-09-08 16:02         ` Stephen Hemminger
  2020-09-09  1:26       ` Suanming Mou
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Monjalon @ 2020-09-08 15:03 UTC (permalink / raw)
  To: Suanming Mou, Stephen Hemminger
  Cc: Ori Kam, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko,
	Ferruh Yigit, Andrew Rybchenko, dev, joyce.kong, phil.yang,
	steve.capper, honnappa.nagarahalli

08/09/2020 16:52, Stephen Hemminger:
> On Mon, 7 Sep 2020 02:36:48 +0000
> Suanming Mou <suanmingm@nvidia.com> wrote:
> > > What is the performance impact of this for currently working applications that
> > > use a single thread to program flow rules?  You are adding a couple of
> > > system calls to what was formerly a totally user-mode operation.
> 
> Read the source for glibc and see what pthread_mutex does

What would be the best lock for rte_flow?
We have spin lock, ticket lock, MCS lock (and rwlock) in DPDK.
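
For reference, a rough usage sketch of those candidate lock APIs
(illustration only; lock_usage_sketch() is a made-up function, and the
MCS lock API was still experimental at this point):

#include <rte_spinlock.h>
#include <rte_ticketlock.h>
#include <rte_mcslock.h>

static rte_spinlock_t sl;
static rte_ticketlock_t tl;
static rte_mcslock_t *mcs_tail; /* queue tail; NULL means unlocked */

static void
lock_usage_sketch(void)
{
	rte_mcslock_t me; /* the MCS lock needs a per-thread queue node */

	rte_spinlock_init(&sl);
	rte_spinlock_lock(&sl);
	rte_spinlock_unlock(&sl);

	rte_ticketlock_init(&tl);
	rte_ticketlock_lock(&tl);
	rte_ticketlock_unlock(&tl);

	rte_mcslock_lock(&mcs_tail, &me);
	rte_mcslock_unlock(&mcs_tail, &me);
}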


> > If I understand correctly, in the non-contended single-thread case, a pthread mutex lock should not go to kernel space.
> > I also wrote a small application with a pthread mutex, and strace shows that no system call was introduced.
> > 
> > The simple test code below checks the per-round cycle cost difference between a pthread mutex and a spinlock.
> 
> Microbenchmarks of locking are hard to draw conclusions from.





* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-08 15:03       ` Thomas Monjalon
@ 2020-09-08 16:02         ` Stephen Hemminger
  2020-09-09  2:26           ` Suanming Mou
  0 siblings, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2020-09-08 16:02 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Suanming Mou, Ori Kam, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Ferruh Yigit, Andrew Rybchenko, dev,
	joyce.kong, phil.yang, steve.capper, honnappa.nagarahalli

On Tue, 08 Sep 2020 17:03:53 +0200
Thomas Monjalon <thomas@monjalon.net> wrote:

> 08/09/2020 16:52, Stephen Hemminger:
> > On Mon, 7 Sep 2020 02:36:48 +0000
> > Suanming Mou <suanmingm@nvidia.com> wrote:  
> > > > What is the performance impact of this for currently working applications that
> > > > use a single thread to program flow rules?  You are adding a couple of system
> > > > calls to what was formerly a totally user-mode operation.
> > 
> > Read the source for glibc and see what pthread_mutex does  
> 
> What would be the best lock for rte_flow?
> We have spin lock, ticket lock, MCS lock (and rwlock) in DPDK.

The tradeoff is between speed, correctness, and simplicity.
The flow case is performance sensitive (for connection-per-second tests)
but not super critical (i.e., not per packet).
Fastest would be RCU, but that is probably not necessary here.

There would rarely be contention on this (thread safety is new), and there
is no case where a reader makes sense. For hardware, programming flow rules
would be a basic interaction with the TCAM (i.e., fast). For software drivers,
making a flow rule typically requires a system call to set up a classifier,
etc. Holding a spinlock across system calls leads to preemption and other
issues.

Would it be possible to push the choice of mutual exclusion down to
the device driver? Fast HW devices could use a spinlock, and slow SW
devices a pthread mutex.



* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-08 14:52     ` Stephen Hemminger
  2020-09-08 15:03       ` Thomas Monjalon
@ 2020-09-09  1:26       ` Suanming Mou
  1 sibling, 0 replies; 9+ messages in thread
From: Suanming Mou @ 2020-09-09  1:26 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Ori Kam, John McNamara, Marko Kovacevic, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, NBU-Contact-Thomas Monjalon,
	Ferruh Yigit, Andrew Rybchenko, dev



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, September 8, 2020 10:52 PM
> To: Suanming Mou <suanmingm@nvidia.com>
> Cc: Ori Kam <orika@mellanox.com>; John McNamara
> <john.mcnamara@intel.com>; Marko Kovacevic <marko.kovacevic@intel.com>;
> Matan Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> Viacheslav Ovsiienko <viacheslavo@mellanox.com>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>;
> Andrew Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
> 
> On Mon, 7 Sep 2020 02:36:48 +0000
> Suanming Mou <suanmingm@nvidia.com> wrote:
> 
> > Hi,
> >
> > Sorry for my late reply; I was on vacation.
> >
> > > What is the performance impact of this for currently working
> > > applications that use a single thread to program flow rules?  You
> > > are adding a couple of system calls to what was formerly a totally
> > > user-mode operation.
> >
> 
> Read the source for glibc and see what pthread_mutex does

Yes, the pthread mutex lock first tries a CAS (compare-and-swap) on the atomic value; only if that fails does it make the futex syscall.
So in the single-thread case, no syscall is introduced. The test code also shows that a pthread mutex costs a similar number of cycles to a spinlock.
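
For intuition, a rough sketch of that fast path (heavily simplified,
not the actual glibc code; real mutexes handle more lock types and
states, and the unlock side is elided here):

#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked with waiters. */
static void
lock_sketch(int *futex_word)
{
	int expected = 0;

	/* Uncontended path: one user-space CAS, no system call. */
	if (__atomic_compare_exchange_n(futex_word, &expected, 1, 0,
					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
		return;
	/* Contended path: mark waiters and sleep in the kernel. */
	while (__atomic_exchange_n(futex_word, 2, __ATOMIC_ACQUIRE) != 0)
		syscall(SYS_futex, futex_word, FUTEX_WAIT, 2, NULL, NULL, 0);
}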

> 
> > If I understand correctly, in the non-contended single-thread case, a
> > pthread mutex lock should not go to kernel space.
> > I also wrote a small application with a pthread mutex, and strace shows
> > that no system call was introduced.
> >
> > The simple test code below checks the per-round cycle cost difference
> > between a pthread mutex and a spinlock.
> >
> 
> Microbenchmarks of locking are hard to draw conclusions from.



* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-08 16:02         ` Stephen Hemminger
@ 2020-09-09  2:26           ` Suanming Mou
  2020-09-24  1:42             ` Suanming Mou
  0 siblings, 1 reply; 9+ messages in thread
From: Suanming Mou @ 2020-09-09  2:26 UTC (permalink / raw)
  To: Stephen Hemminger, NBU-Contact-Thomas Monjalon
  Cc: Ori Kam, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko,
	Ferruh Yigit, Andrew Rybchenko, dev, joyce.kong, phil.yang,
	steve.capper, honnappa.nagarahalli



> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, September 9, 2020 12:03 AM
> To: NBU-Contact-Thomas Monjalon <thomas@monjalon.net>
> Cc: Suanming Mou <suanmingm@nvidia.com>; Ori Kam <orika@mellanox.com>;
> Matan Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> Viacheslav Ovsiienko <viacheslavo@mellanox.com>; Ferruh Yigit
> <ferruh.yigit@intel.com>; Andrew Rybchenko <arybchenko@solarflare.com>;
> dev@dpdk.org; joyce.kong@arm.com; phil.yang@arm.com;
> steve.capper@arm.com; honnappa.nagarahalli@arm.com
> Subject: Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
> 
> On Tue, 08 Sep 2020 17:03:53 +0200
> Thomas Monjalon <thomas@monjalon.net> wrote:
> 
> > 08/09/2020 16:52, Stephen Hemminger:
> > > On Mon, 7 Sep 2020 02:36:48 +0000
> > > Suanming Mou <suanmingm@nvidia.com> wrote:
> > > > > What is the performance impact of this for currently working
> > > > > applications that use a single thread to program flow rules?  You are
> > > > > adding a couple of system
> > > > > calls to what was formerly a totally user-mode operation.
> > >
> > > Read the source for glibc and see what pthread_mutex does
> >
> > What would be the best lock for rte_flow?
> > We have spin lock, ticket lock, MCS lock (and rwlock) in DPDK.
> 
> The tradeoff is between speed, correctness, and simplicity.
> The flow case is performance sensitive (for connection-per-second tests) but
> not super critical (i.e., not per packet).
> Fastest would be RCU, but that is probably not necessary here.
> 
> There would rarely be contention on this (thread safety is new), and there is no
> case where a reader makes sense. For hardware, programming flow rules would be a
> basic interaction with the TCAM (i.e., fast). For software drivers, making a flow
> rule typically requires a system call to set up a classifier, etc. Holding a
> spinlock across system calls leads to preemption and other issues.

Very good explanation, thank you Stephen. Totally agree with that.

> 
> Would it be possible to push the choice of mutual exclusion down to the device
> driver? Fast HW devices could use a spinlock, and slow SW devices a pthread
> mutex.

That's a good hint. But it would also require every vendor PMD to be updated. Let me list some points here.
1. In the single-thread case, a pthread mutex acts much like a spinlock, so the choice between them makes no difference there.
2. Under multi-threaded lock contention, a spinlock introduces more CPU usage. DPDK applications that currently use a mutex and later drop their outer mutex would suffer higher CPU usage with a spinlock (in the fast-hardware-PMD case).

And one more general question: are there DPDK applications today that use a spinlock around the rte flow APIs?


* Re: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
  2020-09-09  2:26           ` Suanming Mou
@ 2020-09-24  1:42             ` Suanming Mou
  0 siblings, 0 replies; 9+ messages in thread
From: Suanming Mou @ 2020-09-24  1:42 UTC (permalink / raw)
  To: Stephen Hemminger, NBU-Contact-Thomas Monjalon
  Cc: Ori Kam, Matan Azrad, Shahaf Shuler, Viacheslav Ovsiienko,
	Ferruh Yigit, Andrew Rybchenko, dev, joyce.kong, phil.yang,
	steve.capper, honnappa.nagarahalli

Hi,

> -----Original Message-----
> From: Suanming Mou
> Sent: Wednesday, September 9, 2020 10:26 AM
> To: Stephen Hemminger <stephen@networkplumber.org>; NBU-Contact-
> Thomas Monjalon <thomas@monjalon.net>
> Cc: Ori Kam <orika@mellanox.com>; Matan Azrad <matan@mellanox.com>;
> Shahaf Shuler <shahafs@mellanox.com>; Viacheslav Ovsiienko
> <viacheslavo@mellanox.com>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; dev@dpdk.org;
> joyce.kong@arm.com; phil.yang@arm.com; steve.capper@arm.com;
> honnappa.nagarahalli@arm.com
> Subject: RE: [dpdk-dev] [RFC] ethdev: make rte flow API thread safe
> 
> > Would it be possible to push the choice of mutual exclusion down to
> > the device driver? Fast HW devices could use a spinlock, and slow SW
> > devices a pthread mutex.
> 
> That's a good hint. But it would also require every vendor PMD to be updated.
> Let me list some points here.
> 1. In the single-thread case, a pthread mutex acts much like a spinlock, so the
> choice between them makes no difference there.
> 2. Under multi-threaded lock contention, a spinlock introduces more CPU usage.
> DPDK applications that currently use a mutex and later drop their outer mutex
> would suffer higher CPU usage with a spinlock (in the fast-hardware-PMD case).
> 
> And one more general question: are there DPDK applications today that use a
> spinlock around the rte flow APIs?

Since there are no other suggestions regarding the lock choice, maybe we can keep the default mutex lock in the rte flow functions.
For the fast-HW-device case Stephen described, such PMDs can set the RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag and take a spinlock internally, in place of the mutex at the rte flow level.
Similarly, if a PMD prefers another type of lock, it just sets the RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag and adds the needed lock internally.
The default mutex in the rte flow functions is thus easy to replace; a rough sketch of the pattern is below.
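
A minimal sketch of that pattern; pmd_flow_create_locked() is a
hypothetical helper, and this is illustration only, not part of the
patch:

/* In the PMD's device setup path: opt out of the rte flow mutex. */
eth_dev->data->dev_flags |= RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE;

/* In the PMD's flow ops, provide the exclusion with a spinlock. */
static rte_spinlock_t pmd_flow_lock = RTE_SPINLOCK_INITIALIZER;

static struct rte_flow *
pmd_flow_create(struct rte_eth_dev *dev,
		const struct rte_flow_attr *attr,
		const struct rte_flow_item pattern[],
		const struct rte_flow_action actions[],
		struct rte_flow_error *error)
{
	struct rte_flow *flow;

	rte_spinlock_lock(&pmd_flow_lock);
	/* pmd_flow_create_locked() is a hypothetical single-threaded helper. */
	flow = pmd_flow_create_locked(dev, attr, pattern, actions, error);
	rte_spinlock_unlock(&pmd_flow_lock);
	return flow;
}
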
How does that sound?

Thanks,
Suanming Mou

