DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH 1/3] config: use one single config option for C11 memory model
@ 2018-09-19 13:30 Phil Yang
  2018-09-19 13:30 ` [dpdk-dev] [PATCH 2/3] kni: fix kni fifo synchronization Phil Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:30 UTC (permalink / raw)
  To: dev
  Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu,
	Phil Yang

From: Phil Yang <phil.yang@arm.com>

Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
model, so all modules can leverage C11 atomic extension by enable this
option.

Fixes: 39368eb ("ring: introduce C11 memory model barrier option")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 config/arm/meson.build                       | 2 +-
 config/common_armv8a_linuxapp                | 2 +-
 config/common_base                           | 2 +-
 config/defconfig_arm64-thunderx-linuxapp-gcc | 2 +-
 lib/librte_ring/rte_ring.h                   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 94cca49..4b23b39 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -53,7 +53,7 @@ flags_cavium = [
 	['RTE_MAX_NUMA_NODES', 2],
 	['RTE_MAX_LCORE', 96],
 	['RTE_MAX_VFIO_GROUPS', 128],
-	['RTE_RING_USE_C11_MEM_MODEL', false]]
+	['RTE_USE_C11_MEM_MODEL', false]]
 flags_dpaa = [
 	['RTE_MACHINE', '"dpaa"'],
 	['RTE_CACHE_LINE_SIZE', 64],
diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp
index 111c005..54e6987 100644
--- a/config/common_armv8a_linuxapp
+++ b/config/common_armv8a_linuxapp
@@ -29,7 +29,7 @@ CONFIG_RTE_ARCH_ARM64_MEMCPY=n
 #CONFIG_RTE_ARM64_MEMCPY_ALIGN_MASK=0xF
 #CONFIG_RTE_ARM64_MEMCPY_STRICT_ALIGN=n
 
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+CONFIG_RTE_USE_C11_MEM_MODEL=y
 
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
diff --git a/config/common_base b/config/common_base
index 155c7d4..ccd2670 100644
--- a/config/common_base
+++ b/config/common_base
@@ -661,7 +661,7 @@ CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile librte_mempool
diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
index 2bed66c..f11e758 100644
--- a/config/defconfig_arm64-thunderx-linuxapp-gcc
+++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
@@ -10,7 +10,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=128
 CONFIG_RTE_MAX_NUMA_NODES=2
 CONFIG_RTE_MAX_LCORE=96
 CONFIG_RTE_MAX_VFIO_GROUPS=128
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile PMD for octeontx sso event device
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7a731d0..af5444a 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -303,11 +303,11 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * There are 2 choices for the users
  * 1.use rmb() memory barrier
  * 2.use one-direcion load_acquire/store_release barrier,defined by
- * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+ * CONFIG_RTE_USE_C11_MEM_MODEL=y
  * It depends on performance test results.
  * By default, move common functions to rte_ring_generic.h
  */
-#ifdef RTE_RING_USE_C11_MEM_MODEL
+#ifdef RTE_USE_C11_MEM_MODEL
 #include "rte_ring_c11_mem.h"
 #else
 #include "rte_ring_generic.h"
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH 2/3] kni: fix kni fifo synchronization
  2018-09-19 13:30 [dpdk-dev] [PATCH 1/3] config: use one single config option for C11 memory model Phil Yang
@ 2018-09-19 13:30 ` Phil Yang
  2018-09-19 13:30 ` [dpdk-dev] [PATCH 3/3] kni: fix kni kernel " Phil Yang
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
  2 siblings, 0 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:30 UTC (permalink / raw)
  To: dev
  Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu,
	Phil Yang

From: Phil Yang <phil.yang@arm.com>

With existing code in kni_fifo_put, rx_q values are not being updated
before updating fifo_write. While reading rx_q in kni_net_rx_normal,
This is causing the sync issue on other core. The same situation happens
in kni_fifo_get as well.

So syncing the values by adding C11 atomic memory barriers to make sure
the values being synced before updating fifo_write and fifo_read.

Fixes: 3fc5ca2 ("kni: initial import")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
 lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index cfa9448..1fd713b 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -54,8 +54,13 @@ struct rte_kni_request {
  * Writing should never overwrite the read position
  */
 struct rte_kni_fifo {
+#ifndef RTE_USE_C11_MEM_MODEL
 	volatile unsigned write;     /**< Next position to be written*/
 	volatile unsigned read;      /**< Next position to be read */
+#else
+	unsigned write;              /**< Next position to be written*/
+	unsigned read;               /**< Next position to be read */
+#endif
 	unsigned len;                /**< Circular buffer length */
 	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
 	void *volatile buffer[];     /**< The buffer contains mbuf pointers */
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index ac26a8c..f4171a1 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned fifo_write = fifo->write;
-	unsigned fifo_read = fifo->read;
 	unsigned new_write = fifo_write;
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_read = __atomic_load_n(&fifo->read,
+						 __ATOMIC_ACQUIRE);
+#else
+	unsigned fifo_read = fifo->read;
+#endif
 
 	for (i = 0; i < num; i++) {
 		new_write = (new_write + 1) & (fifo->len - 1);
@@ -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
+#ifdef RTE_USE_C11_MEM_MODEL
+	__atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
+#else
+	rte_smp_wmb();
 	fifo->write = fifo_write;
+#endif
 	return i;
 }
 
@@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned new_read = fifo->read;
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_write = __atomic_load_n(&fifo->write, __ATOMIC_ACQUIRE);
+#else
 	unsigned fifo_write = fifo->write;
+#endif
+
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
 			break;
@@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
+#ifdef RTE_USE_C11_MEM_MODEL
+	__atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
+#else
+	rte_smp_wmb();
 	fifo->read = new_read;
+#endif
 	return i;
 }
 
@@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_write = __atomic_load_n(&fifo->write,
+						  __ATOMIC_ACQUIRE);
+	unsigned fifo_read = __atomic_load_n(&fifo->read,
+						 __ATOMIC_ACQUIRE);
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
+#else
 	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+#endif
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH 3/3] kni: fix kni kernel fifo synchronization
  2018-09-19 13:30 [dpdk-dev] [PATCH 1/3] config: use one single config option for C11 memory model Phil Yang
  2018-09-19 13:30 ` [dpdk-dev] [PATCH 2/3] kni: fix kni fifo synchronization Phil Yang
@ 2018-09-19 13:30 ` Phil Yang
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
  2 siblings, 0 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:30 UTC (permalink / raw)
  To: dev
  Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu,
	Phil Yang

From: Phil Yang <phil.yang@arm.com>

Adding memory barrier to make sure the values being synced
before updating fifo_write in kni_fifo_put and fifo_read in
kni_fifo_get.

Fixes: 3fc5ca2 ("kni: initial import")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 kernel/linux/kni/kni_fifo.h                              | 16 ++++++++++------
 .../linuxapp/eal/include/exec-env/rte_kni_common.h       |  1 +
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/linux/kni/kni_fifo.h b/kernel/linux/kni/kni_fifo.h
index 9a4762d..2cb3a4a 100644
--- a/kernel/linux/kni/kni_fifo.h
+++ b/kernel/linux/kni/kni_fifo.h
@@ -16,7 +16,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t fifo_write = fifo->write;
-	uint32_t fifo_read = fifo->read;
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
 	uint32_t new_write = fifo_write;
 
 	for (i = 0; i < num; i++) {
@@ -27,7 +27,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
-	fifo->write = fifo_write;
+	smp_store_release(&fifo->write, fifo_write);
 
 	return i;
 }
@@ -40,7 +40,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t new_read = fifo->read;
-	uint32_t fifo_write = fifo->write;
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
 
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
@@ -49,7 +49,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
-	fifo->read = new_read;
+	smp_store_release(&fifo->read, new_read);
 
 	return i;
 }
@@ -60,7 +60,9 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
 }
 
 /**
@@ -69,7 +71,9 @@ kni_fifo_count(struct rte_kni_fifo *fifo)
 static inline uint32_t
 kni_fifo_free_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo_read - fifo_write - 1) & (fifo->len - 1);
 }
 
 #endif /* _KNI_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 1fd713b..5902ae4 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -8,6 +8,7 @@
 
 #ifdef __KERNEL__
 #include <linux/if.h>
+#include <asm/barrier.h>
 #define RTE_STD_C11
 #else
 #include <rte_common.h>
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model
  2018-09-19 13:30 [dpdk-dev] [PATCH 1/3] config: use one single config option for C11 memory model Phil Yang
  2018-09-19 13:30 ` [dpdk-dev] [PATCH 2/3] kni: fix kni fifo synchronization Phil Yang
  2018-09-19 13:30 ` [dpdk-dev] [PATCH 3/3] kni: fix kni kernel " Phil Yang
@ 2018-09-19 13:42 ` Phil Yang
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
                     ` (3 more replies)
  2 siblings, 4 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:42 UTC (permalink / raw)
  To: dev; +Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu

Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
model, so all modules can leverage C11 atomic extension by enable this
option.

Fixes: 39368eb ("ring: introduce C11 memory model barrier option")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 config/arm/meson.build                       | 2 +-
 config/common_armv8a_linuxapp                | 2 +-
 config/common_base                           | 2 +-
 config/defconfig_arm64-thunderx-linuxapp-gcc | 2 +-
 lib/librte_ring/rte_ring.h                   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 94cca49..4b23b39 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -53,7 +53,7 @@ flags_cavium = [
 	['RTE_MAX_NUMA_NODES', 2],
 	['RTE_MAX_LCORE', 96],
 	['RTE_MAX_VFIO_GROUPS', 128],
-	['RTE_RING_USE_C11_MEM_MODEL', false]]
+	['RTE_USE_C11_MEM_MODEL', false]]
 flags_dpaa = [
 	['RTE_MACHINE', '"dpaa"'],
 	['RTE_CACHE_LINE_SIZE', 64],
diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp
index 111c005..54e6987 100644
--- a/config/common_armv8a_linuxapp
+++ b/config/common_armv8a_linuxapp
@@ -29,7 +29,7 @@ CONFIG_RTE_ARCH_ARM64_MEMCPY=n
 #CONFIG_RTE_ARM64_MEMCPY_ALIGN_MASK=0xF
 #CONFIG_RTE_ARM64_MEMCPY_STRICT_ALIGN=n
 
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+CONFIG_RTE_USE_C11_MEM_MODEL=y
 
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
diff --git a/config/common_base b/config/common_base
index 155c7d4..ccd2670 100644
--- a/config/common_base
+++ b/config/common_base
@@ -661,7 +661,7 @@ CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile librte_mempool
diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
index 2bed66c..f11e758 100644
--- a/config/defconfig_arm64-thunderx-linuxapp-gcc
+++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
@@ -10,7 +10,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=128
 CONFIG_RTE_MAX_NUMA_NODES=2
 CONFIG_RTE_MAX_LCORE=96
 CONFIG_RTE_MAX_VFIO_GROUPS=128
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile PMD for octeontx sso event device
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7a731d0..af5444a 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -303,11 +303,11 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * There are 2 choices for the users
  * 1.use rmb() memory barrier
  * 2.use one-direcion load_acquire/store_release barrier,defined by
- * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+ * CONFIG_RTE_USE_C11_MEM_MODEL=y
  * It depends on performance test results.
  * By default, move common functions to rte_ring_generic.h
  */
-#ifdef RTE_RING_USE_C11_MEM_MODEL
+#ifdef RTE_USE_C11_MEM_MODEL
 #include "rte_ring_c11_mem.h"
 #else
 #include "rte_ring_generic.h"
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
@ 2018-09-19 13:42   ` Phil Yang
  2018-09-20  8:28     ` Jerin Jacob
  2018-09-26 11:45     ` Ferruh Yigit
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 3/3] kni: fix kni kernel " Phil Yang
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:42 UTC (permalink / raw)
  To: dev; +Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu

With existing code in kni_fifo_put, rx_q values are not being updated
before updating fifo_write. While reading rx_q in kni_net_rx_normal,
This is causing the sync issue on other core. The same situation happens
in kni_fifo_get as well.

So syncing the values by adding C11 atomic memory barriers to make sure
the values being synced before updating fifo_write and fifo_read.

Fixes: 3fc5ca2 ("kni: initial import")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
 lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index cfa9448..1fd713b 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -54,8 +54,13 @@ struct rte_kni_request {
  * Writing should never overwrite the read position
  */
 struct rte_kni_fifo {
+#ifndef RTE_USE_C11_MEM_MODEL
 	volatile unsigned write;     /**< Next position to be written*/
 	volatile unsigned read;      /**< Next position to be read */
+#else
+	unsigned write;              /**< Next position to be written*/
+	unsigned read;               /**< Next position to be read */
+#endif
 	unsigned len;                /**< Circular buffer length */
 	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
 	void *volatile buffer[];     /**< The buffer contains mbuf pointers */
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index ac26a8c..f4171a1 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned fifo_write = fifo->write;
-	unsigned fifo_read = fifo->read;
 	unsigned new_write = fifo_write;
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_read = __atomic_load_n(&fifo->read,
+						 __ATOMIC_ACQUIRE);
+#else
+	unsigned fifo_read = fifo->read;
+#endif
 
 	for (i = 0; i < num; i++) {
 		new_write = (new_write + 1) & (fifo->len - 1);
@@ -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
+#ifdef RTE_USE_C11_MEM_MODEL
+	__atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
+#else
+	rte_smp_wmb();
 	fifo->write = fifo_write;
+#endif
 	return i;
 }
 
@@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned new_read = fifo->read;
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_write = __atomic_load_n(&fifo->write, __ATOMIC_ACQUIRE);
+#else
 	unsigned fifo_write = fifo->write;
+#endif
+
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
 			break;
@@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
+#ifdef RTE_USE_C11_MEM_MODEL
+	__atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
+#else
+	rte_smp_wmb();
 	fifo->read = new_read;
+#endif
 	return i;
 }
 
@@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned fifo_write = __atomic_load_n(&fifo->write,
+						  __ATOMIC_ACQUIRE);
+	unsigned fifo_read = __atomic_load_n(&fifo->read,
+						 __ATOMIC_ACQUIRE);
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
+#else
 	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+#endif
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v2 3/3] kni: fix kni kernel fifo synchronization
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
@ 2018-09-19 13:42   ` Phil Yang
  2018-09-20  8:21   ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Jerin Jacob
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
  3 siblings, 0 replies; 30+ messages in thread
From: Phil Yang @ 2018-09-19 13:42 UTC (permalink / raw)
  To: dev; +Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu

Adding memory barrier to make sure the values being synced
before updating fifo_write in kni_fifo_put and fifo_read in
kni_fifo_get.

Fixes: 3fc5ca2 ("kni: initial import")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 kernel/linux/kni/kni_fifo.h                              | 16 ++++++++++------
 .../linuxapp/eal/include/exec-env/rte_kni_common.h       |  1 +
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/linux/kni/kni_fifo.h b/kernel/linux/kni/kni_fifo.h
index 9a4762d..2cb3a4a 100644
--- a/kernel/linux/kni/kni_fifo.h
+++ b/kernel/linux/kni/kni_fifo.h
@@ -16,7 +16,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t fifo_write = fifo->write;
-	uint32_t fifo_read = fifo->read;
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
 	uint32_t new_write = fifo_write;
 
 	for (i = 0; i < num; i++) {
@@ -27,7 +27,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
-	fifo->write = fifo_write;
+	smp_store_release(&fifo->write, fifo_write);
 
 	return i;
 }
@@ -40,7 +40,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t new_read = fifo->read;
-	uint32_t fifo_write = fifo->write;
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
 
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
@@ -49,7 +49,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
-	fifo->read = new_read;
+	smp_store_release(&fifo->read, new_read);
 
 	return i;
 }
@@ -60,7 +60,9 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
 }
 
 /**
@@ -69,7 +71,9 @@ kni_fifo_count(struct rte_kni_fifo *fifo)
 static inline uint32_t
 kni_fifo_free_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo_read - fifo_write - 1) & (fifo->len - 1);
 }
 
 #endif /* _KNI_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 1fd713b..5902ae4 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -8,6 +8,7 @@
 
 #ifdef __KERNEL__
 #include <linux/if.h>
+#include <asm/barrier.h>
 #define RTE_STD_C11
 #else
 #include <rte_common.h>
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 3/3] kni: fix kni kernel " Phil Yang
@ 2018-09-20  8:21   ` Jerin Jacob
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
  3 siblings, 0 replies; 30+ messages in thread
From: Jerin Jacob @ 2018-09-20  8:21 UTC (permalink / raw)
  To: Phil Yang; +Cc: dev, nd, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu

-----Original Message-----
> Date: Wed, 19 Sep 2018 21:42:38 +0800
> From: Phil Yang <phil.yang@arm.com>
> To: dev@dpdk.org
> CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
>  kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
>  Gavin.Hu@arm.com
> Subject: [PATCH v2 1/3] config: use one single config option for C11 memory
>  model
> X-Mailer: git-send-email 2.7.4
> 
> External Email
> 
> Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
> model, so all modules can leverage C11 atomic extension by enable this
> option.
> 
> Fixes: 39368eb ("ring: introduce C11 memory model barrier option")

IMO, Fixes is not required as you are not fixing anything in the
existing code.

Other than that, it looks good.

With above change:
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>


> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> ---
>  config/arm/meson.build                       | 2 +-
>  config/common_armv8a_linuxapp                | 2 +-
>  config/common_base                           | 2 +-
>  config/defconfig_arm64-thunderx-linuxapp-gcc | 2 +-
>  lib/librte_ring/rte_ring.h                   | 4 ++--
>  5 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 94cca49..4b23b39 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -53,7 +53,7 @@ flags_cavium = [
>         ['RTE_MAX_NUMA_NODES', 2],
>         ['RTE_MAX_LCORE', 96],
>         ['RTE_MAX_VFIO_GROUPS', 128],
> -       ['RTE_RING_USE_C11_MEM_MODEL', false]]
> +       ['RTE_USE_C11_MEM_MODEL', false]]
>  flags_dpaa = [
>         ['RTE_MACHINE', '"dpaa"'],
>         ['RTE_CACHE_LINE_SIZE', 64],
> diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp
> index 111c005..54e6987 100644
> --- a/config/common_armv8a_linuxapp
> +++ b/config/common_armv8a_linuxapp
> @@ -29,7 +29,7 @@ CONFIG_RTE_ARCH_ARM64_MEMCPY=n
>  #CONFIG_RTE_ARM64_MEMCPY_ALIGN_MASK=0xF
>  #CONFIG_RTE_ARM64_MEMCPY_STRICT_ALIGN=n
> 
> -CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
> +CONFIG_RTE_USE_C11_MEM_MODEL=y
> 
>  CONFIG_RTE_LIBRTE_FM10K_PMD=n
>  CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
> diff --git a/config/common_base b/config/common_base
> index 155c7d4..ccd2670 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -661,7 +661,7 @@ CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
>  # Compile librte_ring
>  #
>  CONFIG_RTE_LIBRTE_RING=y
> -CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
> +CONFIG_RTE_USE_C11_MEM_MODEL=n
> 
>  #
>  # Compile librte_mempool
> diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
> index 2bed66c..f11e758 100644
> --- a/config/defconfig_arm64-thunderx-linuxapp-gcc
> +++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
> @@ -10,7 +10,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=128
>  CONFIG_RTE_MAX_NUMA_NODES=2
>  CONFIG_RTE_MAX_LCORE=96
>  CONFIG_RTE_MAX_VFIO_GROUPS=128
> -CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
> +CONFIG_RTE_USE_C11_MEM_MODEL=n
> 
>  #
>  # Compile PMD for octeontx sso event device
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 7a731d0..af5444a 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -303,11 +303,11 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
>   * There are 2 choices for the users
>   * 1.use rmb() memory barrier
>   * 2.use one-direcion load_acquire/store_release barrier,defined by
> - * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
> + * CONFIG_RTE_USE_C11_MEM_MODEL=y
>   * It depends on performance test results.
>   * By default, move common functions to rte_ring_generic.h
>   */
> -#ifdef RTE_RING_USE_C11_MEM_MODEL
> +#ifdef RTE_USE_C11_MEM_MODEL
>  #include "rte_ring_c11_mem.h"
>  #else
>  #include "rte_ring_generic.h"
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
@ 2018-09-20  8:28     ` Jerin Jacob
  2018-09-20 15:20       ` Honnappa Nagarahalli
  2018-09-26 11:45     ` Ferruh Yigit
  1 sibling, 1 reply; 30+ messages in thread
From: Jerin Jacob @ 2018-09-20  8:28 UTC (permalink / raw)
  To: Phil Yang
  Cc: dev, nd, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu, ferruh.yigit

-----Original Message-----
> Date: Wed, 19 Sep 2018 21:42:39 +0800
> From: Phil Yang <phil.yang@arm.com>
> To: dev@dpdk.org
> CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
>  kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
>  Gavin.Hu@arm.com
> Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> X-Mailer: git-send-email 2.7.4
> 

+ Ferruh Yigit <ferruh.yigit@intel.com>

> 
> With existing code in kni_fifo_put, rx_q values are not being updated
> before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> This is causing the sync issue on other core. The same situation happens
> in kni_fifo_get as well.
> 
> So syncing the values by adding C11 atomic memory barriers to make sure
> the values being synced before updating fifo_write and fifo_read.
> 
> Fixes: 3fc5ca2 ("kni: initial import")
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> ---
>  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
>  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> index cfa9448..1fd713b 100644
> --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> @@ -54,8 +54,13 @@ struct rte_kni_request {
>   * Writing should never overwrite the read position
>   */
>  struct rte_kni_fifo {
> +#ifndef RTE_USE_C11_MEM_MODEL
>         volatile unsigned write;     /**< Next position to be written*/
>         volatile unsigned read;      /**< Next position to be read */
> +#else
> +       unsigned write;              /**< Next position to be written*/
> +       unsigned read;               /**< Next position to be read */
> +#endif
>         unsigned len;                /**< Circular buffer length */
>         unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
>         void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
> index ac26a8c..f4171a1 100644
> --- a/lib/librte_kni/rte_kni_fifo.h
> +++ b/lib/librte_kni/rte_kni_fifo.h
> @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  {
>         unsigned i = 0;
>         unsigned fifo_write = fifo->write;
> -       unsigned fifo_read = fifo->read;
>         unsigned new_write = fifo_write;
> +#ifdef RTE_USE_C11_MEM_MODEL
> +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> +                                                __ATOMIC_ACQUIRE);
> +#else
> +       unsigned fifo_read = fifo->read;
> +#endif

Correct.


> 
>         for (i = 0; i < num; i++) {
>                 new_write = (new_write + 1) & (fifo->len - 1);
> @@ -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
>                 fifo->buffer[fifo_write] = data[i];
>                 fifo_write = new_write;
>         }
> +#ifdef RTE_USE_C11_MEM_MODEL
> +       __atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> +#else
> +       rte_smp_wmb();
>         fifo->write = fifo_write;
> +#endif

Correct.
>         return i;
>  }
> 
> @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  {
>         unsigned i = 0;
>         unsigned new_read = fifo->read;
> +#ifdef RTE_USE_C11_MEM_MODEL
> +       unsigned fifo_write = __atomic_load_n(&fifo->write, __ATOMIC_ACQUIRE);
> +#else
>         unsigned fifo_write = fifo->write;
> +#endif

Correct.

> +
>         for (i = 0; i < num; i++) {
>                 if (new_read == fifo_write)
>                         break;
> @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
>                 data[i] = fifo->buffer[new_read];
>                 new_read = (new_read + 1) & (fifo->len - 1);
>         }
> +#ifdef RTE_USE_C11_MEM_MODEL
> +       __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> +#else
> +       rte_smp_wmb();
>         fifo->read = new_read;
> +#endif

Correct.

>         return i;
>  }
> 
> @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  static inline uint32_t
>  kni_fifo_count(struct rte_kni_fifo *fifo)
>  {
> +#ifdef RTE_USE_C11_MEM_MODEL
> +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> +                                                 __ATOMIC_ACQUIRE);
> +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> +                                                __ATOMIC_ACQUIRE);

Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple rte_smp_rmb() would
be enough here. Right?
or
Do we need __ATOMIC_ACQUIRE for fifo_write case?


Other than that, I prefer to avoid ifdef clutter by introducing two
separate file just like ring C11 implementation.

I don't have strong opinion on this this part, I let KNI MAINTAINER to
decide on how to accommodate this change.

> +       return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
> +#else
>         return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
> +#endif
>  }
> --
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-20  8:28     ` Jerin Jacob
@ 2018-09-20 15:20       ` Honnappa Nagarahalli
  2018-09-20 15:37         ` Jerin Jacob
  0 siblings, 1 reply; 30+ messages in thread
From: Honnappa Nagarahalli @ 2018-09-20 15:20 UTC (permalink / raw)
  To: Jerin Jacob, Phil Yang (Arm Technology China)
  Cc: dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China), ferruh.yigit

> -----Original Message-----
> > Date: Wed, 19 Sep 2018 21:42:39 +0800
> > From: Phil Yang <phil.yang@arm.com>
> > To: dev@dpdk.org
> > CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
> > kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
> > Gavin.Hu@arm.com
> > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> > X-Mailer: git-send-email 2.7.4
> >
> 
> + Ferruh Yigit <ferruh.yigit@intel.com>
> 
> >
> > With existing code in kni_fifo_put, rx_q values are not being updated
> > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > This is causing the sync issue on other core. The same situation
> > happens in kni_fifo_get as well.
> >
> > So syncing the values by adding C11 atomic memory barriers to make
> > sure the values being synced before updating fifo_write and fifo_read.
> >
> > Fixes: 3fc5ca2 ("kni: initial import")
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > ---
> >  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
> >  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
> >  2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git
> > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > index cfa9448..1fd713b 100644
> > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > @@ -54,8 +54,13 @@ struct rte_kni_request {
> >   * Writing should never overwrite the read position
> >   */
> >  struct rte_kni_fifo {
> > +#ifndef RTE_USE_C11_MEM_MODEL
> >         volatile unsigned write;     /**< Next position to be written*/
> >         volatile unsigned read;      /**< Next position to be read */
> > +#else
> > +       unsigned write;              /**< Next position to be written*/
> > +       unsigned read;               /**< Next position to be read */
> > +#endif
> >         unsigned len;                /**< Circular buffer length */
> >         unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
> >         void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > --- a/lib/librte_kni/rte_kni_fifo.h
> > +++ b/lib/librte_kni/rte_kni_fifo.h
> > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > **data, unsigned num)  {
> >         unsigned i = 0;
> >         unsigned fifo_write = fifo->write;
> > -       unsigned fifo_read = fifo->read;
> >         unsigned new_write = fifo_write;
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > +                                                __ATOMIC_ACQUIRE);
> > +#else
> > +       unsigned fifo_read = fifo->read; #endif
> 
> Correct.

My apologies, did not follow your comment here. Do you want us to correct anything here? '#endif' is not appearing on the correct line in the email, but it shows up fine on the patch work.

> 
> 
> >
> >         for (i = 0; i < num; i++) {
> >                 new_write = (new_write + 1) & (fifo->len - 1); @@
> > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data,
> unsigned num)
> >                 fifo->buffer[fifo_write] = data[i];
> >                 fifo_write = new_write;
> >         }
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +       __atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> > +#else
> > +       rte_smp_wmb();
> >         fifo->write = fifo_write;
> > +#endif
> 
> Correct.
> >         return i;
> >  }
> >
> > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > **data, unsigned num)  {
> >         unsigned i = 0;
> >         unsigned new_read = fifo->read;
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > +__ATOMIC_ACQUIRE); #else
> >         unsigned fifo_write = fifo->write;
> > +#endif
> 
> Correct.
> 
> > +
> >         for (i = 0; i < num; i++) {
> >                 if (new_read == fifo_write)
> >                         break;
> > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data,
> unsigned num)
> >                 data[i] = fifo->buffer[new_read];
> >                 new_read = (new_read + 1) & (fifo->len - 1);
> >         }
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +       __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> > +#else
> > +       rte_smp_wmb();
> >         fifo->read = new_read;
> > +#endif
> 
> Correct.
> 
> >         return i;
> >  }
> >
> > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > **data, unsigned num)  static inline uint32_t  kni_fifo_count(struct
> > rte_kni_fifo *fifo)  {
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > +                                                 __ATOMIC_ACQUIRE);
> > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > +                                                __ATOMIC_ACQUIRE);
> 
> Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple rte_smp_rmb()
> would be enough here. Right?
> or
> Do we need __ATOMIC_ACQUIRE for fifo_write case?
> 
We also had some amount of debate internally on this:
1) We do not want to use rte_smp_rmb() as we want to keep the memory models separated (for ex: while using C11, use C11 everywhere). It is also not sufficient, please see 3) below.
2) This API can get called from writer or reader, so both the loads have to be __ATOMIC_ACQUIRE
3) Other option is to use __ATOMIC_RELAXED. That would allow any loads/stores around of this API to get reordered, especially since this is an inline function. This would put burden on the application to manage the ordering depending on its usage. It will also require the application to understand the implementation of this API.

> 
> Other than that, I prefer to avoid ifdef clutter by introducing two separate file
> just like ring C11 implementation.
> 
> I don't have strong opinion on this this part, I let KNI MAINTAINER to decide
> on how to accommodate this change.

I prefer to change this as well, I am open for suggestions.
Introducing two separate files would be too much for this library. A better way would be to have something similar to 'smp_store_release' provided by the kernel. i.e. create #defines for loads/stores. Hide the clutter behind the #defines.

> 
> > +       return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
> > +#else
> >         return (fifo->len + fifo->write - fifo->read) & (fifo->len -
> > 1);
> > +#endif
> >  }
> > --
> > 2.7.4
> >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-20 15:20       ` Honnappa Nagarahalli
@ 2018-09-20 15:37         ` Jerin Jacob
  2018-09-21  5:48           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 30+ messages in thread
From: Jerin Jacob @ 2018-09-20 15:37 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit

-----Original Message-----
> Date: Thu, 20 Sep 2018 15:20:53 +0000
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Phil Yang (Arm
>  Technology China)" <Phil.Yang@arm.com>
> CC: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
>  "kkokkilagadda@caviumnetworks.com" <kkokkilagadda@caviumnetworks.com>,
>  "Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
>  "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
> Subject: RE: [PATCH v2 2/3] kni: fix kni fifo synchronization
> 
> 
> > -----Original Message-----
> > > Date: Wed, 19 Sep 2018 21:42:39 +0800
> > > From: Phil Yang <phil.yang@arm.com>
> > > To: dev@dpdk.org
> > > CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
> > > kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
> > > Gavin.Hu@arm.com
> > > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> > > X-Mailer: git-send-email 2.7.4
> > >
> >
> > + Ferruh Yigit <ferruh.yigit@intel.com>
> >
> > >
> > > With existing code in kni_fifo_put, rx_q values are not being updated
> > > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > > This is causing the sync issue on other core. The same situation
> > > happens in kni_fifo_get as well.
> > >
> > > So syncing the values by adding C11 atomic memory barriers to make
> > > sure the values being synced before updating fifo_write and fifo_read.
> > >
> > > Fixes: 3fc5ca2 ("kni: initial import")
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > > ---
> > >  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
> > >  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
> > >  2 files changed, 34 insertions(+), 1 deletion(-)
> > >
> > > diff --git
> > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > index cfa9448..1fd713b 100644
> > > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > @@ -54,8 +54,13 @@ struct rte_kni_request {
> > >   * Writing should never overwrite the read position
> > >   */
> > >  struct rte_kni_fifo {
> > > +#ifndef RTE_USE_C11_MEM_MODEL
> > >         volatile unsigned write;     /**< Next position to be written*/
> > >         volatile unsigned read;      /**< Next position to be read */
> > > +#else
> > > +       unsigned write;              /**< Next position to be written*/
> > > +       unsigned read;               /**< Next position to be read */
> > > +#endif
> > >         unsigned len;                /**< Circular buffer length */
> > >         unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
> > >         void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> > > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > > --- a/lib/librte_kni/rte_kni_fifo.h
> > > +++ b/lib/librte_kni/rte_kni_fifo.h
> > > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num)  {
> > >         unsigned i = 0;
> > >         unsigned fifo_write = fifo->write;
> > > -       unsigned fifo_read = fifo->read;
> > >         unsigned new_write = fifo_write;
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > +                                                __ATOMIC_ACQUIRE);
> > > +#else
> > > +       unsigned fifo_read = fifo->read; #endif
> >
> > Correct.
> 
> My apologies, did not follow your comment here. Do you want us to correct anything here? '#endif' is not appearing on the correct line in the email, but it shows up fine on the patch work.

No. What I meant is, code is correct.

> 
> >
> >
> > >
> > >         for (i = 0; i < num; i++) {
> > >                 new_write = (new_write + 1) & (fifo->len - 1); @@
> > > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data,
> > unsigned num)
> > >                 fifo->buffer[fifo_write] = data[i];
> > >                 fifo_write = new_write;
> > >         }
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +       __atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> > > +#else
> > > +       rte_smp_wmb();
> > >         fifo->write = fifo_write;
> > > +#endif
> >
> > Correct.
> > >         return i;
> > >  }
> > >
> > > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num)  {
> > >         unsigned i = 0;
> > >         unsigned new_read = fifo->read;
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > +__ATOMIC_ACQUIRE); #else
> > >         unsigned fifo_write = fifo->write;
> > > +#endif
> >
> > Correct.
> >
> > > +
> > >         for (i = 0; i < num; i++) {
> > >                 if (new_read == fifo_write)
> > >                         break;
> > > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data,
> > unsigned num)
> > >                 data[i] = fifo->buffer[new_read];
> > >                 new_read = (new_read + 1) & (fifo->len - 1);
> > >         }
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +       __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> > > +#else
> > > +       rte_smp_wmb();
> > >         fifo->read = new_read;
> > > +#endif
> >
> > Correct.
> >
> > >         return i;
> > >  }
> > >
> > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num)  static inline uint32_t  kni_fifo_count(struct
> > > rte_kni_fifo *fifo)  {
> > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > +                                                 __ATOMIC_ACQUIRE);
> > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > +                                                __ATOMIC_ACQUIRE);
> >
> > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple rte_smp_rmb()
> > would be enough here. Right?
> > or
> > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> >
> We also had some amount of debate internally on this:
> 1) We do not want to use rte_smp_rmb() as we want to keep the memory models separated (for ex: while using C11, use C11 everywhere). It is also not sufficient, please see 3) below.

But Nothing technically wrong in using rte_smp_rmb() here in terms
functionally and code generated by the compiler.

> 2) This API can get called from writer or reader, so both the loads have to be __ATOMIC_ACQUIRE
> 3) Other option is to use __ATOMIC_RELAXED. That would allow any loads/stores around of this API to get reordered, especially since this is an inline function. This would put burden on the application to manage the ordering depending on its usage. It will also require the application to understand the implementation of this API.

__ATOMIC_RELAXED may be fine too for _count() case as it may not very
important to get the exact count for the exact very moment, Application can
retry.

I am in favor of performance effective implementation.

> 
> >
> > Other than that, I prefer to avoid ifdef clutter by introducing two separate file
> > just like ring C11 implementation.
> >
> > I don't have strong opinion on this this part, I let KNI MAINTAINER to decide
> > on how to accommodate this change.
> 
> I prefer to change this as well, I am open for suggestions.
> Introducing two separate files would be too much for this library. A better way would be to have something similar to 'smp_store_release' provided by the kernel. i.e. create #defines for loads/stores. Hide the clutter behind the #defines.

No Strong opinion on this, leaving to KNI Maintainer.

This patch needs to split by two,
a) Fixes for non C11 implementation(i.e new addition to rte_smp_wmb())
b) add support for C11 implementation.

> 
> >
> > > +       return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
> > > +#else
> > >         return (fifo->len + fifo->write - fifo->read) & (fifo->len -
> > > 1);
> > > +#endif
> > >  }
> > > --
> > > 2.7.4
> > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-20 15:37         ` Jerin Jacob
@ 2018-09-21  5:48           ` Honnappa Nagarahalli
  2018-09-21  5:55             ` Jerin Jacob
  0 siblings, 1 reply; 30+ messages in thread
From: Honnappa Nagarahalli @ 2018-09-21  5:48 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit

> > > -----Original Message-----
> > > > Date: Wed, 19 Sep 2018 21:42:39 +0800
> > > > From: Phil Yang <phil.yang@arm.com>
> > > > To: dev@dpdk.org
> > > > CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
> > > > kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
> > > > Gavin.Hu@arm.com
> > > > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> > > > X-Mailer: git-send-email 2.7.4
> > > >
> > >
> > > + Ferruh Yigit <ferruh.yigit@intel.com>
> > >
> > > >
> > > > With existing code in kni_fifo_put, rx_q values are not being
> > > > updated before updating fifo_write. While reading rx_q in
> > > > kni_net_rx_normal, This is causing the sync issue on other core.
> > > > The same situation happens in kni_fifo_get as well.
> > > >
> > > > So syncing the values by adding C11 atomic memory barriers to make
> > > > sure the values being synced before updating fifo_write and fifo_read.
> > > >
> > > > Fixes: 3fc5ca2 ("kni: initial import")
> > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > > > ---
> > > >  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
> > > >  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
> > > >  2 files changed, 34 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git
> > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > index cfa9448..1fd713b 100644
> > > > ---
> > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.
> > > > +++ h
> > > > @@ -54,8 +54,13 @@ struct rte_kni_request {
> > > >   * Writing should never overwrite the read position
> > > >   */
> > > >  struct rte_kni_fifo {
> > > > +#ifndef RTE_USE_C11_MEM_MODEL
> > > >         volatile unsigned write;     /**< Next position to be written*/
> > > >         volatile unsigned read;      /**< Next position to be read */
> > > > +#else
> > > > +       unsigned write;              /**< Next position to be written*/
> > > > +       unsigned read;               /**< Next position to be read */
> > > > +#endif
> > > >         unsigned len;                /**< Circular buffer length */
> > > >         unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
> > > >         void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> > > > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > > > --- a/lib/librte_kni/rte_kni_fifo.h
> > > > +++ b/lib/librte_kni/rte_kni_fifo.h
> > > > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > > **data, unsigned num)  {
> > > >         unsigned i = 0;
> > > >         unsigned fifo_write = fifo->write;
> > > > -       unsigned fifo_read = fifo->read;
> > > >         unsigned new_write = fifo_write;
> > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > +
> > > > +__ATOMIC_ACQUIRE); #else
> > > > +       unsigned fifo_read = fifo->read; #endif
> > >
> > > Correct.
> >
> > My apologies, did not follow your comment here. Do you want us to correct
> anything here? '#endif' is not appearing on the correct line in the email, but it
> shows up fine on the patch work.
> 
> No. What I meant is, code is correct.
> 
> >
> > >
> > >
> > > >
> > > >         for (i = 0; i < num; i++) {
> > > >                 new_write = (new_write + 1) & (fifo->len - 1); @@
> > > > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > > **data,
> > > unsigned num)
> > > >                 fifo->buffer[fifo_write] = data[i];
> > > >                 fifo_write = new_write;
> > > >         }
> > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > +       __atomic_store_n(&fifo->write, fifo_write,
> > > > +__ATOMIC_RELEASE); #else
> > > > +       rte_smp_wmb();
> > > >         fifo->write = fifo_write;
> > > > +#endif
> > >
> > > Correct.
> > > >         return i;
> > > >  }
> > > >
> > > > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > **data, unsigned num)  {
> > > >         unsigned i = 0;
> > > >         unsigned new_read = fifo->read;
> > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > +__ATOMIC_ACQUIRE); #else
> > > >         unsigned fifo_write = fifo->write;
> > > > +#endif
> > >
> > > Correct.
> > >
> > > > +
> > > >         for (i = 0; i < num; i++) {
> > > >                 if (new_read == fifo_write)
> > > >                         break;
> > > > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > **data,
> > > unsigned num)
> > > >                 data[i] = fifo->buffer[new_read];
> > > >                 new_read = (new_read + 1) & (fifo->len - 1);
> > > >         }
> > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > +       __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> > > > +#else
> > > > +       rte_smp_wmb();
> > > >         fifo->read = new_read;
> > > > +#endif
> > >
> > > Correct.
> > >
> > > >         return i;
> > > >  }
> > > >
> > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > **data, unsigned num)  static inline uint32_t
> > > > kni_fifo_count(struct rte_kni_fifo *fifo)  {
> > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > +                                                 __ATOMIC_ACQUIRE);
> > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > +
> > > > +__ATOMIC_ACQUIRE);
> > >
> > > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> > > rte_smp_rmb() would be enough here. Right?
> > > or
> > > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> > >
> > We also had some amount of debate internally on this:
> > 1) We do not want to use rte_smp_rmb() as we want to keep the memory
> models separated (for ex: while using C11, use C11 everywhere). It is also not
> sufficient, please see 3) below.
> 
> But Nothing technically wrong in using rte_smp_rmb() here in terms
> functionally and code generated by the compiler.

rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal. 'LDAR' is a better option which is generated when C11 atomics are used.

> 
> > 2) This API can get called from writer or reader, so both the loads
> > have to be __ATOMIC_ACQUIRE
> > 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> loads/stores around of this API to get reordered, especially since this is an
> inline function. This would put burden on the application to manage the
> ordering depending on its usage. It will also require the application to
> understand the implementation of this API.
> 
> __ATOMIC_RELAXED may be fine too for _count() case as it may not very
> important to get the exact count for the exact very moment, Application can
> retry.
> 
> I am in favor of performance effective implementation.

The requirement on the correctness of the count depends on the usage of this function. I see the following usage:

In the file kni_net.c, function: kni_net_tx:

       if (kni_fifo_free_count(kni->tx_q) == 0 ||                              
                        kni_fifo_count(kni->alloc_q) == 0) {                    
                /**                                                             
                 * If no free entry in tx_q or no entry in alloc_q,             
                 * drops skb and goes out.                                      
                 */                                                             
                goto drop;                                                      
        }

There is no retry here, the packet is dropped.

> 
> >
> > >
> > > Other than that, I prefer to avoid ifdef clutter by introducing two
> > > separate file just like ring C11 implementation.
> > >
> > > I don't have strong opinion on this this part, I let KNI MAINTAINER
> > > to decide on how to accommodate this change.
> >
> > I prefer to change this as well, I am open for suggestions.
> > Introducing two separate files would be too much for this library. A better
> way would be to have something similar to 'smp_store_release' provided by
> the kernel. i.e. create #defines for loads/stores. Hide the clutter behind the
> #defines.
> 
> No Strong opinion on this, leaving to KNI Maintainer.
Will wait on this before re-spinning the patch

> 
> This patch needs to split by two,
> a) Fixes for non C11 implementation(i.e new addition to rte_smp_wmb())
> b) add support for C11 implementation.
Agree

> 
> >
> > >
> > > > +       return (fifo->len + fifo_write - fifo_read) & (fifo->len -
> > > > +1); #else
> > > >         return (fifo->len + fifo->write - fifo->read) & (fifo->len
> > > > - 1);
> > > > +#endif
> > > >  }
> > > > --
> > > > 2.7.4
> > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-21  5:48           ` Honnappa Nagarahalli
@ 2018-09-21  5:55             ` Jerin Jacob
  2018-09-21  6:37               ` Honnappa Nagarahalli
  0 siblings, 1 reply; 30+ messages in thread
From: Jerin Jacob @ 2018-09-21  5:55 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit

-----Original Message-----
> Date: Fri, 21 Sep 2018 05:48:44 +0000
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> CC: "Phil Yang (Arm Technology China)" <Phil.Yang@arm.com>, "dev@dpdk.org"
>  <dev@dpdk.org>, nd <nd@arm.com>, "kkokkilagadda@caviumnetworks.com"
>  <kkokkilagadda@caviumnetworks.com>, "Gavin Hu (Arm Technology China)"
>  <Gavin.Hu@arm.com>, "ferruh.yigit@intel.com" <ferruh.yigit@intel.com>
> Subject: RE: [PATCH v2 2/3] kni: fix kni fifo synchronization
> 
> 
> > > > -----Original Message-----
> > > > > Date: Wed, 19 Sep 2018 21:42:39 +0800
> > > > > From: Phil Yang <phil.yang@arm.com>
> > > > > To: dev@dpdk.org
> > > > > CC: nd@arm.com, jerin.jacob@caviumnetworks.com,
> > > > > kkokkilagadda@caviumnetworks.com, Honnappa.Nagarahalli@arm.com,
> > > > > Gavin.Hu@arm.com
> > > > > Subject: [PATCH v2 2/3] kni: fix kni fifo synchronization
> > > > > X-Mailer: git-send-email 2.7.4
> > > > >
> > > >
> > > > + Ferruh Yigit <ferruh.yigit@intel.com>
> > > >
> > > > >
> > > > > With existing code in kni_fifo_put, rx_q values are not being
> > > > > updated before updating fifo_write. While reading rx_q in
> > > > > kni_net_rx_normal, This is causing the sync issue on other core.
> > > > > The same situation happens in kni_fifo_get as well.
> > > > >
> > > > > So syncing the values by adding C11 atomic memory barriers to make
> > > > > sure the values being synced before updating fifo_write and fifo_read.
> > > > >
> > > > > Fixes: 3fc5ca2 ("kni: initial import")
> > > > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > > > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > > > > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > > > > ---
> > > > >  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
> > > > >  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
> > > > >  2 files changed, 34 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git
> > > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > > index cfa9448..1fd713b 100644
> > > > > ---
> > > > > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > > > > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.
> > > > > +++ h
> > > > > @@ -54,8 +54,13 @@ struct rte_kni_request {
> > > > >   * Writing should never overwrite the read position
> > > > >   */
> > > > >  struct rte_kni_fifo {
> > > > > +#ifndef RTE_USE_C11_MEM_MODEL
> > > > >         volatile unsigned write;     /**< Next position to be written*/
> > > > >         volatile unsigned read;      /**< Next position to be read */
> > > > > +#else
> > > > > +       unsigned write;              /**< Next position to be written*/
> > > > > +       unsigned read;               /**< Next position to be read */
> > > > > +#endif
> > > > >         unsigned len;                /**< Circular buffer length */
> > > > >         unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
> > > > >         void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> > > > > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > > > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > > > > --- a/lib/librte_kni/rte_kni_fifo.h
> > > > > +++ b/lib/librte_kni/rte_kni_fifo.h
> > > > > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > > > **data, unsigned num)  {
> > > > >         unsigned i = 0;
> > > > >         unsigned fifo_write = fifo->write;
> > > > > -       unsigned fifo_read = fifo->read;
> > > > >         unsigned new_write = fifo_write;
> > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > > +
> > > > > +__ATOMIC_ACQUIRE); #else
> > > > > +       unsigned fifo_read = fifo->read; #endif
> > > >
> > > > Correct.
> > >
> > > My apologies, did not follow your comment here. Do you want us to correct
> > anything here? '#endif' is not appearing on the correct line in the email, but it
> > shows up fine on the patch work.
> >
> > No. What I meant is, code is correct.
> >
> > >
> > > >
> > > >
> > > > >
> > > > >         for (i = 0; i < num; i++) {
> > > > >                 new_write = (new_write + 1) & (fifo->len - 1); @@
> > > > > -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > > > **data,
> > > > unsigned num)
> > > > >                 fifo->buffer[fifo_write] = data[i];
> > > > >                 fifo_write = new_write;
> > > > >         }
> > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > +       __atomic_store_n(&fifo->write, fifo_write,
> > > > > +__ATOMIC_RELEASE); #else
> > > > > +       rte_smp_wmb();
> > > > >         fifo->write = fifo_write;
> > > > > +#endif
> > > >
> > > > Correct.
> > > > >         return i;
> > > > >  }
> > > > >
> > > > > @@ -51,7 +61,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > > **data, unsigned num)  {
> > > > >         unsigned i = 0;
> > > > >         unsigned new_read = fifo->read;
> > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > > +__ATOMIC_ACQUIRE); #else
> > > > >         unsigned fifo_write = fifo->write;
> > > > > +#endif
> > > >
> > > > Correct.
> > > >
> > > > > +
> > > > >         for (i = 0; i < num; i++) {
> > > > >                 if (new_read == fifo_write)
> > > > >                         break;
> > > > > @@ -59,7 +74,12 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > > **data,
> > > > unsigned num)
> > > > >                 data[i] = fifo->buffer[new_read];
> > > > >                 new_read = (new_read + 1) & (fifo->len - 1);
> > > > >         }
> > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > +       __atomic_store_n(&fifo->read, new_read, __ATOMIC_RELEASE);
> > > > > +#else
> > > > > +       rte_smp_wmb();
> > > > >         fifo->read = new_read;
> > > > > +#endif
> > > >
> > > > Correct.
> > > >
> > > > >         return i;
> > > > >  }
> > > > >
> > > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> > > > > **data, unsigned num)  static inline uint32_t
> > > > > kni_fifo_count(struct rte_kni_fifo *fifo)  {
> > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > > +                                                 __ATOMIC_ACQUIRE);
> > > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > > +
> > > > > +__ATOMIC_ACQUIRE);
> > > >
> > > > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> > > > rte_smp_rmb() would be enough here. Right?
> > > > or
> > > > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> > > >
> > > We also had some amount of debate internally on this:
> > > 1) We do not want to use rte_smp_rmb() as we want to keep the memory
> > models separated (for ex: while using C11, use C11 everywhere). It is also not
> > sufficient, please see 3) below.
> >
> > But Nothing technically wrong in using rte_smp_rmb() here in terms
> > functionally and code generated by the compiler.
> 
> rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal. 'LDAR' is a better option which is generated when C11 atomics are used.

Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?

> 
> >
> > > 2) This API can get called from writer or reader, so both the loads
> > > have to be __ATOMIC_ACQUIRE
> > > 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> > loads/stores around of this API to get reordered, especially since this is an
> > inline function. This would put burden on the application to manage the
> > ordering depending on its usage. It will also require the application to
> > understand the implementation of this API.
> >
> > __ATOMIC_RELAXED may be fine too for _count() case as it may not very
> > important to get the exact count for the exact very moment, Application can
> > retry.
> >
> > I am in favor of performance effective implementation.
> 
> The requirement on the correctness of the count depends on the usage of this function. I see the following usage:
> 
> In the file kni_net.c, function: kni_net_tx:
> 
>        if (kni_fifo_free_count(kni->tx_q) == 0 ||
>                         kni_fifo_count(kni->alloc_q) == 0) {
>                 /**
>                  * If no free entry in tx_q or no entry in alloc_q,
>                  * drops skb and goes out.
>                  */
>                 goto drop;
>         }
> 
> There is no retry here, the packet is dropped.

OK. Then pick an implementation which is an optimal this case.
I think, then rte_smp_rmb() makes sense here as
a) no #ifdef clutter
b) it is optimal compared to 2 x LDAR


> 
> >
> > >
> > > >
> > > > Other than that, I prefer to avoid ifdef clutter by introducing two
> > > > separate file just like ring C11 implementation.
> > > >
> > > > I don't have strong opinion on this this part, I let KNI MAINTAINER
> > > > to decide on how to accommodate this change.
> > >
> > > I prefer to change this as well, I am open for suggestions.
> > > Introducing two separate files would be too much for this library. A better
> > way would be to have something similar to 'smp_store_release' provided by
> > the kernel. i.e. create #defines for loads/stores. Hide the clutter behind the
> > #defines.
> >
> > No Strong opinion on this, leaving to KNI Maintainer.
> Will wait on this before re-spinning the patch
> 
> >
> > This patch needs to split by two,
> > a) Fixes for non C11 implementation(i.e new addition to rte_smp_wmb())
> > b) add support for C11 implementation.
> Agree
> 
> >
> > >
> > > >
> > > > > +       return (fifo->len + fifo_write - fifo_read) & (fifo->len -
> > > > > +1); #else
> > > > >         return (fifo->len + fifo->write - fifo->read) & (fifo->len
> > > > > - 1);
> > > > > +#endif
> > > > >  }
> > > > > --
> > > > > 2.7.4
> > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-21  5:55             ` Jerin Jacob
@ 2018-09-21  6:37               ` Honnappa Nagarahalli
  2018-09-21  9:00                 ` Phil Yang (Arm Technology China)
                                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Honnappa Nagarahalli @ 2018-09-21  6:37 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit

> > > > > >
> > > > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo,
> > > > > > void **data, unsigned num)  static inline uint32_t
> > > > > > kni_fifo_count(struct rte_kni_fifo *fifo)  {
> > > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > > > +                                                 __ATOMIC_ACQUIRE);
> > > > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > > > +
> > > > > > +__ATOMIC_ACQUIRE);
> > > > >
> > > > > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> > > > > rte_smp_rmb() would be enough here. Right?
> > > > > or
> > > > > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> > > > >
> > > > We also had some amount of debate internally on this:
> > > > 1) We do not want to use rte_smp_rmb() as we want to keep the
> > > > memory
> > > models separated (for ex: while using C11, use C11 everywhere). It
> > > is also not sufficient, please see 3) below.
> > >
> > > But Nothing technically wrong in using rte_smp_rmb() here in terms
> > > functionally and code generated by the compiler.
> >
> > rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal.
> 'LDAR' is a better option which is generated when C11 atomics are used.
> 
> Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?

Good point. I am not sure which one is optimal, it needs to be measured. 'DMB ISHLD' orders 'all' earlier loads against 'all' later loads and stores. 'LDAR' orders the 'specific' load with 'all' later loads and stores.

> 
> >
> > >
> > > > 2) This API can get called from writer or reader, so both the
> > > > loads have to be __ATOMIC_ACQUIRE
> > > > 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> > > loads/stores around of this API to get reordered, especially since
> > > this is an inline function. This would put burden on the application
> > > to manage the ordering depending on its usage. It will also require
> > > the application to understand the implementation of this API.
> > >
> > > __ATOMIC_RELAXED may be fine too for _count() case as it may not
> > > very important to get the exact count for the exact very moment,
> > > Application can retry.
> > >
> > > I am in favor of performance effective implementation.
> >
> > The requirement on the correctness of the count depends on the usage of
> this function. I see the following usage:
> >
> > In the file kni_net.c, function: kni_net_tx:
> >
> >        if (kni_fifo_free_count(kni->tx_q) == 0 ||
> >                         kni_fifo_count(kni->alloc_q) == 0) {
> >                 /**
> >                  * If no free entry in tx_q or no entry in alloc_q,
> >                  * drops skb and goes out.
> >                  */
> >                 goto drop;
> >         }
> >
> > There is no retry here, the packet is dropped.
> 
> OK. Then pick an implementation which is an optimal this case.
> I think, then rte_smp_rmb() makes sense here as
> a) no #ifdef clutter
> b) it is optimal compared to 2 x LDAR
> 
As I understand, one of the principals of using C11 model is to match the store releases and load acquires. IMO, combining C11 memory model with barrier based functions makes the code unreadable.
I realized rte_smp_rmb() is required for x86 as well to prevent compiler reordering. We can add that in the non-C11 case. This way, we will have clean code for both the options (similar to rte_ring).
So, if 'RTE_USE_C11_MEM_MODEL' is set to 'n', then the 'rte_smp_rmb' would be used.

We can look at handling the #ifdef clutter based on Ferruh's feedback.

> 
> >
> > >
> > > >
> > > > >
> > > > > Other than that, I prefer to avoid ifdef clutter by introducing
> > > > > two separate file just like ring C11 implementation.
> > > > >
> > > > > I don't have strong opinion on this this part, I let KNI
> > > > > MAINTAINER to decide on how to accommodate this change.
> > > >
> > > > I prefer to change this as well, I am open for suggestions.
> > > > Introducing two separate files would be too much for this library.
> > > > A better
> > > way would be to have something similar to 'smp_store_release'
> > > provided by the kernel. i.e. create #defines for loads/stores. Hide
> > > the clutter behind the #defines.
> > >
> > > No Strong opinion on this, leaving to KNI Maintainer.
> > Will wait on this before re-spinning the patch
> >
> > >
> > > This patch needs to split by two,
> > > a) Fixes for non C11 implementation(i.e new addition to
> > > rte_smp_wmb())
> > > b) add support for C11 implementation.
> > Agree
> >
> > >
> > > >
> > > > >
> > > > > > +       return (fifo->len + fifo_write - fifo_read) &
> > > > > > +(fifo->len - 1); #else
> > > > > >         return (fifo->len + fifo->write - fifo->read) &
> > > > > > (fifo->len
> > > > > > - 1);
Requires rte_smp_rmb() for x86 to prevent compiler reordering.

> > > > > > +#endif
> > > > > >  }
> > > > > > --
> > > > > > 2.7.4
> > > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-21  6:37               ` Honnappa Nagarahalli
@ 2018-09-21  9:00                 ` Phil Yang (Arm Technology China)
  2018-09-25  4:44                 ` Honnappa Nagarahalli
  2018-09-26 11:42                 ` Ferruh Yigit
  2 siblings, 0 replies; 30+ messages in thread
From: Phil Yang (Arm Technology China) @ 2018-09-21  9:00 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Jerin Jacob
  Cc: dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit, Ola Liljedahl

+ Ola Liljedahl <Ola.Liljedahl@arm.com>

Thanks,
Phil Yang

> -----Original Message-----
> From: Honnappa Nagarahalli
> Sent: Friday, September 21, 2018 2:37 PM
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Cc: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; dev@dpdk.org; nd
> <nd@arm.com>; kkokkilagadda@caviumnetworks.com; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>; ferruh.yigit@intel.com
> Subject: RE: [PATCH v2 2/3] kni: fix kni fifo synchronization
> 
> > > > > > >
> > > > > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo,
> > > > > > > void **data, unsigned num)  static inline uint32_t
> > > > > > > kni_fifo_count(struct rte_kni_fifo *fifo)  {
> > > > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > > > > +                                                 __ATOMIC_ACQUIRE);
> > > > > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > > > > +
> > > > > > > +__ATOMIC_ACQUIRE);
> > > > > >
> > > > > > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> > > > > > rte_smp_rmb() would be enough here. Right?
> > > > > > or
> > > > > > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> > > > > >
> > > > > We also had some amount of debate internally on this:
> > > > > 1) We do not want to use rte_smp_rmb() as we want to keep the
> > > > > memory
> > > > models separated (for ex: while using C11, use C11 everywhere). It
> > > > is also not sufficient, please see 3) below.
> > > >
> > > > But Nothing technically wrong in using rte_smp_rmb() here in terms
> > > > functionally and code generated by the compiler.
> > >
> > > rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal.
> > 'LDAR' is a better option which is generated when C11 atomics are used.
> >
> > Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?
> 
> Good point. I am not sure which one is optimal, it needs to be measured. 'DMB
> ISHLD' orders 'all' earlier loads against 'all' later loads and stores. 'LDAR' orders
> the 'specific' load with 'all' later loads and stores.
> 
> >
> > >
> > > >
> > > > > 2) This API can get called from writer or reader, so both the
> > > > > loads have to be __ATOMIC_ACQUIRE
> > > > > 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> > > > loads/stores around of this API to get reordered, especially since
> > > > this is an inline function. This would put burden on the
> > > > application to manage the ordering depending on its usage. It will
> > > > also require the application to understand the implementation of this API.
> > > >
> > > > __ATOMIC_RELAXED may be fine too for _count() case as it may not
> > > > very important to get the exact count for the exact very moment,
> > > > Application can retry.
> > > >
> > > > I am in favor of performance effective implementation.
> > >
> > > The requirement on the correctness of the count depends on the usage
> > > of
> > this function. I see the following usage:
> > >
> > > In the file kni_net.c, function: kni_net_tx:
> > >
> > >        if (kni_fifo_free_count(kni->tx_q) == 0 ||
> > >                         kni_fifo_count(kni->alloc_q) == 0) {
> > >                 /**
> > >                  * If no free entry in tx_q or no entry in alloc_q,
> > >                  * drops skb and goes out.
> > >                  */
> > >                 goto drop;
> > >         }
> > >
> > > There is no retry here, the packet is dropped.
> >
> > OK. Then pick an implementation which is an optimal this case.
> > I think, then rte_smp_rmb() makes sense here as
> > a) no #ifdef clutter
> > b) it is optimal compared to 2 x LDAR
> >
> As I understand, one of the principals of using C11 model is to match the store
> releases and load acquires. IMO, combining C11 memory model with barrier
> based functions makes the code unreadable.
> I realized rte_smp_rmb() is required for x86 as well to prevent compiler
> reordering. We can add that in the non-C11 case. This way, we will have clean
> code for both the options (similar to rte_ring).
> So, if 'RTE_USE_C11_MEM_MODEL' is set to 'n', then the 'rte_smp_rmb' would
> be used.
> 
> We can look at handling the #ifdef clutter based on Ferruh's feedback.
> 
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > Other than that, I prefer to avoid ifdef clutter by
> > > > > > introducing two separate file just like ring C11 implementation.
> > > > > >
> > > > > > I don't have strong opinion on this this part, I let KNI
> > > > > > MAINTAINER to decide on how to accommodate this change.
> > > > >
> > > > > I prefer to change this as well, I am open for suggestions.
> > > > > Introducing two separate files would be too much for this library.
> > > > > A better
> > > > way would be to have something similar to 'smp_store_release'
> > > > provided by the kernel. i.e. create #defines for loads/stores.
> > > > Hide the clutter behind the #defines.
> > > >
> > > > No Strong opinion on this, leaving to KNI Maintainer.
> > > Will wait on this before re-spinning the patch
> > >
> > > >
> > > > This patch needs to split by two,
> > > > a) Fixes for non C11 implementation(i.e new addition to
> > > > rte_smp_wmb())
> > > > b) add support for C11 implementation.
> > > Agree
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > +       return (fifo->len + fifo_write - fifo_read) &
> > > > > > > +(fifo->len - 1); #else
> > > > > > >         return (fifo->len + fifo->write - fifo->read) &
> > > > > > > (fifo->len
> > > > > > > - 1);
> Requires rte_smp_rmb() for x86 to prevent compiler reordering.
> 
> > > > > > > +#endif
> > > > > > >  }
> > > > > > > --
> > > > > > > 2.7.4
> > > > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-21  6:37               ` Honnappa Nagarahalli
  2018-09-21  9:00                 ` Phil Yang (Arm Technology China)
@ 2018-09-25  4:44                 ` Honnappa Nagarahalli
  2018-09-26 11:42                 ` Ferruh Yigit
  2 siblings, 0 replies; 30+ messages in thread
From: Honnappa Nagarahalli @ 2018-09-25  4:44 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Jerin Jacob
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China),
	ferruh.yigit

Hi Ferruh,
	Just wanted to get your attention to this conversation. Appreciate your feedback on handling the #ifdef clutter.

Thank you,
Honnappa

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Honnappa Nagarahalli
> Sent: Friday, September 21, 2018 1:37 AM
> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> Cc: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; dev@dpdk.org;
> nd <nd@arm.com>; kkokkilagadda@caviumnetworks.com; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>; ferruh.yigit@intel.com
> Subject: Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
> 
> > > > > > >
> > > > > > > @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo,
> > > > > > > void **data, unsigned num)  static inline uint32_t
> > > > > > > kni_fifo_count(struct rte_kni_fifo *fifo)  {
> > > > > > > +#ifdef RTE_USE_C11_MEM_MODEL
> > > > > > > +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> > > > > > > +                                                 __ATOMIC_ACQUIRE);
> > > > > > > +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> > > > > > > +
> > > > > > > +__ATOMIC_ACQUIRE);
> > > > > >
> > > > > > Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> > > > > > rte_smp_rmb() would be enough here. Right?
> > > > > > or
> > > > > > Do we need __ATOMIC_ACQUIRE for fifo_write case?
> > > > > >
> > > > > We also had some amount of debate internally on this:
> > > > > 1) We do not want to use rte_smp_rmb() as we want to keep the
> > > > > memory
> > > > models separated (for ex: while using C11, use C11 everywhere). It
> > > > is also not sufficient, please see 3) below.
> > > >
> > > > But Nothing technically wrong in using rte_smp_rmb() here in terms
> > > > functionally and code generated by the compiler.
> > >
> > > rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not
> optimal.
> > 'LDAR' is a better option which is generated when C11 atomics are used.
> >
> > Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?
> 
> Good point. I am not sure which one is optimal, it needs to be measured.
> 'DMB ISHLD' orders 'all' earlier loads against 'all' later loads and stores.
> 'LDAR' orders the 'specific' load with 'all' later loads and stores.
> 
> >
> > >
> > > >
> > > > > 2) This API can get called from writer or reader, so both the
> > > > > loads have to be __ATOMIC_ACQUIRE
> > > > > 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> > > > loads/stores around of this API to get reordered, especially since
> > > > this is an inline function. This would put burden on the
> > > > application to manage the ordering depending on its usage. It will
> > > > also require the application to understand the implementation of this
> API.
> > > >
> > > > __ATOMIC_RELAXED may be fine too for _count() case as it may not
> > > > very important to get the exact count for the exact very moment,
> > > > Application can retry.
> > > >
> > > > I am in favor of performance effective implementation.
> > >
> > > The requirement on the correctness of the count depends on the usage
> > > of
> > this function. I see the following usage:
> > >
> > > In the file kni_net.c, function: kni_net_tx:
> > >
> > >        if (kni_fifo_free_count(kni->tx_q) == 0 ||
> > >                         kni_fifo_count(kni->alloc_q) == 0) {
> > >                 /**
> > >                  * If no free entry in tx_q or no entry in alloc_q,
> > >                  * drops skb and goes out.
> > >                  */
> > >                 goto drop;
> > >         }
> > >
> > > There is no retry here, the packet is dropped.
> >
> > OK. Then pick an implementation which is an optimal this case.
> > I think, then rte_smp_rmb() makes sense here as
> > a) no #ifdef clutter
> > b) it is optimal compared to 2 x LDAR
> >
> As I understand, one of the principals of using C11 model is to match the store
> releases and load acquires. IMO, combining C11 memory model with barrier
> based functions makes the code unreadable.
> I realized rte_smp_rmb() is required for x86 as well to prevent compiler
> reordering. We can add that in the non-C11 case. This way, we will have clean
> code for both the options (similar to rte_ring).
> So, if 'RTE_USE_C11_MEM_MODEL' is set to 'n', then the 'rte_smp_rmb'
> would be used.
> 
> We can look at handling the #ifdef clutter based on Ferruh's feedback.
> 
> >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > Other than that, I prefer to avoid ifdef clutter by
> > > > > > introducing two separate file just like ring C11 implementation.
> > > > > >
> > > > > > I don't have strong opinion on this this part, I let KNI
> > > > > > MAINTAINER to decide on how to accommodate this change.
> > > > >
> > > > > I prefer to change this as well, I am open for suggestions.
> > > > > Introducing two separate files would be too much for this library.
> > > > > A better
> > > > way would be to have something similar to 'smp_store_release'
> > > > provided by the kernel. i.e. create #defines for loads/stores.
> > > > Hide the clutter behind the #defines.
> > > >
> > > > No Strong opinion on this, leaving to KNI Maintainer.
> > > Will wait on this before re-spinning the patch
> > >
> > > >
> > > > This patch needs to split by two,
> > > > a) Fixes for non C11 implementation(i.e new addition to
> > > > rte_smp_wmb())
> > > > b) add support for C11 implementation.
> > > Agree
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > +       return (fifo->len + fifo_write - fifo_read) &
> > > > > > > +(fifo->len - 1); #else
> > > > > > >         return (fifo->len + fifo->write - fifo->read) &
> > > > > > > (fifo->len
> > > > > > > - 1);
> Requires rte_smp_rmb() for x86 to prevent compiler reordering.
> 
> > > > > > > +#endif
> > > > > > >  }
> > > > > > > --
> > > > > > > 2.7.4
> > > > > > >

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-21  6:37               ` Honnappa Nagarahalli
  2018-09-21  9:00                 ` Phil Yang (Arm Technology China)
  2018-09-25  4:44                 ` Honnappa Nagarahalli
@ 2018-09-26 11:42                 ` Ferruh Yigit
  2018-09-27  9:06                   ` Phil Yang (Arm Technology China)
  2 siblings, 1 reply; 30+ messages in thread
From: Ferruh Yigit @ 2018-09-26 11:42 UTC (permalink / raw)
  To: Honnappa Nagarahalli, Jerin Jacob
  Cc: Phil Yang (Arm Technology China),
	dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China)

On 9/21/2018 7:37 AM, Honnappa Nagarahalli wrote:
>>>>>>>
>>>>>>> @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo,
>>>>>>> void **data, unsigned num)  static inline uint32_t
>>>>>>> kni_fifo_count(struct rte_kni_fifo *fifo)  {
>>>>>>> +#ifdef RTE_USE_C11_MEM_MODEL
>>>>>>> +       unsigned fifo_write = __atomic_load_n(&fifo->write,
>>>>>>> +                                                 __ATOMIC_ACQUIRE);
>>>>>>> +       unsigned fifo_read = __atomic_load_n(&fifo->read,
>>>>>>> +
>>>>>>> +__ATOMIC_ACQUIRE);
>>>>>>
>>>>>> Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
>>>>>> rte_smp_rmb() would be enough here. Right?
>>>>>> or
>>>>>> Do we need __ATOMIC_ACQUIRE for fifo_write case?
>>>>>>
>>>>> We also had some amount of debate internally on this:
>>>>> 1) We do not want to use rte_smp_rmb() as we want to keep the
>>>>> memory
>>>> models separated (for ex: while using C11, use C11 everywhere). It
>>>> is also not sufficient, please see 3) below.
>>>>
>>>> But Nothing technically wrong in using rte_smp_rmb() here in terms
>>>> functionally and code generated by the compiler.
>>>
>>> rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal.
>> 'LDAR' is a better option which is generated when C11 atomics are used.
>>
>> Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?
> 
> Good point. I am not sure which one is optimal, it needs to be measured. 'DMB ISHLD' orders 'all' earlier loads against 'all' later loads and stores. 'LDAR' orders the 'specific' load with 'all' later loads and stores.
> 
>>
>>>
>>>>
>>>>> 2) This API can get called from writer or reader, so both the
>>>>> loads have to be __ATOMIC_ACQUIRE
>>>>> 3) Other option is to use __ATOMIC_RELAXED. That would allow any
>>>> loads/stores around of this API to get reordered, especially since
>>>> this is an inline function. This would put burden on the application
>>>> to manage the ordering depending on its usage. It will also require
>>>> the application to understand the implementation of this API.
>>>>
>>>> __ATOMIC_RELAXED may be fine too for _count() case as it may not
>>>> very important to get the exact count for the exact very moment,
>>>> Application can retry.
>>>>
>>>> I am in favor of performance effective implementation.
>>>
>>> The requirement on the correctness of the count depends on the usage of
>> this function. I see the following usage:
>>>
>>> In the file kni_net.c, function: kni_net_tx:
>>>
>>>        if (kni_fifo_free_count(kni->tx_q) == 0 ||
>>>                         kni_fifo_count(kni->alloc_q) == 0) {
>>>                 /**
>>>                  * If no free entry in tx_q or no entry in alloc_q,
>>>                  * drops skb and goes out.
>>>                  */
>>>                 goto drop;
>>>         }
>>>
>>> There is no retry here, the packet is dropped.
>>
>> OK. Then pick an implementation which is an optimal this case.
>> I think, then rte_smp_rmb() makes sense here as
>> a) no #ifdef clutter
>> b) it is optimal compared to 2 x LDAR
>>
> As I understand, one of the principals of using C11 model is to match the store releases and load acquires. IMO, combining C11 memory model with barrier based functions makes the code unreadable.
> I realized rte_smp_rmb() is required for x86 as well to prevent compiler reordering. We can add that in the non-C11 case. This way, we will have clean code for both the options (similar to rte_ring).
> So, if 'RTE_USE_C11_MEM_MODEL' is set to 'n', then the 'rte_smp_rmb' would be used.
> 
> We can look at handling the #ifdef clutter based on Ferruh's feedback.

Hi Honnappa, Jerin,

Sorry for delay, I missed that this is waiting my input.

+1 to remove #ifdef, but I don't think a separate file is required for this,
specially when it will be duplication of same implementation, nothing arch
specific implementation.
+1 Honnappa's suggestion to hide ifdef's behind APIs, plus those APIs can be
reused later...

And +1 to split into two patches, one for fix to current code and one for c11
atomic implementation support.

I have some basic questions on the patch, will send in different thread.

Thanks,
ferruh

> 
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> Other than that, I prefer to avoid ifdef clutter by introducing
>>>>>> two separate file just like ring C11 implementation.
>>>>>>
>>>>>> I don't have strong opinion on this this part, I let KNI
>>>>>> MAINTAINER to decide on how to accommodate this change.
>>>>>
>>>>> I prefer to change this as well, I am open for suggestions.
>>>>> Introducing two separate files would be too much for this library.
>>>>> A better
>>>> way would be to have something similar to 'smp_store_release'
>>>> provided by the kernel. i.e. create #defines for loads/stores. Hide
>>>> the clutter behind the #defines.
>>>>
>>>> No Strong opinion on this, leaving to KNI Maintainer.
>>> Will wait on this before re-spinning the patch
>>>
>>>>
>>>> This patch needs to split by two,
>>>> a) Fixes for non C11 implementation(i.e new addition to
>>>> rte_smp_wmb())
>>>> b) add support for C11 implementation.
>>> Agree
>>>
>>>>
>>>>>
>>>>>>
>>>>>>> +       return (fifo->len + fifo_write - fifo_read) &
>>>>>>> +(fifo->len - 1); #else
>>>>>>>         return (fifo->len + fifo->write - fifo->read) &
>>>>>>> (fifo->len
>>>>>>> - 1);
> Requires rte_smp_rmb() for x86 to prevent compiler reordering.
> 
>>>>>>> +#endif
>>>>>>>  }
>>>>>>> --
>>>>>>> 2.7.4
>>>>>>>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
  2018-09-20  8:28     ` Jerin Jacob
@ 2018-09-26 11:45     ` Ferruh Yigit
  2018-10-01  4:52       ` Honnappa Nagarahalli
  1 sibling, 1 reply; 30+ messages in thread
From: Ferruh Yigit @ 2018-09-26 11:45 UTC (permalink / raw)
  To: dev-bounces, dev
  Cc: nd, jerin.jacob, kkokkilagadda, Honnappa.Nagarahalli, Gavin.Hu

On 9/19/2018 2:42 PM, dev-bounces@dpdk.org wrote:
> With existing code in kni_fifo_put, rx_q values are not being updated
> before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> This is causing the sync issue on other core. The same situation happens
> in kni_fifo_get as well.
> 
> So syncing the values by adding C11 atomic memory barriers to make sure
> the values being synced before updating fifo_write and fifo_read.
> 
> Fixes: 3fc5ca2 ("kni: initial import")
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> ---
>  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
>  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> index cfa9448..1fd713b 100644
> --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> @@ -54,8 +54,13 @@ struct rte_kni_request {
>   * Writing should never overwrite the read position
>   */
>  struct rte_kni_fifo {
> +#ifndef RTE_USE_C11_MEM_MODEL
>  	volatile unsigned write;     /**< Next position to be written*/
>  	volatile unsigned read;      /**< Next position to be read */
> +#else
> +	unsigned write;              /**< Next position to be written*/
> +	unsigned read;               /**< Next position to be read */
> +#endif
>  	unsigned len;                /**< Circular buffer length */
>  	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
>  	void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
> index ac26a8c..f4171a1 100644
> --- a/lib/librte_kni/rte_kni_fifo.h
> +++ b/lib/librte_kni/rte_kni_fifo.h
> @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  {
>  	unsigned i = 0;
>  	unsigned fifo_write = fifo->write;
> -	unsigned fifo_read = fifo->read;
>  	unsigned new_write = fifo_write;
> +#ifdef RTE_USE_C11_MEM_MODEL
> +	unsigned fifo_read = __atomic_load_n(&fifo->read,
> +						 __ATOMIC_ACQUIRE);
> +#else
> +	unsigned fifo_read = fifo->read;
> +#endif

Why atomic load preferred against "volatile", won't both end up accessing
memory, is atomic load faster?

>  
>  	for (i = 0; i < num; i++) {
>  		new_write = (new_write + 1) & (fifo->len - 1);
> @@ -39,7 +44,12 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  		fifo->buffer[fifo_write] = data[i];
>  		fifo_write = new_write;
>  	}
> +#ifdef RTE_USE_C11_MEM_MODEL
> +	__atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> +#else
> +	rte_smp_wmb();
>  	fifo->write = fifo_write;
> +#endif

How atomic store guaranties "fifo->buffer[fifo_write] = data[i];" will wait
"fifo->write = fifo_write;"? Is atomic store also behave as write memory barrier?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-26 11:42                 ` Ferruh Yigit
@ 2018-09-27  9:06                   ` Phil Yang (Arm Technology China)
  0 siblings, 0 replies; 30+ messages in thread
From: Phil Yang (Arm Technology China) @ 2018-09-27  9:06 UTC (permalink / raw)
  To: Ferruh Yigit, Honnappa Nagarahalli, Jerin Jacob
  Cc: dev, nd, kkokkilagadda, Gavin Hu (Arm Technology China)

Thanks for your comments.

I'll update it in the next version.

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Wednesday, September 26, 2018 7:43 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Jerin Jacob
> <jerin.jacob@caviumnetworks.com>
> Cc: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; dev@dpdk.org; nd
> <nd@arm.com>; kkokkilagadda@caviumnetworks.com; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>
> Subject: Re: [PATCH v2 2/3] kni: fix kni fifo synchronization
> 
> On 9/21/2018 7:37 AM, Honnappa Nagarahalli wrote:
> >>>>>>>
> >>>>>>> @@ -69,5 +89,13 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void
> >>>>>>> **data, unsigned num)  static inline uint32_t
> >>>>>>> kni_fifo_count(struct rte_kni_fifo *fifo)  {
> >>>>>>> +#ifdef RTE_USE_C11_MEM_MODEL
> >>>>>>> +       unsigned fifo_write = __atomic_load_n(&fifo->write,
> >>>>>>> +                                                 __ATOMIC_ACQUIRE);
> >>>>>>> +       unsigned fifo_read = __atomic_load_n(&fifo->read,
> >>>>>>> +
> >>>>>>> +__ATOMIC_ACQUIRE);
> >>>>>>
> >>>>>> Isn't too  heavy to have two __ATOMIC_ACQUIREs? a simple
> >>>>>> rte_smp_rmb() would be enough here. Right?
> >>>>>> or
> >>>>>> Do we need __ATOMIC_ACQUIRE for fifo_write case?
> >>>>>>
> >>>>> We also had some amount of debate internally on this:
> >>>>> 1) We do not want to use rte_smp_rmb() as we want to keep the
> >>>>> memory
> >>>> models separated (for ex: while using C11, use C11 everywhere). It
> >>>> is also not sufficient, please see 3) below.
> >>>>
> >>>> But Nothing technically wrong in using rte_smp_rmb() here in terms
> >>>> functionally and code generated by the compiler.
> >>>
> >>> rte_smp_rmb() generates 'DMB ISHLD'. This works fine, but it is not optimal.
> >> 'LDAR' is a better option which is generated when C11 atomics are used.
> >>
> >> Yes. But which one is optimal 1 x DMB ISHLD vs 2 x LDAR ?
> >
> > Good point. I am not sure which one is optimal, it needs to be measured. 'DMB
> ISHLD' orders 'all' earlier loads against 'all' later loads and stores. 'LDAR' orders
> the 'specific' load with 'all' later loads and stores.
> >
> >>
> >>>
> >>>>
> >>>>> 2) This API can get called from writer or reader, so both the
> >>>>> loads have to be __ATOMIC_ACQUIRE
> >>>>> 3) Other option is to use __ATOMIC_RELAXED. That would allow any
> >>>> loads/stores around of this API to get reordered, especially since
> >>>> this is an inline function. This would put burden on the
> >>>> application to manage the ordering depending on its usage. It will
> >>>> also require the application to understand the implementation of this API.
> >>>>
> >>>> __ATOMIC_RELAXED may be fine too for _count() case as it may not
> >>>> very important to get the exact count for the exact very moment,
> >>>> Application can retry.
> >>>>
> >>>> I am in favor of performance effective implementation.
> >>>
> >>> The requirement on the correctness of the count depends on the usage
> >>> of
> >> this function. I see the following usage:
> >>>
> >>> In the file kni_net.c, function: kni_net_tx:
> >>>
> >>>        if (kni_fifo_free_count(kni->tx_q) == 0 ||
> >>>                         kni_fifo_count(kni->alloc_q) == 0) {
> >>>                 /**
> >>>                  * If no free entry in tx_q or no entry in alloc_q,
> >>>                  * drops skb and goes out.
> >>>                  */
> >>>                 goto drop;
> >>>         }
> >>>
> >>> There is no retry here, the packet is dropped.
> >>
> >> OK. Then pick an implementation which is an optimal this case.
> >> I think, then rte_smp_rmb() makes sense here as
> >> a) no #ifdef clutter
> >> b) it is optimal compared to 2 x LDAR
> >>
> > As I understand, one of the principals of using C11 model is to match the store
> releases and load acquires. IMO, combining C11 memory model with barrier
> based functions makes the code unreadable.
> > I realized rte_smp_rmb() is required for x86 as well to prevent compiler
> reordering. We can add that in the non-C11 case. This way, we will have clean
> code for both the options (similar to rte_ring).
> > So, if 'RTE_USE_C11_MEM_MODEL' is set to 'n', then the 'rte_smp_rmb' would
> be used.
> >
> > We can look at handling the #ifdef clutter based on Ferruh's feedback.
> 
> Hi Honnappa, Jerin,
> 
> Sorry for delay, I missed that this is waiting my input.
> 
> +1 to remove #ifdef, but I don't think a separate file is required for
> +this,
> specially when it will be duplication of same implementation, nothing arch
> specific implementation.
> +1 Honnappa's suggestion to hide ifdef's behind APIs, plus those APIs
> +can be
> reused later...
> 
> And +1 to split into two patches, one for fix to current code and one for c11
> atomic implementation support.
> 
> I have some basic questions on the patch, will send in different thread.
> 
> Thanks,
> ferruh
> 
> >
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Other than that, I prefer to avoid ifdef clutter by introducing
> >>>>>> two separate file just like ring C11 implementation.
> >>>>>>
> >>>>>> I don't have strong opinion on this this part, I let KNI
> >>>>>> MAINTAINER to decide on how to accommodate this change.
> >>>>>
> >>>>> I prefer to change this as well, I am open for suggestions.
> >>>>> Introducing two separate files would be too much for this library.
> >>>>> A better
> >>>> way would be to have something similar to 'smp_store_release'
> >>>> provided by the kernel. i.e. create #defines for loads/stores. Hide
> >>>> the clutter behind the #defines.
> >>>>
> >>>> No Strong opinion on this, leaving to KNI Maintainer.
> >>> Will wait on this before re-spinning the patch
> >>>
> >>>>
> >>>> This patch needs to split by two,
> >>>> a) Fixes for non C11 implementation(i.e new addition to
> >>>> rte_smp_wmb())
> >>>> b) add support for C11 implementation.
> >>> Agree
> >>>
> >>>>
> >>>>>
> >>>>>>
> >>>>>>> +       return (fifo->len + fifo_write - fifo_read) & (fifo->len
> >>>>>>> +- 1); #else
> >>>>>>>         return (fifo->len + fifo->write - fifo->read) &
> >>>>>>> (fifo->len
> >>>>>>> - 1);
> > Requires rte_smp_rmb() for x86 to prevent compiler reordering.
> >
> >>>>>>> +#endif
> >>>>>>>  }
> >>>>>>> --
> >>>>>>> 2.7.4
> >>>>>>>

Thanks
Phil


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization
  2018-09-26 11:45     ` Ferruh Yigit
@ 2018-10-01  4:52       ` Honnappa Nagarahalli
  0 siblings, 0 replies; 30+ messages in thread
From: Honnappa Nagarahalli @ 2018-10-01  4:52 UTC (permalink / raw)
  To: Ferruh Yigit, dev-bounces, dev
  Cc: nd, jerin.jacob, kkokkilagadda, Gavin Hu (Arm Technology China),
	Honnappa Nagarahalli

> 
> On 9/19/2018 2:42 PM, dev-bounces@dpdk.org wrote:
> > With existing code in kni_fifo_put, rx_q values are not being updated
> > before updating fifo_write. While reading rx_q in kni_net_rx_normal,
> > This is causing the sync issue on other core. The same situation
> > happens in kni_fifo_get as well.
> >
> > So syncing the values by adding C11 atomic memory barriers to make
> > sure the values being synced before updating fifo_write and fifo_read.
> >
> > Fixes: 3fc5ca2 ("kni: initial import")
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > ---
> >  .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 ++++
> >  lib/librte_kni/rte_kni_fifo.h                      | 30 +++++++++++++++++++++-
> >  2 files changed, 34 insertions(+), 1 deletion(-)
> >
> > diff --git
> > a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > index cfa9448..1fd713b 100644
> > --- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > +++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
> > @@ -54,8 +54,13 @@ struct rte_kni_request {
> >   * Writing should never overwrite the read position
> >   */
> >  struct rte_kni_fifo {
> > +#ifndef RTE_USE_C11_MEM_MODEL
> >  	volatile unsigned write;     /**< Next position to be written*/
> >  	volatile unsigned read;      /**< Next position to be read */
> > +#else
> > +	unsigned write;              /**< Next position to be written*/
> > +	unsigned read;               /**< Next position to be read */
> > +#endif
> >  	unsigned len;                /**< Circular buffer length */
> >  	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
> >  	void *volatile buffer[];     /**< The buffer contains mbuf pointers */
> > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..f4171a1 100644
> > --- a/lib/librte_kni/rte_kni_fifo.h
> > +++ b/lib/librte_kni/rte_kni_fifo.h
> > @@ -28,8 +28,13 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > **data, unsigned num)  {
> >  	unsigned i = 0;
> >  	unsigned fifo_write = fifo->write;
> > -	unsigned fifo_read = fifo->read;
> >  	unsigned new_write = fifo_write;
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +	unsigned fifo_read = __atomic_load_n(&fifo->read,
> > +						 __ATOMIC_ACQUIRE);
> > +#else
> > +	unsigned fifo_read = fifo->read;
> > +#endif
> 
> Why atomic load preferred against "volatile", won't both end up accessing
> memory, is atomic load faster?
> 
My understanding is that with the introduction of C11 atomics, 'volatile' was recommended to be used for memory-mapped I/O locations only. Hence, we removed the 'volatile' for the variables while using C11 (keeping it does not hurt either). However, this also means that every load and store of the variable has to be done using the C11 atomics including relaxed loads.

The '__atomic_load_n' above is providing the memory ordering which the normal load will not provide. For relaxed memory ordered architectures like Arm, the ordering needs to be done explicitly to provide correct functionality.

> >
> >  	for (i = 0; i < num; i++) {
> >  		new_write = (new_write + 1) & (fifo->len - 1); @@ -39,7
> +44,12 @@
> > kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
> >  		fifo->buffer[fifo_write] = data[i];
> >  		fifo_write = new_write;
> >  	}
> > +#ifdef RTE_USE_C11_MEM_MODEL
> > +	__atomic_store_n(&fifo->write, fifo_write, __ATOMIC_RELEASE);
> #else
> > +	rte_smp_wmb();
> >  	fifo->write = fifo_write;
> > +#endif
> 
> How atomic store guaranties "fifo->buffer[fifo_write] = data[i];" will wait
> "fifo->write = fifo_write;"? Is atomic store also behave as write memory
> barrier?
__atomic_store_n with __ATOMIC_RELEASE will prevent memory reordering of fifo->write with any preceding loads or stores. This is called one-way barrier providing load-store and store-store fence [1].

[1] https://preshing.com/20120913/acquire-and-release-semantics/

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model
  2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
                     ` (2 preceding siblings ...)
  2018-09-20  8:21   ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Jerin Jacob
@ 2018-10-08  9:11   ` Phil Yang
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization Phil Yang
                       ` (3 more replies)
  3 siblings, 4 replies; 30+ messages in thread
From: Phil Yang @ 2018-10-08  9:11 UTC (permalink / raw)
  To: dev
  Cc: jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl,
	ferruh.yigit, phil.yang

Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
model, so all modules can leverage C11 atomic extension by enable this
option.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

---
 config/arm/meson.build                       | 2 +-
 config/common_armv8a_linuxapp                | 2 +-
 config/common_base                           | 2 +-
 config/defconfig_arm64-thunderx-linuxapp-gcc | 2 +-
 lib/librte_ring/rte_ring.h                   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 94cca49..4b23b39 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -53,7 +53,7 @@ flags_cavium = [
 	['RTE_MAX_NUMA_NODES', 2],
 	['RTE_MAX_LCORE', 96],
 	['RTE_MAX_VFIO_GROUPS', 128],
-	['RTE_RING_USE_C11_MEM_MODEL', false]]
+	['RTE_USE_C11_MEM_MODEL', false]]
 flags_dpaa = [
 	['RTE_MACHINE', '"dpaa"'],
 	['RTE_CACHE_LINE_SIZE', 64],
diff --git a/config/common_armv8a_linuxapp b/config/common_armv8a_linuxapp
index 111c005..54e6987 100644
--- a/config/common_armv8a_linuxapp
+++ b/config/common_armv8a_linuxapp
@@ -29,7 +29,7 @@ CONFIG_RTE_ARCH_ARM64_MEMCPY=n
 #CONFIG_RTE_ARM64_MEMCPY_ALIGN_MASK=0xF
 #CONFIG_RTE_ARM64_MEMCPY_STRICT_ALIGN=n
 
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+CONFIG_RTE_USE_C11_MEM_MODEL=y
 
 CONFIG_RTE_LIBRTE_FM10K_PMD=n
 CONFIG_RTE_LIBRTE_SFC_EFX_PMD=n
diff --git a/config/common_base b/config/common_base
index 155c7d4..ccd2670 100644
--- a/config/common_base
+++ b/config/common_base
@@ -661,7 +661,7 @@ CONFIG_RTE_LIBRTE_PMD_IFPGA_RAWDEV=y
 # Compile librte_ring
 #
 CONFIG_RTE_LIBRTE_RING=y
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile librte_mempool
diff --git a/config/defconfig_arm64-thunderx-linuxapp-gcc b/config/defconfig_arm64-thunderx-linuxapp-gcc
index 2bed66c..f11e758 100644
--- a/config/defconfig_arm64-thunderx-linuxapp-gcc
+++ b/config/defconfig_arm64-thunderx-linuxapp-gcc
@@ -10,7 +10,7 @@ CONFIG_RTE_CACHE_LINE_SIZE=128
 CONFIG_RTE_MAX_NUMA_NODES=2
 CONFIG_RTE_MAX_LCORE=96
 CONFIG_RTE_MAX_VFIO_GROUPS=128
-CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
+CONFIG_RTE_USE_C11_MEM_MODEL=n
 
 #
 # Compile PMD for octeontx sso event device
diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
index 7a731d0..af5444a 100644
--- a/lib/librte_ring/rte_ring.h
+++ b/lib/librte_ring/rte_ring.h
@@ -303,11 +303,11 @@ void rte_ring_dump(FILE *f, const struct rte_ring *r);
  * There are 2 choices for the users
  * 1.use rmb() memory barrier
  * 2.use one-direcion load_acquire/store_release barrier,defined by
- * CONFIG_RTE_RING_USE_C11_MEM_MODEL=y
+ * CONFIG_RTE_USE_C11_MEM_MODEL=y
  * It depends on performance test results.
  * By default, move common functions to rte_ring_generic.h
  */
-#ifdef RTE_RING_USE_C11_MEM_MODEL
+#ifdef RTE_USE_C11_MEM_MODEL
 #include "rte_ring_c11_mem.h"
 #else
 #include "rte_ring_generic.h"
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
@ 2018-10-08  9:11     ` Phil Yang
  2018-10-08 21:53       ` Stephen Hemminger
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 3/4] kni: fix kni kernel " Phil Yang
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 30+ messages in thread
From: Phil Yang @ 2018-10-08  9:11 UTC (permalink / raw)
  To: dev
  Cc: jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl,
	ferruh.yigit, phil.yang

With existing code in kni_fifo_put, rx_q values are not being updated
before updating fifo_write. While reading rx_q in kni_net_rx_normal,
This is causing the sync issue on other core. The same situation happens
in kni_fifo_get as well.

So syncing the values by adding memory barriers to make sure the values
being synced before updating fifo_write and fifo_read.

Fixes: 3fc5ca2 ("kni: initial import")

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Reviewed-by: Ola Liljedahl <Ola.Liljedahl@arm.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

---
 lib/librte_kni/rte_kni_fifo.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index ac26a8c..70ac14e 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -28,8 +28,9 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned fifo_write = fifo->write;
-	unsigned fifo_read = fifo->read;
 	unsigned new_write = fifo_write;
+	rte_smp_rmb();
+	unsigned fifo_read = fifo->read;
 
 	for (i = 0; i < num; i++) {
 		new_write = (new_write + 1) & (fifo->len - 1);
@@ -39,6 +40,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
+	rte_smp_wmb();
 	fifo->write = fifo_write;
 	return i;
 }
@@ -51,7 +53,9 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned new_read = fifo->read;
+	rte_smp_rmb();
 	unsigned fifo_write = fifo->write;
+
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
 			break;
@@ -59,6 +63,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
+	rte_smp_rmb();
 	fifo->read = new_read;
 	return i;
 }
@@ -69,5 +74,8 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+	unsigned fifo_write = fifo->write;
+	rte_smp_rmb();
+	unsigned fifo_read = fifo->read;
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v3 3/4] kni: fix kni kernel fifo synchronization
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization Phil Yang
@ 2018-10-08  9:11     ` Phil Yang
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 4/4] kni: introduce c11 atomic into kni " Phil Yang
  2018-10-10 14:48     ` [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model Ferruh Yigit
  3 siblings, 0 replies; 30+ messages in thread
From: Phil Yang @ 2018-10-08  9:11 UTC (permalink / raw)
  To: dev
  Cc: jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl,
	ferruh.yigit, phil.yang

Adding memory barrier to make sure the values being synced
before updating fifo_write in kni_fifo_put and fifo_read in
kni_fifo_get.

Fixes: 3fc5ca2 ("kni: initial import")

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
---
 kernel/linux/kni/kni_fifo.h                              | 16 ++++++++++------
 .../linuxapp/eal/include/exec-env/rte_kni_common.h       |  1 +
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/linux/kni/kni_fifo.h b/kernel/linux/kni/kni_fifo.h
index 9a4762d..2cb3a4a 100644
--- a/kernel/linux/kni/kni_fifo.h
+++ b/kernel/linux/kni/kni_fifo.h
@@ -16,7 +16,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t fifo_write = fifo->write;
-	uint32_t fifo_read = fifo->read;
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
 	uint32_t new_write = fifo_write;
 
 	for (i = 0; i < num; i++) {
@@ -27,7 +27,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
-	fifo->write = fifo_write;
+	smp_store_release(&fifo->write, fifo_write);
 
 	return i;
 }
@@ -40,7 +40,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 {
 	uint32_t i = 0;
 	uint32_t new_read = fifo->read;
-	uint32_t fifo_write = fifo->write;
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
 
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
@@ -49,7 +49,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
-	fifo->read = new_read;
+	smp_store_release(&fifo->read, new_read);
 
 	return i;
 }
@@ -60,7 +60,9 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, uint32_t num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->len + fifo->write - fifo->read) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
 }
 
 /**
@@ -69,7 +71,9 @@ kni_fifo_count(struct rte_kni_fifo *fifo)
 static inline uint32_t
 kni_fifo_free_count(struct rte_kni_fifo *fifo)
 {
-	return (fifo->read - fifo->write - 1) & (fifo->len - 1);
+	uint32_t fifo_write = smp_load_acquire(&fifo->write);
+	uint32_t fifo_read = smp_load_acquire(&fifo->read);
+	return (fifo_read - fifo_write - 1) & (fifo->len - 1);
 }
 
 #endif /* _KNI_FIFO_H_ */
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index cfa9448..58e8533 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -8,6 +8,7 @@
 
 #ifdef __KERNEL__
 #include <linux/if.h>
+#include <asm/barrier.h>
 #define RTE_STD_C11
 #else
 #include <rte_common.h>
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [dpdk-dev] [PATCH v3 4/4] kni: introduce c11 atomic into kni fifo synchronization
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization Phil Yang
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 3/4] kni: fix kni kernel " Phil Yang
@ 2018-10-08  9:11     ` Phil Yang
  2018-10-10 14:48     ` [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model Ferruh Yigit
  3 siblings, 0 replies; 30+ messages in thread
From: Phil Yang @ 2018-10-08  9:11 UTC (permalink / raw)
  To: dev
  Cc: jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl,
	ferruh.yigit, phil.yang

Syncing the values by adding c11 atomic memory barriers to make sure
the values being synced before updating fifo_write and fifo_read.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
Reviewed-by: Ola Liljedahl <Ola.Liljedahl@arm.com>
---
 .../linuxapp/eal/include/exec-env/rte_kni_common.h |  5 +++
 lib/librte_kni/rte_kni_fifo.h                      | 47 +++++++++++++++++-----
 2 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index 58e8533..5afa087 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -55,8 +55,13 @@ struct rte_kni_request {
  * Writing should never overwrite the read position
  */
 struct rte_kni_fifo {
+#ifdef RTE_USE_C11_MEM_MODEL
+	unsigned write;              /**< Next position to be written*/
+	unsigned read;               /**< Next position to be read */
+#else
 	volatile unsigned write;     /**< Next position to be written*/
 	volatile unsigned read;      /**< Next position to be read */
+#endif
 	unsigned len;                /**< Circular buffer length */
 	unsigned elem_size;          /**< Pointer size - for 32/64 bit OS */
 	void *volatile buffer[];     /**< The buffer contains mbuf pointers */
diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
index 70ac14e..83d31b6 100644
--- a/lib/librte_kni/rte_kni_fifo.h
+++ b/lib/librte_kni/rte_kni_fifo.h
@@ -5,6 +5,36 @@
 
 
 /**
+ * @internal when c11 memory model enabled use c11 atomic memory barrier.
+ * when under non c11 memory model use rte_smp_* memory barrier.
+ *
+ * @param src
+ *   Pointer to the source data.
+ * @param dst
+ *   Pointer to the destination data.
+ * @param value
+ *   Data value.
+ */
+#ifdef RTE_USE_C11_MEM_MODEL
+#define __KNI_LOAD_ACQUIRE(src) ({                         \
+		__atomic_load_n((src), __ATOMIC_ACQUIRE);           \
+	})
+#define __KNI_STORE_RELEASE(dst, value) do {               \
+		__atomic_store_n((dst), value, __ATOMIC_RELEASE);   \
+	} while(0)
+#else
+#define __KNI_LOAD_ACQUIRE(src) ({                         \
+		typeof (*(src)) val = *(src);                       \
+		rte_smp_rmb();                                      \
+		val;                                                \
+	})
+#define __KNI_STORE_RELEASE(dst, value) do {               \
+		*(dst) = value;                                     \
+		rte_smp_wmb();                                      \
+	} while(0)
+#endif
+
+/**
  * Initializes the kni fifo structure
  */
 static void
@@ -29,8 +59,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 	unsigned i = 0;
 	unsigned fifo_write = fifo->write;
 	unsigned new_write = fifo_write;
-	rte_smp_rmb();
-	unsigned fifo_read = fifo->read;
+	unsigned fifo_read = __KNI_LOAD_ACQUIRE(&fifo->read);
 
 	for (i = 0; i < num; i++) {
 		new_write = (new_write + 1) & (fifo->len - 1);
@@ -40,8 +69,7 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		fifo->buffer[fifo_write] = data[i];
 		fifo_write = new_write;
 	}
-	rte_smp_wmb();
-	fifo->write = fifo_write;
+	__KNI_STORE_RELEASE(&fifo->write, fifo_write);
 	return i;
 }
 
@@ -53,8 +81,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 {
 	unsigned i = 0;
 	unsigned new_read = fifo->read;
-	rte_smp_rmb();
-	unsigned fifo_write = fifo->write;
+	unsigned fifo_write = __KNI_LOAD_ACQUIRE(&fifo->write);
 
 	for (i = 0; i < num; i++) {
 		if (new_read == fifo_write)
@@ -63,8 +90,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 		data[i] = fifo->buffer[new_read];
 		new_read = (new_read + 1) & (fifo->len - 1);
 	}
-	rte_smp_rmb();
-	fifo->read = new_read;
+	__KNI_STORE_RELEASE(&fifo->read, new_read);
 	return i;
 }
 
@@ -74,8 +100,7 @@ kni_fifo_get(struct rte_kni_fifo *fifo, void **data, unsigned num)
 static inline uint32_t
 kni_fifo_count(struct rte_kni_fifo *fifo)
 {
-	unsigned fifo_write = fifo->write;
-	rte_smp_rmb();
-	unsigned fifo_read = fifo->read;
+	unsigned fifo_write = __KNI_LOAD_ACQUIRE(&fifo->write);
+	unsigned fifo_read = __KNI_LOAD_ACQUIRE(&fifo->read);
 	return (fifo->len + fifo_write - fifo_read) & (fifo->len - 1);
 }
-- 
2.7.4

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization Phil Yang
@ 2018-10-08 21:53       ` Stephen Hemminger
  2018-10-10  9:58         ` Phil Yang (Arm Technology China)
  0 siblings, 1 reply; 30+ messages in thread
From: Stephen Hemminger @ 2018-10-08 21:53 UTC (permalink / raw)
  To: Phil Yang
  Cc: dev, jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl,
	ferruh.yigit

On Mon,  8 Oct 2018 17:11:44 +0800
Phil Yang <phil.yang@arm.com> wrote:

> diff --git a/lib/librte_kni/rte_kni_fifo.h b/lib/librte_kni/rte_kni_fifo.h
> index ac26a8c..70ac14e 100644
> --- a/lib/librte_kni/rte_kni_fifo.h
> +++ b/lib/librte_kni/rte_kni_fifo.h
> @@ -28,8 +28,9 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
>  {
>  	unsigned i = 0;
>  	unsigned fifo_write = fifo->write;
> -	unsigned fifo_read = fifo->read;
>  	unsigned new_write = fifo_write;
> +	rte_smp_rmb();
> +	unsigned fifo_read = fifo->read;
>  

The patch makes sense, but this function should be changed to match kernel code style.
That means no declarations after initial block, and use 'unsigned int' rather than 'unsigned'

Also. why is i initialized? Best practice now is to not do gratitious initialization
since it defeats compiler checks for accidental  use of uninitialized variables.

What makes sense is something like:

kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num)
{
	unsigned int i, fifo_read, fifo_write, new_write;

	fifo_write = fifo->write;
	new_write = fifo_write;
	rte_smb_rmb();
	fifo_read = fifo->read;

Sorry, blaming you for issues which are inherited from original KNI code.
Maybe someone should run kernel checkpatch (not DPDK checkpatch) on it and fix those.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
  2018-10-08 21:53       ` Stephen Hemminger
@ 2018-10-10  9:58         ` Phil Yang (Arm Technology China)
  2018-10-10 10:06           ` Gavin Hu (Arm Technology China)
  0 siblings, 1 reply; 30+ messages in thread
From: Phil Yang (Arm Technology China) @ 2018-10-10  9:58 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, jerin.jacob, Gavin Hu (Arm Technology China),
	Honnappa Nagarahalli, Ola Liljedahl, ferruh.yigit

Hi Hemminger,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, October 9, 2018 5:53 AM
> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
> Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>;
> ferruh.yigit@intel.com
> Subject: Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
>
> On Mon,  8 Oct 2018 17:11:44 +0800
> Phil Yang <phil.yang@arm.com> wrote:
>
> > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..70ac14e 100644
> > --- a/lib/librte_kni/rte_kni_fifo.h
> > +++ b/lib/librte_kni/rte_kni_fifo.h
> > @@ -28,8 +28,9 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void **data,
> > unsigned num)  {
> >  unsigned i = 0;
> >  unsigned fifo_write = fifo->write;
> > -unsigned fifo_read = fifo->read;
> >  unsigned new_write = fifo_write;
> > +rte_smp_rmb();
> > +unsigned fifo_read = fifo->read;
> >
>
> The patch makes sense, but this function should be changed to match kernel
> code style.
> That means no declarations after initial block, and use 'unsigned int' rather than
> 'unsigned'
>
> Also. why is i initialized? Best practice now is to not do gratitious initialization
> since it defeats compiler checks for accidental  use of uninitialized variables.
>
> What makes sense is something like:
>
> kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num) {
> unsigned int i, fifo_read, fifo_write, new_write;
>
> fifo_write = fifo->write;
> new_write = fifo_write;
> rte_smb_rmb();
> fifo_read = fifo->read;
>
> Sorry, blaming you for issues which are inherited from original KNI code.
> Maybe someone should run kernel checkpatch (not DPDK checkpatch) on it and
> fix those.

Thanks for your comment.

I think I can submit a new separate patch to fix this historical issue.

Thanks,
Phil Yang
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
  2018-10-10  9:58         ` Phil Yang (Arm Technology China)
@ 2018-10-10 10:06           ` Gavin Hu (Arm Technology China)
  2018-10-10 14:42             ` Ferruh Yigit
  0 siblings, 1 reply; 30+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2018-10-10 10:06 UTC (permalink / raw)
  To: Phil Yang (Arm Technology China), Stephen Hemminger
  Cc: dev, jerin.jacob, Honnappa Nagarahalli, Ola Liljedahl, ferruh.yigit



> -----Original Message-----
> From: Phil Yang (Arm Technology China)
> Sent: Wednesday, October 10, 2018 5:59 PM
> To: Stephen Hemminger <stephen@networkplumber.org>
> Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>;
> ferruh.yigit@intel.com
> Subject: RE: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
>
> Hi Hemminger,
>
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Tuesday, October 9, 2018 5:53 AM
> > To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
> > Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; Gavin Hu (Arm
> > Technology China) <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>;
> > ferruh.yigit@intel.com
> > Subject: Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo
> > synchronization
> >
> > On Mon,  8 Oct 2018 17:11:44 +0800
> > Phil Yang <phil.yang@arm.com> wrote:
> >
> > > diff --git a/lib/librte_kni/rte_kni_fifo.h
> > > b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..70ac14e 100644
> > > --- a/lib/librte_kni/rte_kni_fifo.h
> > > +++ b/lib/librte_kni/rte_kni_fifo.h
> > > @@ -28,8 +28,9 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
> > > **data, unsigned num)  {
> > >  unsigned i = 0;
> > >  unsigned fifo_write = fifo->write;
> > > -unsigned fifo_read = fifo->read;
> > >  unsigned new_write = fifo_write;
> > > +rte_smp_rmb();
> > > +unsigned fifo_read = fifo->read;
> > >
> >
> > The patch makes sense, but this function should be changed to match
> > kernel code style.
> > That means no declarations after initial block, and use 'unsigned int'
> > rather than 'unsigned'
> >
> > Also. why is i initialized? Best practice now is to not do gratitious
> > initialization since it defeats compiler checks for accidental  use of
> uninitialized variables.
> >
> > What makes sense is something like:
> >
> > kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num) {
> > unsigned int i, fifo_read, fifo_write, new_write;
> >
> > fifo_write = fifo->write;
> > new_write = fifo_write;
> > rte_smb_rmb();
> > fifo_read = fifo->read;
> >
> > Sorry, blaming you for issues which are inherited from original KNI code.
> > Maybe someone should run kernel checkpatch (not DPDK checkpatch) on it
> > and fix those.
>
> Thanks for your comment.
>
> I think I can submit a new separate patch to fix this historical issue.
>
> Thanks,
> Phil Yang

I advised a separate patch to make this patch to the point and clean.

-Gavin
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
  2018-10-10 10:06           ` Gavin Hu (Arm Technology China)
@ 2018-10-10 14:42             ` Ferruh Yigit
  0 siblings, 0 replies; 30+ messages in thread
From: Ferruh Yigit @ 2018-10-10 14:42 UTC (permalink / raw)
  To: Gavin Hu (Arm Technology China), Phil Yang (Arm Technology China),
	Stephen Hemminger
  Cc: dev, jerin.jacob, Honnappa Nagarahalli, Ola Liljedahl

On 10/10/2018 11:06 AM, Gavin Hu (Arm Technology China) wrote:
> 
> 
>> -----Original Message-----
>> From: Phil Yang (Arm Technology China)
>> Sent: Wednesday, October 10, 2018 5:59 PM
>> To: Stephen Hemminger <stephen@networkplumber.org>
>> Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; Gavin Hu (Arm
>> Technology China) <Gavin.Hu@arm.com>; Honnappa Nagarahalli
>> <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>;
>> ferruh.yigit@intel.com
>> Subject: RE: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization
>>
>> Hi Hemminger,
>>
>>> -----Original Message-----
>>> From: Stephen Hemminger <stephen@networkplumber.org>
>>> Sent: Tuesday, October 9, 2018 5:53 AM
>>> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
>>> Cc: dev@dpdk.org; jerin.jacob@caviumnetworks.com; Gavin Hu (Arm
>>> Technology China) <Gavin.Hu@arm.com>; Honnappa Nagarahalli
>>> <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>;
>>> ferruh.yigit@intel.com
>>> Subject: Re: [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo
>>> synchronization
>>>
>>> On Mon,  8 Oct 2018 17:11:44 +0800
>>> Phil Yang <phil.yang@arm.com> wrote:
>>>
>>>> diff --git a/lib/librte_kni/rte_kni_fifo.h
>>>> b/lib/librte_kni/rte_kni_fifo.h index ac26a8c..70ac14e 100644
>>>> --- a/lib/librte_kni/rte_kni_fifo.h
>>>> +++ b/lib/librte_kni/rte_kni_fifo.h
>>>> @@ -28,8 +28,9 @@ kni_fifo_put(struct rte_kni_fifo *fifo, void
>>>> **data, unsigned num)  {
>>>>  unsigned i = 0;
>>>>  unsigned fifo_write = fifo->write;
>>>> -unsigned fifo_read = fifo->read;
>>>>  unsigned new_write = fifo_write;
>>>> +rte_smp_rmb();
>>>> +unsigned fifo_read = fifo->read;
>>>>
>>>
>>> The patch makes sense, but this function should be changed to match
>>> kernel code style.
>>> That means no declarations after initial block, and use 'unsigned int'
>>> rather than 'unsigned'
>>>
>>> Also. why is i initialized? Best practice now is to not do gratitious
>>> initialization since it defeats compiler checks for accidental  use of
>> uninitialized variables.
>>>
>>> What makes sense is something like:
>>>
>>> kni_fifo_put(struct rte_kni_fifo *fifo, void **data, unsigned num) {
>>> unsigned int i, fifo_read, fifo_write, new_write;
>>>
>>> fifo_write = fifo->write;
>>> new_write = fifo_write;
>>> rte_smb_rmb();
>>> fifo_read = fifo->read;
>>>
>>> Sorry, blaming you for issues which are inherited from original KNI code.
>>> Maybe someone should run kernel checkpatch (not DPDK checkpatch) on it
>>> and fix those.
>>
>> Thanks for your comment.
>>
>> I think I can submit a new separate patch to fix this historical issue.
>>
>> Thanks,
>> Phil Yang
> 
> I advised a separate patch to make this patch to the point and clean.

+1 to separate patch for clean up.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model
  2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
                       ` (2 preceding siblings ...)
  2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 4/4] kni: introduce c11 atomic into kni " Phil Yang
@ 2018-10-10 14:48     ` Ferruh Yigit
  2018-10-12  9:17       ` Phil Yang (Arm Technology China)
  2018-10-26 15:56       ` Thomas Monjalon
  3 siblings, 2 replies; 30+ messages in thread
From: Ferruh Yigit @ 2018-10-10 14:48 UTC (permalink / raw)
  To: phil.yang, dev; +Cc: jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli, Ola.Liljedahl

On 10/8/2018 10:11 AM, phil.yang@arm.com wrote:
> Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
> model, so all modules can leverage C11 atomic extension by enable this
> option.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>

For series,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Are you planning to send a new version to include the clean up, or get this one
first an do the cleanup on top of it? I think both are OK but please let us know.
And if there will be a new version with cleanup, please keep the review tag.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model
  2018-10-10 14:48     ` [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model Ferruh Yigit
@ 2018-10-12  9:17       ` Phil Yang (Arm Technology China)
  2018-10-26 15:56       ` Thomas Monjalon
  1 sibling, 0 replies; 30+ messages in thread
From: Phil Yang (Arm Technology China) @ 2018-10-12  9:17 UTC (permalink / raw)
  To: Ferruh Yigit, dev
  Cc: jerin.jacob, Gavin Hu (Arm Technology China),
	Honnappa Nagarahalli, Ola Liljedahl

> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@intel.com>
> Sent: Wednesday, October 10, 2018 10:48 PM
> To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>; dev@dpdk.org
> Cc: jerin.jacob@caviumnetworks.com; Gavin Hu (Arm Technology China)
> <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ola Liljedahl <Ola.Liljedahl@arm.com>
> Subject: Re: [PATCH v3 1/4] config: use one single config option for C11 memory
> model
>
> On 10/8/2018 10:11 AM, phil.yang@arm.com wrote:
> > Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
> > model, so all modules can leverage C11 atomic extension by enable this
> > option.
> >
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
>
> For series,
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
>
> Are you planning to send a new version to include the clean up, or get this one
> first an do the cleanup on top of it? I think both are OK but please let us know.
> And if there will be a new version with cleanup, please keep the review tag.

Hi Ferruh,

Thanks for your remind.

I think it is better to merge this series first. Because the cleanup patch won't block any functionality.
I will do it on top of this series.

Thanks
Phil Yang
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model
  2018-10-10 14:48     ` [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model Ferruh Yigit
  2018-10-12  9:17       ` Phil Yang (Arm Technology China)
@ 2018-10-26 15:56       ` Thomas Monjalon
  1 sibling, 0 replies; 30+ messages in thread
From: Thomas Monjalon @ 2018-10-26 15:56 UTC (permalink / raw)
  To: phil.yang
  Cc: dev, Ferruh Yigit, jerin.jacob, Gavin.Hu, Honnappa.Nagarahalli,
	Ola.Liljedahl

10/10/2018 16:48, Ferruh Yigit:
> On 10/8/2018 10:11 AM, phil.yang@arm.com wrote:
> > Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
> > model, so all modules can leverage C11 atomic extension by enable this
> > option.
> > 
> > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <Gavin.Hu@arm.com>
> > Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
> 
> For series,
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

Series applied, thanks

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-10-26 15:56 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-19 13:30 [dpdk-dev] [PATCH 1/3] config: use one single config option for C11 memory model Phil Yang
2018-09-19 13:30 ` [dpdk-dev] [PATCH 2/3] kni: fix kni fifo synchronization Phil Yang
2018-09-19 13:30 ` [dpdk-dev] [PATCH 3/3] kni: fix kni kernel " Phil Yang
2018-09-19 13:42 ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Phil Yang
2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 2/3] kni: fix kni fifo synchronization Phil Yang
2018-09-20  8:28     ` Jerin Jacob
2018-09-20 15:20       ` Honnappa Nagarahalli
2018-09-20 15:37         ` Jerin Jacob
2018-09-21  5:48           ` Honnappa Nagarahalli
2018-09-21  5:55             ` Jerin Jacob
2018-09-21  6:37               ` Honnappa Nagarahalli
2018-09-21  9:00                 ` Phil Yang (Arm Technology China)
2018-09-25  4:44                 ` Honnappa Nagarahalli
2018-09-26 11:42                 ` Ferruh Yigit
2018-09-27  9:06                   ` Phil Yang (Arm Technology China)
2018-09-26 11:45     ` Ferruh Yigit
2018-10-01  4:52       ` Honnappa Nagarahalli
2018-09-19 13:42   ` [dpdk-dev] [PATCH v2 3/3] kni: fix kni kernel " Phil Yang
2018-09-20  8:21   ` [dpdk-dev] [PATCH v2 1/3] config: use one single config option for C11 memory model Jerin Jacob
2018-10-08  9:11   ` [dpdk-dev] [PATCH v3 1/4] " Phil Yang
2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 2/4] kni: fix kni fifo synchronization Phil Yang
2018-10-08 21:53       ` Stephen Hemminger
2018-10-10  9:58         ` Phil Yang (Arm Technology China)
2018-10-10 10:06           ` Gavin Hu (Arm Technology China)
2018-10-10 14:42             ` Ferruh Yigit
2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 3/4] kni: fix kni kernel " Phil Yang
2018-10-08  9:11     ` [dpdk-dev] [PATCH v3 4/4] kni: introduce c11 atomic into kni " Phil Yang
2018-10-10 14:48     ` [dpdk-dev] [PATCH v3 1/4] config: use one single config option for C11 memory model Ferruh Yigit
2018-10-12  9:17       ` Phil Yang (Arm Technology China)
2018-10-26 15:56       ` Thomas Monjalon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).