* [PATCH 1/2] net/txgbe: add proper memory barriers in Rx
@ 2023-10-30 10:51 Jiawen Wu
2023-10-30 10:51 ` [PATCH 2/2] net/ngbe: " Jiawen Wu
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Jiawen Wu @ 2023-10-30 10:51 UTC (permalink / raw)
To: dev; +Cc: Jiawen Wu, stable

Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
Rx"). Fix the same issue as ixgbe.

A segmentation fault has been observed while running the
txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
processor. It is caused by out-of-order execution on the CPU. Add a
proper memory barrier to ensure that the reads are correctly ordered.

Do the same in the txgbe_recv_pkts() function to make sure the rxd data
is valid, even though no segmentation fault was observed in that
function.

Fixes: 0e484278c85f ("net/txgbe: support Rx")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/txgbe/txgbe_rxtx.c | 47 +++++++++++++++-------------------
1 file changed, 21 insertions(+), 26 deletions(-)
diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
index 834ada886a..24fc34d3c4 100644
--- a/drivers/net/txgbe/txgbe_rxtx.c
+++ b/drivers/net/txgbe/txgbe_rxtx.c
@@ -1476,11 +1476,22 @@ txgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
* of accesses cannot be reordered by the compiler. If they were
* not volatile, they could be reordered which could lead to
* using invalid descriptor fields when read from rxd.
+ *
+ * Meanwhile, to prevent the CPU from executing out of order, we
+ * need to use a proper memory barrier to ensure the memory
+ * ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rxdp->qw1.lo.status;
if (!(staterr & rte_cpu_to_le_32(TXGBE_RXD_STAT_DD)))
break;
+
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
/*
@@ -1726,32 +1737,10 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
next_desc:
/*
- * The code in this whole file uses the volatile pointer to
- * ensure the read ordering of the status and the rest of the
- * descriptor fields (on the compiler level only!!!). This is so
- * UGLY - why not to just use the compiler barrier instead? DPDK
- * even has the rte_compiler_barrier() for that.
- *
- * But most importantly this is just wrong because this doesn't
- * ensure memory ordering in a general case at all. For
- * instance, DPDK is supposed to work on Power CPUs where
- * compiler barrier may just not be enough!
- *
- * I tried to write only this function properly to have a
- * starting point (as a part of an LRO/RSC series) but the
- * compiler cursed at me when I tried to cast away the
- * "volatile" from rx_ring (yes, it's volatile too!!!). So, I'm
- * keeping it the way it is for now.
- *
- * The code in this file is broken in so many other places and
- * will just not work on a big endian CPU anyway therefore the
- * lines below will have to be revisited together with the rest
- * of the txgbe PMD.
- *
- * TODO:
- * - Get rid of "volatile" and let the compiler do its job.
- * - Use the proper memory barrier (rte_rmb()) to ensure the
- * memory ordering below.
+ * "Volatile" only prevents caching of the variable marked
+ * volatile. Most important, "volatile" cannot prevent the CPU
+ * from executing out of order. So, it is necessary to use a
+ * proper memory barrier to ensure the memory ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rte_le_to_cpu_32(rxdp->qw1.lo.status);
@@ -1759,6 +1748,12 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & TXGBE_RXD_STAT_DD))
break;
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.27.0
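
For readers following the thread, the ordering requirement described in the patch comments (load the status word carrying the DD bit first, then the remaining descriptor words) can be sketched outside DPDK with plain C11 atomics. This is only an illustration under stated assumptions: `struct demo_desc`, `DEMO_STAT_DD`, `demo_publish()` and `demo_try_read()` are hypothetical names, and a release store stands in for the NIC's DMA write, which real hardware does not perform this way.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_STAT_DD 0x1u /* hypothetical "descriptor done" bit */

/* Hypothetical descriptor: payload words plus a status word. */
struct demo_desc {
	uint32_t length;
	uint32_t hash;
	_Atomic uint32_t status;
};

/* Producer (stands in for the device): write payload, then publish DD. */
static void demo_publish(struct demo_desc *d, uint32_t length, uint32_t hash)
{
	d->length = length;
	d->hash = hash;
	atomic_store_explicit(&d->status, DEMO_STAT_DD, memory_order_release);
}

/* Consumer: check DD, fence, and only then read the other words. */
static bool demo_try_read(struct demo_desc *d, uint32_t *length, uint32_t *hash)
{
	uint32_t staterr = atomic_load_explicit(&d->status, memory_order_relaxed);

	if (!(staterr & DEMO_STAT_DD))
		return false;

	/*
	 * Acquire fence: the payload loads below may not be reordered
	 * before the status load above.
	 */
	atomic_thread_fence(memory_order_acquire);

	*length = d->length;
	*hash = d->hash;
	return true;
}
```

Without the fence, a weakly ordered CPU would be free to satisfy the payload loads before the status load, so the consumer could see DD set together with stale payload words.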
* [PATCH 2/2] net/ngbe: add proper memory barriers in Rx
2023-10-30 10:51 [PATCH 1/2] net/txgbe: add proper memory barriers in Rx Jiawen Wu
@ 2023-10-30 10:51 ` Jiawen Wu
2023-10-31 12:17 ` [PATCH 1/2] net/txgbe: " Ferruh Yigit
2023-11-01 3:32 ` [PATCH v2 " Jiawen Wu
2 siblings, 0 replies; 6+ messages in thread
From: Jiawen Wu @ 2023-10-30 10:51 UTC (permalink / raw)
To: dev; +Cc: Jiawen Wu, stable

Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
Rx"). Fix the same issue as ixgbe.

Although, due to the testing schedule, current testing has not found
this problem in ngbe, apply the same fix there to ensure that the reads
are correctly ordered.

Fixes: 79f3128d4d98 ("net/ngbe: support scattered Rx")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/ngbe/ngbe_rxtx.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/net/ngbe/ngbe_rxtx.c b/drivers/net/ngbe/ngbe_rxtx.c
index ec353a30b1..54a6f6a887 100644
--- a/drivers/net/ngbe/ngbe_rxtx.c
+++ b/drivers/net/ngbe/ngbe_rxtx.c
@@ -1223,11 +1223,22 @@ ngbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
* of accesses cannot be reordered by the compiler. If they were
* not volatile, they could be reordered which could lead to
* using invalid descriptor fields when read from rxd.
+ *
+ * Meanwhile, to prevent the CPU from executing out of order, we
+ * need to use a proper memory barrier to ensure the memory
+ * ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rxdp->qw1.lo.status;
if (!(staterr & rte_cpu_to_le_32(NGBE_RXD_STAT_DD)))
break;
+
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
/*
@@ -1454,6 +1465,12 @@ ngbe_recv_pkts_sc(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & NGBE_RXD_STAT_DD))
break;
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
rxd = *rxdp;
PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.27.0
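
The hunks above test the DD bit in two styles: the plain receive path masks the raw status word with `rte_cpu_to_le_32(..._STAT_DD)`, while the LRO/scattered path masks a `staterr` value that (as the txgbe LRO hunk shows) has already gone through `rte_le_to_cpu_32()`. The following stand-alone sketch illustrates why the two are equivalent. All `demo_*` names are hypothetical, and the byte-order detection relies on the GCC/Clang predefined macros rather than on DPDK's helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_STAT_DD 0x1u /* hypothetical "descriptor done" bit, CPU order */

/* Convert between CPU order and little-endian (same operation both ways). */
static uint32_t demo_swap_if_big_endian(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return __builtin_bswap32(v);
#else
	return v;
#endif
}

/* Style 1: convert the constant, test the raw little-endian word. */
static bool demo_dd_raw(uint32_t status_le)
{
	return (status_le & demo_swap_if_big_endian(DEMO_STAT_DD)) != 0;
}

/* Style 2: convert the word, test against the CPU-order constant. */
static bool demo_dd_converted(uint32_t status_le)
{
	uint32_t staterr = demo_swap_if_big_endian(status_le);

	return (staterr & DEMO_STAT_DD) != 0;
}
```

Style 1 lets the compiler fold the conversion of the constant at build time, which is one plausible reason a fast path would prefer it.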
* Re: [PATCH 1/2] net/txgbe: add proper memory barriers in Rx
2023-10-30 10:51 [PATCH 1/2] net/txgbe: add proper memory barriers in Rx Jiawen Wu
2023-10-30 10:51 ` [PATCH 2/2] net/ngbe: " Jiawen Wu
@ 2023-10-31 12:17 ` Ferruh Yigit
2023-11-01 3:32 ` [PATCH v2 " Jiawen Wu
2 siblings, 0 replies; 6+ messages in thread
From: Ferruh Yigit @ 2023-10-31 12:17 UTC (permalink / raw)
To: Jiawen Wu, dev; +Cc: stable
On 10/30/2023 10:51 AM, Jiawen Wu wrote:
> Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
> Rx"). Fix the same issue as ixgbe.
>
> A segmentation fault has been observed while running the
> txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
> processor. It is caused by out-of-order execution on the CPU. Add a
> proper memory barrier to ensure that the reads are correctly ordered.
>
> Do the same in the txgbe_recv_pkts() function to make sure the rxd data
> is valid, even though no segmentation fault was observed in that
> function.
>
> Fixes: 0e484278c85f ("net/txgbe: support Rx")
> Cc: stable@dpdk.org
>
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
> drivers/net/txgbe/txgbe_rxtx.c | 47 +++++++++++++++-------------------
> 1 file changed, 21 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
> index 834ada886a..24fc34d3c4 100644
> --- a/drivers/net/txgbe/txgbe_rxtx.c
> +++ b/drivers/net/txgbe/txgbe_rxtx.c
> @@ -1476,11 +1476,22 @@ txgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
> * of accesses cannot be reordered by the compiler. If they were
> * not volatile, they could be reordered which could lead to
> * using invalid descriptor fields when read from rxd.
> + *
> + * Meanwhile, to prevent the CPU from executing out of order, we
> + * need to use a proper memory barrier to ensure the memory
> + * ordering below.
> */
> rxdp = &rx_ring[rx_id];
> staterr = rxdp->qw1.lo.status;
> if (!(staterr & rte_cpu_to_le_32(TXGBE_RXD_STAT_DD)))
> break;
> +
> + /*
> + * Use acquire fence to ensure that status_error which includes
> + * DD bit is loaded before loading of other descriptor words.
> + */
> + rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
>
Hi Jiawen,
Can you please check the checkpatch warning:
Warning in drivers/net/txgbe/txgbe_rxtx.c:
Using __atomic_xxx/__ATOMIC_XXX built-ins, prefer
rte_atomic_xxx/rte_memory_order_xxx
For your case please use 'rte_memory_order_xxx' instead of '__ATOMIC_XXX'.
Same for both patches in the set.
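
As a rough illustration of the review comment, a project-level memory-order name can be layered over either the compiler built-ins or C11 `<stdatomic.h>`, so that call sites never spell `__ATOMIC_XXX` directly. The sketch below is hypothetical: `DEMO_USE_STDATOMIC`, `demo_memory_order_acquire`, `demo_thread_fence()` and `demo_fence_then_read()` are invented for this example and are not DPDK's definitions.

```c
/*
 * Hypothetical wrapper layer: pick the backing implementation once,
 * expose one spelling to the rest of the code base.
 */
#ifdef DEMO_USE_STDATOMIC
#include <stdatomic.h>
#define demo_memory_order_acquire memory_order_acquire
#define demo_thread_fence(order) atomic_thread_fence(order)
#else
/* GCC/Clang built-ins, used when C11 atomics are not selected. */
#define demo_memory_order_acquire __ATOMIC_ACQUIRE
#define demo_thread_fence(order) __atomic_thread_fence(order)
#endif

/* Call site: identical source regardless of the backing implementation. */
static int demo_fence_then_read(const int *p)
{
	demo_thread_fence(demo_memory_order_acquire);
	return *p;
}
```

With such a layer in place, switching the backing implementation is a build-time choice and the drivers' source does not change, which is presumably why checkpatch steers new code toward the wrapper names.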
* [PATCH v2 1/2] net/txgbe: add proper memory barriers in Rx
2023-10-30 10:51 [PATCH 1/2] net/txgbe: add proper memory barriers in Rx Jiawen Wu
2023-10-30 10:51 ` [PATCH 2/2] net/ngbe: " Jiawen Wu
2023-10-31 12:17 ` [PATCH 1/2] net/txgbe: " Ferruh Yigit
@ 2023-11-01 3:32 ` Jiawen Wu
2023-11-01 3:32 ` [PATCH v2 2/2] net/ngbe: " Jiawen Wu
2023-11-01 16:55 ` [PATCH v2 1/2] net/txgbe: " Ferruh Yigit
2 siblings, 2 replies; 6+ messages in thread
From: Jiawen Wu @ 2023-11-01 3:32 UTC (permalink / raw)
To: dev; +Cc: Jiawen Wu, stable

Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
Rx"). Fix the same issue as ixgbe.

A segmentation fault has been observed while running the
txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
processor. It is caused by out-of-order execution on the CPU. Add a
proper memory barrier to ensure that the reads are correctly ordered.

Do the same in the txgbe_recv_pkts() function to make sure the rxd data
is valid, even though no segmentation fault was observed in that
function.

Fixes: 0e484278c85f ("net/txgbe: support Rx")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/txgbe/txgbe_rxtx.c | 49 +++++++++++++++-------------------
1 file changed, 22 insertions(+), 27 deletions(-)
diff --git a/drivers/net/txgbe/txgbe_rxtx.c b/drivers/net/txgbe/txgbe_rxtx.c
index 834ada886a..1cd4b25965 100644
--- a/drivers/net/txgbe/txgbe_rxtx.c
+++ b/drivers/net/txgbe/txgbe_rxtx.c
@@ -1226,7 +1226,7 @@ txgbe_rx_scan_hw_ring(struct txgbe_rx_queue *rxq)
for (j = 0; j < LOOK_AHEAD; j++)
s[j] = rte_le_to_cpu_32(rxdp[j].qw1.lo.status);
- rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+ rte_atomic_thread_fence(rte_memory_order_acquire);
/* Compute how many status bits were set */
for (nb_dd = 0; nb_dd < LOOK_AHEAD &&
@@ -1476,11 +1476,22 @@ txgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
* of accesses cannot be reordered by the compiler. If they were
* not volatile, they could be reordered which could lead to
* using invalid descriptor fields when read from rxd.
+ *
+ * Meanwhile, to prevent the CPU from executing out of order, we
+ * need to use a proper memory barrier to ensure the memory
+ * ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rxdp->qw1.lo.status;
if (!(staterr & rte_cpu_to_le_32(TXGBE_RXD_STAT_DD)))
break;
+
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+
rxd = *rxdp;
/*
@@ -1726,32 +1737,10 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
next_desc:
/*
- * The code in this whole file uses the volatile pointer to
- * ensure the read ordering of the status and the rest of the
- * descriptor fields (on the compiler level only!!!). This is so
- * UGLY - why not to just use the compiler barrier instead? DPDK
- * even has the rte_compiler_barrier() for that.
- *
- * But most importantly this is just wrong because this doesn't
- * ensure memory ordering in a general case at all. For
- * instance, DPDK is supposed to work on Power CPUs where
- * compiler barrier may just not be enough!
- *
- * I tried to write only this function properly to have a
- * starting point (as a part of an LRO/RSC series) but the
- * compiler cursed at me when I tried to cast away the
- * "volatile" from rx_ring (yes, it's volatile too!!!). So, I'm
- * keeping it the way it is for now.
- *
- * The code in this file is broken in so many other places and
- * will just not work on a big endian CPU anyway therefore the
- * lines below will have to be revisited together with the rest
- * of the txgbe PMD.
- *
- * TODO:
- * - Get rid of "volatile" and let the compiler do its job.
- * - Use the proper memory barrier (rte_rmb()) to ensure the
- * memory ordering below.
+ * "Volatile" only prevents caching of the variable marked
+ * volatile. Most important, "volatile" cannot prevent the CPU
+ * from executing out of order. So, it is necessary to use a
+ * proper memory barrier to ensure the memory ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rte_le_to_cpu_32(rxdp->qw1.lo.status);
@@ -1759,6 +1748,12 @@ txgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & TXGBE_RXD_STAT_DD))
break;
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+
rxd = *rxdp;
PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.27.0
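
The `txgbe_rx_scan_hw_ring()` hunk in this version shows a batched variant of the same idea: several status words are read first, a single acquire fence follows, and only then are the DD bits counted. A minimal stand-alone sketch of that shape is given below. The names `demo_rxd`, `DEMO_LOOK_AHEAD`, `DEMO_STAT_DD` and `demo_scan()` are hypothetical, and the loop condition that counts leading DD bits is an assumption, since the diff context truncates it.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_LOOK_AHEAD 8   /* hypothetical batch size */
#define DEMO_STAT_DD 0x1u   /* hypothetical "descriptor done" bit */

/* Hypothetical descriptor; status is volatile as in the drivers. */
struct demo_rxd {
	uint32_t data;
	volatile uint32_t status;
};

/* Return how many consecutive descriptors, from the first, have DD set. */
static int demo_scan(const struct demo_rxd *rxdp)
{
	uint32_t s[DEMO_LOOK_AHEAD];
	int j, nb_dd;

	/* Snapshot all status words of the batch. */
	for (j = 0; j < DEMO_LOOK_AHEAD; j++)
		s[j] = rxdp[j].status;

	/* One fence orders every later descriptor read after the snapshot. */
	atomic_thread_fence(memory_order_acquire);

	/* Assumed condition: stop at the first descriptor without DD. */
	for (nb_dd = 0;
	     nb_dd < DEMO_LOOK_AHEAD && (s[nb_dd] & DEMO_STAT_DD);
	     nb_dd++)
		;

	return nb_dd;
}
```

Batching pays for one fence per group of descriptors instead of one per descriptor, at the cost of only ever consuming a contiguous run of completed descriptors.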
* [PATCH v2 2/2] net/ngbe: add proper memory barriers in Rx
2023-11-01 3:32 ` [PATCH v2 " Jiawen Wu
@ 2023-11-01 3:32 ` Jiawen Wu
2023-11-01 16:55 ` [PATCH v2 1/2] net/txgbe: " Ferruh Yigit
1 sibling, 0 replies; 6+ messages in thread
From: Jiawen Wu @ 2023-11-01 3:32 UTC (permalink / raw)
To: dev; +Cc: Jiawen Wu, stable

Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
Rx"). Fix the same issue as ixgbe.

Although, due to the testing schedule, current testing has not found
this problem in ngbe, apply the same fix there to ensure that the reads
are correctly ordered.

Fixes: 79f3128d4d98 ("net/ngbe: support scattered Rx")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/net/ngbe/ngbe_rxtx.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ngbe/ngbe_rxtx.c b/drivers/net/ngbe/ngbe_rxtx.c
index ec353a30b1..8a873b858e 100644
--- a/drivers/net/ngbe/ngbe_rxtx.c
+++ b/drivers/net/ngbe/ngbe_rxtx.c
@@ -980,7 +980,7 @@ ngbe_rx_scan_hw_ring(struct ngbe_rx_queue *rxq)
for (j = 0; j < LOOK_AHEAD; j++)
s[j] = rte_le_to_cpu_32(rxdp[j].qw1.lo.status);
- rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+ rte_atomic_thread_fence(rte_memory_order_acquire);
/* Compute how many status bits were set */
for (nb_dd = 0; nb_dd < LOOK_AHEAD &&
@@ -1223,11 +1223,22 @@ ngbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
* of accesses cannot be reordered by the compiler. If they were
* not volatile, they could be reordered which could lead to
* using invalid descriptor fields when read from rxd.
+ *
+ * Meanwhile, to prevent the CPU from executing out of order, we
+ * need to use a proper memory barrier to ensure the memory
+ * ordering below.
*/
rxdp = &rx_ring[rx_id];
staterr = rxdp->qw1.lo.status;
if (!(staterr & rte_cpu_to_le_32(NGBE_RXD_STAT_DD)))
break;
+
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+
rxd = *rxdp;
/*
@@ -1454,6 +1465,12 @@ ngbe_recv_pkts_sc(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & NGBE_RXD_STAT_DD))
break;
+ /*
+ * Use acquire fence to ensure that status_error which includes
+ * DD bit is loaded before loading of other descriptor words.
+ */
+ rte_atomic_thread_fence(rte_memory_order_acquire);
+
rxd = *rxdp;
PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.27.0
* Re: [PATCH v2 1/2] net/txgbe: add proper memory barriers in Rx
2023-11-01 3:32 ` [PATCH v2 " Jiawen Wu
2023-11-01 3:32 ` [PATCH v2 2/2] net/ngbe: " Jiawen Wu
@ 2023-11-01 16:55 ` Ferruh Yigit
1 sibling, 0 replies; 6+ messages in thread
From: Ferruh Yigit @ 2023-11-01 16:55 UTC (permalink / raw)
To: Jiawen Wu, dev; +Cc: stable
On 11/1/2023 3:32 AM, Jiawen Wu wrote:
> Refer to commit 85e46c532bc7 ("net/ixgbe: add proper memory barriers in
> Rx"). Fix the same issue as ixgbe.
>
> A segmentation fault has been observed while running the
> txgbe_recv_pkts_lro() function to receive packets on the Loongson 3A5000
> processor. It is caused by out-of-order execution on the CPU. Add a
> proper memory barrier to ensure that the reads are correctly ordered.
>
> Do the same in the txgbe_recv_pkts() function to make sure the rxd data
> is valid, even though no segmentation fault was observed in that
> function.
>
> Fixes: 0e484278c85f ("net/txgbe: support Rx")
> Cc: stable@dpdk.org
>
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
>
Series applied to dpdk-next-net/main, thanks.