patches for DPDK stable branches
From: luca.boccassi@gmail.com
To: Min Zhou <zhoumin@loongson.cn>
Cc: Ruifeng Wang <ruifeng.wang@arm.com>, dpdk stable <stable@dpdk.org>
Subject: patch 'net/ixgbe: add proper memory barriers in Rx' has been queued to stable release 20.11.9
Date: Wed, 28 Jun 2023 15:10:40 +0100
Message-ID: <20230628141046.2145871-16-luca.boccassi@gmail.com>
In-Reply-To: <20230628141046.2145871-1-luca.boccassi@gmail.com>

Hi,

FYI, your patch has been queued to stable release 20.11.9

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 06/30/23. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(ie: not only metadata diffs), please double check that the rebase was
correctly done.

Queued patches are on a temporary branch at:
https://github.com/bluca/dpdk-stable

This queued commit can be viewed at:
https://github.com/bluca/dpdk-stable/commit/6fb5e74cd8bfd97db2188a452974c76ee574ccd3

Thanks.

Luca Boccassi

---
From 6fb5e74cd8bfd97db2188a452974c76ee574ccd3 Mon Sep 17 00:00:00 2001
From: Min Zhou <zhoumin@loongson.cn>
Date: Tue, 13 Jun 2023 17:44:25 +0800
Subject: [PATCH] net/ixgbe: add proper memory barriers in Rx

[ upstream commit 85e46c532bc76ebe07f6a397aa76211250aca59c ]

A segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor, which has 64 cores and 4 NUMA nodes.

In ixgbe_recv_pkts_lro(), we found that whenever the first packet has the
EOP bit set and its length is less than or equal to rxq->crc_len, the
segmentation fault is certain to happen, on other platforms as well. For
example, if we force the first packet with the EOP bit set to have zero
length, the segmentation fault also happens on x86.

When the first packet is processed, first_seg->next is NULL. If this packet
also has the EOP bit set and its length is less than or equal to
rxq->crc_len, the following loop is executed:

    for (lp = first_seg; lp->next != rxm; lp = lp->next)
        ;

Since first_seg->next is NULL under this condition, the loop advances lp to
NULL, and the next evaluation of lp->next (that is, first_seg->next->next)
dereferences a NULL pointer, which causes the segmentation fault.
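
To make the failure mode concrete, here is a minimal, self-contained sketch
of the same chain walk with a NULL guard added. It is an illustration only,
not the driver's code and not what this patch changes: struct seg and
find_prev_seg() are hypothetical stand-ins for struct rte_mbuf and the loop
quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf: only the chain link matters. */
struct seg {
	struct seg *next;
};

/*
 * Return the segment that precedes rxm in the chain starting at first_seg,
 * or NULL when rxm cannot be reached through a ->next link. That includes
 * the single-segment case (first_seg == rxm, first_seg->next == NULL), which
 * is the case where the unguarded loop walks off the end of the chain.
 */
static struct seg *
find_prev_seg(struct seg *first_seg, const struct seg *rxm)
{
	struct seg *lp;

	assert(rxm != NULL);

	for (lp = first_seg; lp != NULL; lp = lp->next) {
		if (lp->next == rxm)
			return lp;
	}
	return NULL;
}
```

With a two-segment chain the helper returns the first segment; with a single
segment it returns NULL instead of dereferencing a NULL pointer.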

Normally, the length of the first packet with the EOP bit set is greater
than rxq->crc_len. However, out-of-order execution by the CPU may break the
expected read ordering between the status field and the rest of the
descriptor fields in this function. The relevant code is as follows:

        rxdp = &rx_ring[rx_id];
 #1     staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

        if (!(staterr & IXGBE_RXDADV_STAT_DD))
            break;

 #2     rxd = *rxdp;

Statement #2 may be executed before statement #1. This reordering is likely
to make a ready packet appear to have zero length. If that packet is the
first packet and has the EOP bit set, the segmentation fault described above
happens.

So, we add a proper memory barrier to ensure the correct read ordering. We
do the same in the ixgbe_recv_pkts() function to make sure the rxd data is
valid, even though we did not observe a segmentation fault in that
function.
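
The ordering requirement can be sketched in portable C11, outside of DPDK.
This is a simplified model under stated assumptions: struct desc, DESC_DONE,
desc_publish() and desc_poll() are hypothetical names, and the producer is
modelled as software using a release store, whereas in the driver the
descriptor is written back by the NIC. The consumer side mirrors the pattern
in the patch: load the status word, check the done bit, issue an acquire
fence, and only then read the remaining fields.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_DONE 0x1u /* hypothetical "descriptor done" bit */

struct desc {
	uint32_t length;         /* payload field, written before the status */
	_Atomic uint32_t status; /* DESC_DONE is set last by the producer */
};

/* Producer side of the model: fill the payload, then publish the done bit. */
static void
desc_publish(struct desc *d, uint32_t length)
{
	assert(d != NULL);
	d->length = length;
	atomic_store_explicit(&d->status, DESC_DONE, memory_order_release);
}

/*
 * Consumer side: returns false while the descriptor is not done. Once the
 * done bit is seen, the acquire fence keeps the load of d->length from being
 * reordered before the load of d->status.
 */
static bool
desc_poll(struct desc *d, uint32_t *length)
{
	uint32_t status;

	assert(d != NULL && length != NULL);

	status = atomic_load_explicit(&d->status, memory_order_relaxed);
	if (!(status & DESC_DONE))
		return false;

	atomic_thread_fence(memory_order_acquire);

	*length = d->length;
	return true;
}
```

Without the fence, a weakly ordered CPU may satisfy the read of length
before the read of status, so the consumer can see the done bit together
with a stale length.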

Fixes: 8eecb3295aed ("ixgbe: add LRO support")

Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 47 +++++++++++++++-------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index ab3e70d27e..7414384493 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1787,11 +1787,22 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		 * of accesses cannot be reordered by the compiler. If they were
 		 * not volatile, they could be reordered which could lead to
 		 * using invalid descriptor fields when read from rxd.
+		 *
+		 * Meanwhile, to prevent the CPU from executing out of order, we
+		 * need to use a proper memory barrier to ensure the memory
+		 * ordering below.
 		 */
 		rxdp = &rx_ring[rx_id];
 		staterr = rxdp->wb.upper.status_error;
 		if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
 			break;
+
+		/*
+		 * Use acquire fence to ensure that status_error which includes
+		 * DD bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 
 		/*
@@ -2058,32 +2069,10 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
 
 next_desc:
 		/*
-		 * The code in this whole file uses the volatile pointer to
-		 * ensure the read ordering of the status and the rest of the
-		 * descriptor fields (on the compiler level only!!!). This is so
-		 * UGLY - why not to just use the compiler barrier instead? DPDK
-		 * even has the rte_compiler_barrier() for that.
-		 *
-		 * But most importantly this is just wrong because this doesn't
-		 * ensure memory ordering in a general case at all. For
-		 * instance, DPDK is supposed to work on Power CPUs where
-		 * compiler barrier may just not be enough!
-		 *
-		 * I tried to write only this function properly to have a
-		 * starting point (as a part of an LRO/RSC series) but the
-		 * compiler cursed at me when I tried to cast away the
-		 * "volatile" from rx_ring (yes, it's volatile too!!!). So, I'm
-		 * keeping it the way it is for now.
-		 *
-		 * The code in this file is broken in so many other places and
-		 * will just not work on a big endian CPU anyway therefore the
-		 * lines below will have to be revisited together with the rest
-		 * of the ixgbe PMD.
-		 *
-		 * TODO:
-		 *    - Get rid of "volatile" and let the compiler do its job.
-		 *    - Use the proper memory barrier (rte_rmb()) to ensure the
-		 *      memory ordering below.
+		 * "Volatile" only prevents caching of the variable marked
+		 * volatile. Most important, "volatile" cannot prevent the CPU
+		 * from executing out of order. So, it is necessary to use a
+		 * proper memory barrier to ensure the memory ordering below.
 		 */
 		rxdp = &rx_ring[rx_id];
 		staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
@@ -2091,6 +2080,12 @@ next_desc:
 		if (!(staterr & IXGBE_RXDADV_STAT_DD))
 			break;
 
+		/*
+		 * Use acquire fence to ensure that status_error which includes
+		 * DD bit is loaded before loading of other descriptor words.
+		 */
+		rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
+
 		rxd = *rxdp;
 
 		PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
-- 
2.39.2

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2023-06-28 11:40:08.748912143 +0100
+++ 0016-net-ixgbe-add-proper-memory-barriers-in-Rx.patch	2023-06-28 11:40:08.076027917 +0100
@@ -1 +1 @@
-From 85e46c532bc76ebe07f6a397aa76211250aca59c Mon Sep 17 00:00:00 2001
+From 6fb5e74cd8bfd97db2188a452974c76ee574ccd3 Mon Sep 17 00:00:00 2001
@@ -5,0 +6,2 @@
+[ upstream commit 85e46c532bc76ebe07f6a397aa76211250aca59c ]
+
@@ -50 +51,0 @@
-Cc: stable@dpdk.org
@@ -59 +60 @@
-index 6cbb992823..61f17cd90b 100644
+index ab3e70d27e..7414384493 100644
@@ -62 +63 @@
-@@ -1817,11 +1817,22 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+@@ -1787,11 +1787,22 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
@@ -85 +86 @@
-@@ -2088,32 +2099,10 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
+@@ -2058,32 +2069,10 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts,
@@ -122 +123 @@
-@@ -2121,6 +2110,12 @@ next_desc:
+@@ -2091,6 +2080,12 @@ next_desc:
