* [dpdk-dev] [PATCH] net/bnxt: fix memory barriers
@ 2019-09-06 6:37 Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs Gavin Hu
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: Gavin Hu @ 2019-09-06 6:37 UTC (permalink / raw)
To: dev; +Cc: nd, thomas, jit.khaparde, somnath.kotur, Honnappa.Nagarahalli, stable
As there is an inclusive rte_io_wmb within the following rte_write32()
API who rings the doorbell, this makes the above rte_wmb unecessary and
remove it.
The doorbell ringing operation requires a rte_io_wmb immediately to make
it complete and visible to the device.
To read the doorbell response, which is held in the host CIO memory,
rte_cio_rmb is sufficient.
Fixes: 804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Fixes: ca241d9a0952 ("net/bnxt: use I/O device memory read/write API")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
drivers/net/bnxt/bnxt_hwrm.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 9883fb5..ee07db9 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -115,9 +115,6 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg,
data = (uint32_t *)&short_input;
msg_len = sizeof(short_input);
- /* Sync memory write before updating doorbell */
- rte_wmb();
-
max_req_len = BNXT_HWRM_SHORT_REQ_LEN;
}
@@ -137,11 +134,12 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg,
/* Ring channel doorbell */
bar = (uint8_t *)bp->bar0 + mb_trigger_offset;
rte_write32(1, bar);
+ rte_io_wmb();
/* Poll for the valid bit */
for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
/* Sanity check on the resp->resp_len */
- rte_rmb();
+ rte_cio_rmb();
if (resp->resp_len && resp->resp_len <= bp->max_resp_len) {
/* Last byte of resp contains the valid key */
valid = (uint8_t *)resp + resp->resp_len - 1;
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-10-02 17:41 ` Ferruh Yigit
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier Gavin Hu
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
DPDK has well-defined barriers, such as CIO barriers and IO barriers.
X86, as a strong ordering model, implements the barriers as compiler
barriers, but on aarch64, as a weak memory ordering model, has fine
grained barriers. Using correct while as relaxed as possible barriers
makes a perf difference.
Upon investigation on a batch of PMDs and it was found that the barriers
are not always used correctly or relaxedly enough.
This series of patches is to optimize the barrier usage with some selected
PMDs and aim at best performance on all arches/platforms.
More PMDs may come next to this series but it takes time.
Gavin Hu (5):
net/i40e: use relaxed and remove duplicate barrier
net/ice: use relaxed and remove duplicate barrier
net/bnxt: remove duplicate barrier
net/bnxt: replace with cio barrier for doorbell resp
net/bnxt: enforce io barrier for doorbell command
drivers/net/bnxt/bnxt_hwrm.c | 11 +++++++----
drivers/net/i40e/i40e_rxtx.c | 12 +++---------
drivers/net/ice/ice_rxtx.c | 6 ------
3 files changed, 10 insertions(+), 19 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-09-17 1:53 ` Zhang, Qi Z
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 2/5] net/ice: " Gavin Hu
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
To guarantee the orderings of successive stores to CIO and MMIO memory,
a lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and since
the I40E_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this
explicit call can be even saved.
[1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
rte_atomic.h#n98
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
drivers/net/i40e/i40e_rxtx.c | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 692c3ba..bfe161f 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -564,8 +564,7 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
}
/* Update rx tail regsiter */
- rte_wmb();
- I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rxq->rx_free_trigger);
+ I40E_PCI_REG_WRITE(rxq->qrx_tail, rxq->rx_free_trigger);
rxq->rx_free_trigger =
(uint16_t)(rxq->rx_free_trigger + rxq->rx_free_thresh);
@@ -1208,13 +1207,11 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
}
end_of_tx:
- rte_wmb();
-
PMD_TX_LOG(DEBUG, "port_id=%u queue_id=%u tx_tail=%u nb_tx=%u",
(unsigned) txq->port_id, (unsigned) txq->queue_id,
(unsigned) tx_id, (unsigned) nb_tx);
- I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
+ I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
txq->tx_tail = tx_id;
return nb_tx;
@@ -1365,8 +1362,7 @@ tx_xmit_pkts(struct i40e_tx_queue *txq,
txq->tx_tail = 0;
/* Update the tx tail register */
- rte_wmb();
- I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, txq->tx_tail);
+ I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
return nb_pkts;
}
@@ -1544,8 +1540,6 @@ i40e_dev_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id)
return err;
}
- rte_wmb();
-
/* Init the RX tail regieter. */
I40E_PCI_REG_WRITE(rxq->qrx_tail, rxq->nb_rx_desc - 1);
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 2/5] net/ice: use relaxed and remove duplicate barrier
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-09-17 1:51 ` Zhang, Qi Z
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 3/5] net/bnxt: " Gavin Hu
` (2 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
To guarantee the orderings of successive stores to CIO and MMIO memory,
a lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and since
the ICE_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this
explicit call can even be saved.
[1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
rte_atomic.h#n98
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
drivers/net/ice/ice_rxtx.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index 81af814..2366119 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -363,8 +363,6 @@ ice_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id)
return -ENOMEM;
}
- rte_wmb();
-
/* Init the RX tail register. */
ICE_PCI_REG_WRITE(rxq->qrx_tail, rxq->nb_rx_desc - 1);
@@ -1212,7 +1210,6 @@ ice_rx_alloc_bufs(struct ice_rx_queue *rxq)
}
/* Update rx tail regsiter */
- rte_wmb();
ICE_PCI_REG_WRITE(rxq->qrx_tail, rxq->rx_free_trigger);
rxq->rx_free_trigger =
@@ -2132,8 +2129,6 @@ ice_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
ICE_TXD_QW1_CMD_S);
}
end_of_tx:
- rte_wmb();
-
/* update Tail register */
ICE_PCI_REG_WRITE(txq->qtx_tail, tx_id);
txq->tx_tail = tx_id;
@@ -2289,7 +2284,6 @@ tx_xmit_pkts(struct ice_tx_queue *txq,
txq->tx_tail = 0;
/* Update the tx tail register */
- rte_wmb();
ICE_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
return nb_pkts;
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 3/5] net/bnxt: remove duplicate barrier
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
` (2 preceding siblings ...)
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 2/5] net/ice: " Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-09-17 1:55 ` Ajit Khaparde
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command Gavin Hu
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper, stable
As there is an inclusive rte_io_wmb within the following rte_write32()
API who rings the doorbell, this makes the above rte_wmb unnecessary and
remove it.
Fixes: 1cd45aeb3270 ("net/bnxt: support Stratus VF device")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
drivers/net/bnxt/bnxt_hwrm.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 9883fb5..1d9fb17 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -115,9 +115,6 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg,
data = (uint32_t *)&short_input;
msg_len = sizeof(short_input);
- /* Sync memory write before updating doorbell */
- rte_wmb();
-
max_req_len = BNXT_HWRM_SHORT_REQ_LEN;
}
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
` (3 preceding siblings ...)
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 3/5] net/bnxt: " Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-09-17 1:56 ` Ajit Khaparde
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command Gavin Hu
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper, stable
To read the doorbell response, which is held in the host CIO memory,
rte_cio_rmb is sufficient.
Fixes: 804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
drivers/net/bnxt/bnxt_hwrm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 1d9fb17..2d5dc00 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -138,7 +138,7 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg,
/* Poll for the valid bit */
for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
/* Sanity check on the resp->resp_len */
- rte_rmb();
+ rte_cio_rmb();
if (resp->resp_len && resp->resp_len <= bp->max_resp_len) {
/* Last byte of resp contains the valid key */
valid = (uint8_t *)resp + resp->resp_len - 1;
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
` (4 preceding siblings ...)
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp Gavin Hu
@ 2019-09-16 11:27 ` Gavin Hu
2019-09-17 1:55 ` Ajit Khaparde
5 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu @ 2019-09-16 11:27 UTC (permalink / raw)
To: dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper, stable
The doorbell ringing operation requires a rte_io_mb immediately to make
the command complete and visible to the device before reading the
response, otherwise it may read stale or invalid responses.
Fixes: ca241d9a0952 ("net/bnxt: use I/O device memory read/write API")
Cc: stable@dpdk.org
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
drivers/net/bnxt/bnxt_hwrm.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 2d5dc00..7d192e3 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -134,6 +134,12 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg,
/* Ring channel doorbell */
bar = (uint8_t *)bp->bar0 + mb_trigger_offset;
rte_write32(1, bar);
+ /*
+ * Make sure the channel doorbell ring command complete before
+ * reading the response to avoid getting stale or invalid
+ * responses.
+ */
+ rte_io_mb();
/* Poll for the valid bit */
for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
--
2.7.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 2/5] net/ice: use relaxed and remove duplicate barrier
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 2/5] net/ice: " Gavin Hu
@ 2019-09-17 1:51 ` Zhang, Qi Z
0 siblings, 0 replies; 16+ messages in thread
From: Zhang, Qi Z @ 2019-09-17 1:51 UTC (permalink / raw)
To: Gavin Hu, dev
Cc: nd, thomas, Richardson, Bruce, Liu, Yong, Wang, Yinan,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gavin Hu
> Sent: Monday, September 16, 2019 7:27 PM
> To: dev@dpdk.org
> Cc: nd@arm.com; thomas@monjalon.net; Richardson, Bruce
> <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang, Yinan
> <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> somnath.kotur@broadcom.com; Honnappa.Nagarahalli@arm.com;
> ruifeng.wang@arm.com; steve.capper@arm.com
> Subject: [dpdk-dev] [PATCH v2 2/5] net/ice: use relaxed and remove duplicate
> barrier
>
> To guarantee the orderings of successive stores to CIO and MMIO memory, a
> lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and since the
> ICE_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this explicit call
> can even be saved.
>
> [1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
> rte_atomic.h#n98
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier Gavin Hu
@ 2019-09-17 1:53 ` Zhang, Qi Z
2019-09-17 2:07 ` Gavin Hu (Arm Technology China)
0 siblings, 1 reply; 16+ messages in thread
From: Zhang, Qi Z @ 2019-09-17 1:53 UTC (permalink / raw)
To: Gavin Hu, dev
Cc: nd, thomas, Richardson, Bruce, Liu, Yong, Wang, Yinan,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gavin Hu
> Sent: Monday, September 16, 2019 7:27 PM
> To: dev@dpdk.org
> Cc: nd@arm.com; thomas@monjalon.net; Richardson, Bruce
> <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang, Yinan
> <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> somnath.kotur@broadcom.com; Honnappa.Nagarahalli@arm.com;
> ruifeng.wang@arm.com; steve.capper@arm.com
> Subject: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> duplicate barrier
>
> To guarantee the orderings of successive stores to CIO and MMIO memory, a
> lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and since the
> I40E_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this explicit call
> can be even saved.
>
> [1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
> rte_atomic.h#n98
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Can you also capture the one at the tail of i40e_xmit_pkts?
Otherwise
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command Gavin Hu
@ 2019-09-17 1:55 ` Ajit Khaparde
0 siblings, 0 replies; 16+ messages in thread
From: Ajit Khaparde @ 2019-09-17 1:55 UTC (permalink / raw)
To: Gavin Hu
Cc: dev, nd, Thomas Monjalon, Bruce Richardson, yong.liu, yinan.wang,
Somnath Kotur, Honnappa.Nagarahalli, ruifeng.wang, steve.capper,
dpdk stable
On Mon, Sep 16, 2019 at 4:27 AM Gavin Hu <gavin.hu@arm.com> wrote:
> The doorbell ringing operation requires a rte_io_mb immediately to make
> the command complete and visible to the device before reading the
> response, otherwise it may read stale or invalid responses.
>
> Fixes: ca241d9a0952 ("net/bnxt: use I/O device memory read/write API")
> Cc: stable@dpdk.org
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> ---
> drivers/net/bnxt/bnxt_hwrm.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
> index 2d5dc00..7d192e3 100644
> --- a/drivers/net/bnxt/bnxt_hwrm.c
> +++ b/drivers/net/bnxt/bnxt_hwrm.c
> @@ -134,6 +134,12 @@ static int bnxt_hwrm_send_message(struct bnxt *bp,
> void *msg,
> /* Ring channel doorbell */
> bar = (uint8_t *)bp->bar0 + mb_trigger_offset;
> rte_write32(1, bar);
> + /*
> + * Make sure the channel doorbell ring command complete before
> + * reading the response to avoid getting stale or invalid
> + * responses.
> + */
> + rte_io_mb();
>
> /* Poll for the valid bit */
> for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
> --
> 2.7.4
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 3/5] net/bnxt: remove duplicate barrier
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 3/5] net/bnxt: " Gavin Hu
@ 2019-09-17 1:55 ` Ajit Khaparde
0 siblings, 0 replies; 16+ messages in thread
From: Ajit Khaparde @ 2019-09-17 1:55 UTC (permalink / raw)
To: Gavin Hu
Cc: dev, nd, Thomas Monjalon, Bruce Richardson, yong.liu, yinan.wang,
Somnath Kotur, Honnappa.Nagarahalli, ruifeng.wang, steve.capper,
dpdk stable
On Mon, Sep 16, 2019 at 4:27 AM Gavin Hu <gavin.hu@arm.com> wrote:
> As there is an inclusive rte_io_wmb within the following rte_write32()
> API who rings the doorbell, this makes the above rte_wmb unnecessary and
> remove it.
>
> Fixes: 1cd45aeb3270 ("net/bnxt: support Stratus VF device")
> Cc: stable@dpdk.org
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> Reviewed-by: Steve Capper <steve.capper@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> ---
> drivers/net/bnxt/bnxt_hwrm.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
> index 9883fb5..1d9fb17 100644
> --- a/drivers/net/bnxt/bnxt_hwrm.c
> +++ b/drivers/net/bnxt/bnxt_hwrm.c
> @@ -115,9 +115,6 @@ static int bnxt_hwrm_send_message(struct bnxt *bp,
> void *msg,
> data = (uint32_t *)&short_input;
> msg_len = sizeof(short_input);
>
> - /* Sync memory write before updating doorbell */
> - rte_wmb();
> -
> max_req_len = BNXT_HWRM_SHORT_REQ_LEN;
> }
>
> --
> 2.7.4
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp Gavin Hu
@ 2019-09-17 1:56 ` Ajit Khaparde
0 siblings, 0 replies; 16+ messages in thread
From: Ajit Khaparde @ 2019-09-17 1:56 UTC (permalink / raw)
To: Gavin Hu
Cc: dev, nd, Thomas Monjalon, Bruce Richardson, yong.liu, yinan.wang,
Somnath Kotur, Honnappa.Nagarahalli, ruifeng.wang, steve.capper,
dpdk stable
On Mon, Sep 16, 2019 at 4:27 AM Gavin Hu <gavin.hu@arm.com> wrote:
> To read the doorbell response, which is held in the host CIO memory,
> rte_cio_rmb is sufficient.
>
> Fixes: 804e746c7b73 ("net/bnxt: add hardware resource manager init code")
> Cc: stable@dpdk.org
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> ---
> drivers/net/bnxt/bnxt_hwrm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
> index 1d9fb17..2d5dc00 100644
> --- a/drivers/net/bnxt/bnxt_hwrm.c
> +++ b/drivers/net/bnxt/bnxt_hwrm.c
> @@ -138,7 +138,7 @@ static int bnxt_hwrm_send_message(struct bnxt *bp,
> void *msg,
> /* Poll for the valid bit */
> for (i = 0; i < HWRM_CMD_TIMEOUT; i++) {
> /* Sanity check on the resp->resp_len */
> - rte_rmb();
> + rte_cio_rmb();
> if (resp->resp_len && resp->resp_len <= bp->max_resp_len) {
> /* Last byte of resp contains the valid key */
> valid = (uint8_t *)resp + resp->resp_len - 1;
> --
> 2.7.4
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier
2019-09-17 1:53 ` Zhang, Qi Z
@ 2019-09-17 2:07 ` Gavin Hu (Arm Technology China)
2019-09-17 3:23 ` Gavin Hu (Arm Technology China)
0 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2019-09-17 2:07 UTC (permalink / raw)
To: Zhang, Qi Z, dev
Cc: nd, thomas, Richardson, Bruce, Liu, Yong, Wang, Yinan,
ajit.khaparde, somnath.kotur, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
Steve Capper
Hi Qi,
> -----Original Message-----
> From: Zhang, Qi Z <qi.z.zhang@intel.com>
> Sent: Tuesday, September 17, 2019 9:53 AM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang, Yinan
> <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> somnath.kotur@broadcom.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology China)
> <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> duplicate barrier
>
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gavin Hu
> > Sent: Monday, September 16, 2019 7:27 PM
> > To: dev@dpdk.org
> > Cc: nd@arm.com; thomas@monjalon.net; Richardson, Bruce
> > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> Yinan
> > <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > somnath.kotur@broadcom.com; Honnappa.Nagarahalli@arm.com;
> > ruifeng.wang@arm.com; steve.capper@arm.com
> > Subject: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> > duplicate barrier
> >
> > To guarantee the orderings of successive stores to CIO and MMIO memory,
> a
> > lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and since
> the
> > I40E_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this explicit
> call
> > can be even saved.
> >
> > [1] http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
> > rte_atomic.h#n98
> >
> > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>
> Can you also capture the one at the tail of i40e_xmit_pkts?
Thanks for your review, I will fix this in next version.
>
> Otherwise
> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier
2019-09-17 2:07 ` Gavin Hu (Arm Technology China)
@ 2019-09-17 3:23 ` Gavin Hu (Arm Technology China)
2019-09-17 5:13 ` Zhang, Qi Z
0 siblings, 1 reply; 16+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2019-09-17 3:23 UTC (permalink / raw)
To: Gavin Hu (Arm Technology China), Zhang, Qi Z, dev
Cc: nd, thomas, Richardson, Bruce, Liu, Yong, Wang, Yinan,
ajit.khaparde, somnath.kotur, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
Steve Capper
Hi Qi,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm
> Technology China)
> Sent: Tuesday, September 17, 2019 10:08 AM
> To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> Yinan <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> somnath.kotur@broadcom.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology China)
> <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> duplicate barrier
>
> Hi Qi,
>
> > -----Original Message-----
> > From: Zhang, Qi Z <qi.z.zhang@intel.com>
> > Sent: Tuesday, September 17, 2019 9:53 AM
> > To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> dev@dpdk.org
> > Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> Yinan
> > <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > somnath.kotur@broadcom.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology
> China)
> > <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> > duplicate barrier
> >
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gavin Hu
> > > Sent: Monday, September 16, 2019 7:27 PM
> > > To: dev@dpdk.org
> > > Cc: nd@arm.com; thomas@monjalon.net; Richardson, Bruce
> > > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> > Yinan
> > > <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > > somnath.kotur@broadcom.com; Honnappa.Nagarahalli@arm.com;
> > > ruifeng.wang@arm.com; steve.capper@arm.com
> > > Subject: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> > > duplicate barrier
> > >
> > > To guarantee the orderings of successive stores to CIO and MMIO
> memory,
> > a
> > > lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and
> since
> > the
> > > I40E_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this
> explicit
> > call
> > > can be even saved.
> > >
> > > [1]
> http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
> > > rte_atomic.h#n98
> > >
> > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> >
> > Can you also capture the one at the tail of i40e_xmit_pkts?
> Thanks for your review, I will fix this in next version.
I checked again, in this version, the rte_wmb at the tail of i40e_xmit_pkts was already removed
and the following I40E_PCI_REG_WRITE_RELAXED was replaced by I40E_PCI_REG_WRITE, which has an inclusive rte_io_wmb.
> >
> > Otherwise
> > Acked-by: Qi Zhang <qi.z.zhang@intel.com>
> >
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier
2019-09-17 3:23 ` Gavin Hu (Arm Technology China)
@ 2019-09-17 5:13 ` Zhang, Qi Z
0 siblings, 0 replies; 16+ messages in thread
From: Zhang, Qi Z @ 2019-09-17 5:13 UTC (permalink / raw)
To: Gavin Hu (Arm Technology China), dev
Cc: nd, thomas, Richardson, Bruce, Liu, Yong, Wang, Yinan,
ajit.khaparde, somnath.kotur, Honnappa Nagarahalli,
Ruifeng Wang (Arm Technology China),
Steve Capper
> -----Original Message-----
> From: Gavin Hu (Arm Technology China) [mailto:Gavin.Hu@arm.com]
> Sent: Tuesday, September 17, 2019 11:24 AM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; dev@dpdk.org
> Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang, Yinan
> <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> somnath.kotur@broadcom.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology China)
> <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove
> duplicate barrier
>
> Hi Qi,
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm Technology
> > China)
> > Sent: Tuesday, September 17, 2019 10:08 AM
> > To: Zhang, Qi Z <qi.z.zhang@intel.com>; dev@dpdk.org
> > Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> > Yinan <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > somnath.kotur@broadcom.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology China)
> > <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and
> > remove duplicate barrier
> >
> > Hi Qi,
> >
> > > -----Original Message-----
> > > From: Zhang, Qi Z <qi.z.zhang@intel.com>
> > > Sent: Tuesday, September 17, 2019 9:53 AM
> > > To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> > dev@dpdk.org
> > > Cc: nd <nd@arm.com>; thomas@monjalon.net; Richardson, Bruce
> > > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang,
> > Yinan
> > > <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > > somnath.kotur@broadcom.com; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang (Arm Technology
> > China)
> > > <Ruifeng.Wang@arm.com>; Steve Capper <Steve.Capper@arm.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and
> > > remove duplicate barrier
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gavin Hu
> > > > Sent: Monday, September 16, 2019 7:27 PM
> > > > To: dev@dpdk.org
> > > > Cc: nd@arm.com; thomas@monjalon.net; Richardson, Bruce
> > > > <bruce.richardson@intel.com>; Liu, Yong <yong.liu@intel.com>;
> > > > Wang,
> > > Yinan
> > > > <yinan.wang@intel.com>; ajit.khaparde@broadcom.com;
> > > > somnath.kotur@broadcom.com; Honnappa.Nagarahalli@arm.com;
> > > > ruifeng.wang@arm.com; steve.capper@arm.com
> > > > Subject: [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and
> > > > remove duplicate barrier
> > > >
> > > > To guarantee the orderings of successive stores to CIO and MMIO
> > memory,
> > > a
> > > > lighter weight rte_io_wmb [1] can be used instead of rte_wmb, and
> > since
> > > the
> > > > I40E_PCI_REG_WRITE API already has an inclusive rte_io_wmb, this
> > explicit
> > > call
> > > > can be even saved.
> > > >
> > > > [1]
> > http://git.dpdk.org/dpdk/tree/lib/librte_eal/common/include/generic/
> > > > rte_atomic.h#n98
> > > >
> > > > Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> > >
> > > Can you also capture the one at the tail of i40e_xmit_pkts?
> > Thanks for your review, I will fix this in next version.
> I checked again, in this version, the rte_wmb at the tail of i40e_xmit_pkts was
> already removed and the following I40E_PCI_REG_WRITE_RELAXED was
> replaced by I40E_PCI_REG_WRITE, which has an inclusive rte_io_wmb.
Sorry, I must mixed the with ice patch, yes you are right, its already there.
> > >
> > > Otherwise
> > > Acked-by: Qi Zhang <qi.z.zhang@intel.com>
> > >
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs Gavin Hu
@ 2019-10-02 17:41 ` Ferruh Yigit
0 siblings, 0 replies; 16+ messages in thread
From: Ferruh Yigit @ 2019-10-02 17:41 UTC (permalink / raw)
To: Gavin Hu, dev
Cc: nd, thomas, bruce.richardson, yong.liu, yinan.wang,
ajit.khaparde, somnath.kotur, Honnappa.Nagarahalli, ruifeng.wang,
steve.capper
On 9/16/2019 12:27 PM, Gavin Hu wrote:
> DPDK has well-defined barriers, such as CIO barriers and IO barriers.
>
> X86, as a strong ordering model, implements the barriers as compiler
> barriers, but on aarch64, as a weak memory ordering model, has fine
> grained barriers. Using correct while as relaxed as possible barriers
> makes a perf difference.
>
> Upon investigation on a batch of PMDs and it was found that the barriers
> are not always used correctly or relaxedly enough.
>
> This series of patches is to optimize the barrier usage with some selected
> PMDs and aim at best performance on all arches/platforms.
>
> More PMDs may come next to this series but it takes time.
>
> Gavin Hu (5):
> net/i40e: use relaxed and remove duplicate barrier
> net/ice: use relaxed and remove duplicate barrier
> net/bnxt: remove duplicate barrier
> net/bnxt: replace with cio barrier for doorbell resp
> net/bnxt: enforce io barrier for doorbell command
Series applied to dpdk-next-net/master, thanks.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2019-10-02 17:41 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-06 6:37 [dpdk-dev] [PATCH] net/bnxt: fix memory barriers Gavin Hu
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 0/5] fix and optimize barriers usage with some PMDs Gavin Hu
2019-10-02 17:41 ` Ferruh Yigit
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 1/5] net/i40e: use relaxed and remove duplicate barrier Gavin Hu
2019-09-17 1:53 ` Zhang, Qi Z
2019-09-17 2:07 ` Gavin Hu (Arm Technology China)
2019-09-17 3:23 ` Gavin Hu (Arm Technology China)
2019-09-17 5:13 ` Zhang, Qi Z
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 2/5] net/ice: " Gavin Hu
2019-09-17 1:51 ` Zhang, Qi Z
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 3/5] net/bnxt: " Gavin Hu
2019-09-17 1:55 ` Ajit Khaparde
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 4/5] net/bnxt: replace with cio barrier for doorbell resp Gavin Hu
2019-09-17 1:56 ` Ajit Khaparde
2019-09-16 11:27 ` [dpdk-dev] [PATCH v2 5/5] net/bnxt: enforce io barrier for doorbell command Gavin Hu
2019-09-17 1:55 ` Ajit Khaparde
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).