To keep ordering of mixed accesses, rte_cio is sufficient.
The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]

[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9
qf0Kpn89EMdGDajepKoZQ@mail.gmail.com

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
---
V4:
- add the Fixes tag and CC stable <Xiaolong Ye>
V3:
- optimize the barriers in the fast path only, leave as it is for the
  barriers in the slow path and control path <jerin>
- drop the virtio patches from the list as they are in the control path
- it makes more sense to relax the barrier in the fast path, at the PMD level.
  relaxing the fundamental rte_io_x barriers APIs requires scrutinizations for
  each PMDs which use the barriers directly or indirectly.
V2:
- remove virtio_pci_read/write64 APIs definitions, they are not needed and
  generate compiling errors like " error: unused function 'virtio_pci_write64'
  [-Werror,-Wunused-function]"
- update the reference link to kernel source code
---
 drivers/net/i40e/i40e_rxtx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index fd1ae80da..8c0f7cc67 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		   (unsigned) txq->port_id, (unsigned) txq->queue_id,
 		   (unsigned) tx_id, (unsigned) nb_tx);
 
-	I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
+	rte_cio_wmb();
+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
 	txq->tx_tail = tx_id;
 
 	return nb_tx;
-- 
2.17.1
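
For readers following the thread, here is a rough standalone sketch of the ordering pattern the patch relies on: fill the descriptors in ordinary memory, issue one lighter write barrier, then ring the tail doorbell with a store that has no barrier of its own. Everything named demo_* is hypothetical and is not the i40e driver code. In particular, demo_cio_wmb() uses a C11 release fence purely as a stand-in for rte_cio_wmb() so that the sketch builds without DPDK. It does not claim to order real device (MMIO) writes, and the "doorbell" here is just a variable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* hypothetical ring size, must be a power of two */

/* Hypothetical, simplified TX queue; NOT the real struct i40e_tx_queue. */
struct demo_txq {
	uint64_t ring[DEMO_RING_SIZE]; /* descriptors in ordinary memory */
	volatile uint32_t *tail_reg;   /* doorbell; a real driver maps a PCI BAR */
	uint16_t tx_tail;              /* software copy of the tail index */
};

/*
 * Stand-in for rte_cio_wmb(): orders earlier stores before later stores.
 * ASSUMPTION: the C11 release fence is used only to keep the sketch
 * self-contained; it is not a substitute for the DPDK barrier on hardware.
 */
static inline void demo_cio_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Stand-in for I40E_PCI_REG_WRITE_RELAXED(): a plain store, no barrier. */
static inline void demo_reg_write_relaxed(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

/* Queue nb descriptors, then publish the new tail with a single barrier. */
static uint16_t demo_xmit(struct demo_txq *txq, const uint64_t *descs,
			  uint16_t nb)
{
	uint16_t tx_id = txq->tx_tail;
	uint16_t i;

	assert(nb <= DEMO_RING_SIZE);

	/* 1. Write the descriptors the device will read. */
	for (i = 0; i < nb; i++) {
		txq->ring[tx_id] = descs[i];
		tx_id = (uint16_t)((tx_id + 1u) & (DEMO_RING_SIZE - 1u));
	}

	/* 2. Make the descriptor stores visible before the doorbell store. */
	demo_cio_wmb();

	/* 3. Ring the doorbell without a second, heavier barrier. */
	demo_reg_write_relaxed(txq->tail_reg, tx_id);
	txq->tx_tail = tx_id;

	return nb;
}
```

The point of the shape is that the barrier is paid once per burst, explicitly, rather than being hidden inside every register write macro. Whether the lighter barrier is strong enough on a given platform is exactly what the linked discussion [1] is about.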
Hi Jerin,
Could you help review this patch? We discussed this topic at length, and I think we already agreed to relax the barrier only on the fast path.
Best Regards,
Gavin
> -----Original Message-----
> From: Gavin Hu <Gavin.Hu@arm.com>
> Sent: Friday, February 14, 2020 3:58 PM
> To: Gavin Hu <Gavin.Hu@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v4] net/i40e: relaxed barrier in the tx
> fastpath
>
> To keep ordering of mixed accesses, rte_cio is sufficient.
> The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
>
> [1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9
> qf0Kpn89EMdGDajepKoZQ@mail.gmail.com
>
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> Cc: stable@dpdk.org
>
> Signed-off-by: Gavin Hu <gavin.hu@arm.com>
> ---
> V4:
> - add the Fixes tag and CC stable <Xiaolong Ye>
> V3:
> - optimize the barriers in the fast path only, leave as it is for the
> barriers in the slow path and control path <jerin>
> - drop the virtio patches from the list as they are in the control path
> - it makes more sense to relax the barrier in the fast path, at the PMD level.
> relaxing the fundamental rte_io_x barriers APIs requires scrutinizations for
> each PMDs which use the barriers directly or indirectly.
> V2:
> - remove virtio_pci_read/write64 APIs definitions, they are not needed and
> generate compiling errors like " error: unused function 'virtio_pci_write64'
> [-Werror,-Wunused-function]"
> - update the reference link to kernel source code
> ---
> drivers/net/i40e/i40e_rxtx.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index fd1ae80da..8c0f7cc67 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf
> **tx_pkts, uint16_t nb_pkts)
> (unsigned) txq->port_id, (unsigned) txq->queue_id,
> (unsigned) tx_id, (unsigned) nb_tx);
>
> -I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
> +rte_cio_wmb();
> +I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
> txq->tx_tail = tx_id;
>
> return nb_tx;
> --
> 2.17.1
>
s/relaxed/relax
On 02/12, Gavin Hu wrote:
>To keep ordering of mixed accesses, rte_cio is sufficient.
>The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
>
>[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9
>qf0Kpn89EMdGDajepKoZQ@mail.gmail.com
>
>Fixes: 4861cde46116 ("i40e: new poll mode driver")
>Cc: stable@dpdk.org
>
>Signed-off-by: Gavin Hu <gavin.hu@arm.com>
>---
>V4:
>- add the Fixes tag and CC stable <Xiaolong Ye>
>V3:
>- optimize the barriers in the fast path only, leave as it is for the
> barriers in the slow path and control path <jerin>
>- drop the virtio patches from the list as they are in the control path
>- it makes more sense to relax the barrier in the fast path, at the PMD level.
> relaxing the fundamental rte_io_x barriers APIs requires scrutinizations for
> each PMDs which use the barriers directly or indirectly.
>V2:
>- remove virtio_pci_read/write64 APIs definitions, they are not needed and generate compiling errors like " error: unused function 'virtio_pci_write64' [-Werror,-Wunused-function]"
>- update the reference link to kernel source code
>---
> drivers/net/i40e/i40e_rxtx.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
>index fd1ae80da..8c0f7cc67 100644
>--- a/drivers/net/i40e/i40e_rxtx.c
>+++ b/drivers/net/i40e/i40e_rxtx.c
>@@ -1248,7 +1248,8 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
> (unsigned) txq->port_id, (unsigned) txq->queue_id,
> (unsigned) tx_id, (unsigned) nb_tx);
>
>- I40E_PCI_REG_WRITE(txq->qtx_tail, tx_id);
>+ rte_cio_wmb();
>+ I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);
> txq->tx_tail = tx_id;
>
> return nb_tx;
>--
>2.17.1
>
Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
15/02/2020 16:16, Ye Xiaolong:
> s/relaxed/relax
>
> On 02/12, Gavin Hu wrote:
> >To keep ordering of mixed accesses, rte_cio is sufficient.
> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
[...]
>
> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.

I assume it is too risky to do such an optimization post-rc3.

Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
in 20.02?
Hi, Thomas

On 02/16, Thomas Monjalon wrote:
>15/02/2020 16:16, Ye Xiaolong:
>> s/relaxed/relax
>>
>> On 02/12, Gavin Hu wrote:
>> >To keep ordering of mixed accesses, rte_cio is sufficient.
>> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
>[...]
>>
>> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
>
>I assume it is too risky to do such an optimization post-rc3.

Yes, this is a valid concern, I agree to postpone it to next release.

>
>Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
>in 20.02?

There is still some bug-fixing work going on in PRC, so I assume there
should be some fix patches after RC3. They are still allowed to be merged
to 20.02 if the fix is relatively small in terms of lines of code and scope,
right?

Thanks,
Xiaolong
16/02/2020 17:38, Ye Xiaolong:
> Hi, Thomas
>
> On 02/16, Thomas Monjalon wrote:
> >15/02/2020 16:16, Ye Xiaolong:
> >> s/relaxed/relax
> >>
> >> On 02/12, Gavin Hu wrote:
> >> >To keep ordering of mixed accesses, rte_cio is sufficient.
> >> >The rte_io barrier inside the I40E_PCI_REG_WRITE is overkill.[1]
> >[...]
> >>
> >> Applied to dpdk-next-net-intel with Jerin's Reviewed-by tag, Thanks.
> >
> >I assume it is too risky to do such an optimization post-rc3.
>
> Yes, this is a valid concern, I agree to postpone it to next release.
>
> >
> >Ferruh, Xiaolong, you don't plan any more pulls from dpdk-next-net-intel
> >in 20.02?
>
> There is still some bug-fixing work going on in PRC, so I assume there
> should be some fix patches after RC3. They are still allowed to be merged
> to 20.02 if the fix is relatively small in terms of lines of code and scope,
> right?
Right