patches for DPDK stable branches
From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: "Phil Yang (Arm Technology China)" <Phil.Yang@arm.com>,
	Yongseok Koh <yskoh@mellanox.com>,
	Matan Azrad <matan@mellanox.com>,
	Nélio Laranjeiro <nelio.laranjeiro@6wind.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Thomas Monjalon <thomas@monjalon.net>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	"Gavin Hu (Arm Technology China)" <Gavin.Hu@arm.com>,
	nd <nd@arm.com>, "stable@dpdk.org" <stable@dpdk.org>,
	nd <nd@arm.com>
Subject: Re: [dpdk-stable] [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on aarch64
Date: Fri, 6 Sep 2019 12:26:34 +0000
Message-ID: <AM4PR05MB32656C508AF9C617E2CB7AFFD2BA0@AM4PR05MB3265.eurprd05.prod.outlook.com>
In-Reply-To: <VE1PR08MB4640629A8C8237DAB9D3AEACE9BA0@VE1PR08MB4640.eurprd08.prod.outlook.com>

Hi, Phil

Thanks for the explanations. Please see below.

> -----Original Message-----
> From: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>
> Sent: Friday, September 6, 2019 10:20
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; Yongseok Koh
> <yskoh@mellanox.com>; Matan Azrad <matan@mellanox.com>; Nélio
> Laranjeiro <nelio.laranjeiro@6wind.com>; dev@dpdk.org
> Cc: Thomas Monjalon <thomas@monjalon.net>; jerinj@marvell.com;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm
> Technology China) <Gavin.Hu@arm.com>; nd <nd@arm.com>;
> stable@dpdk.org; nd <nd@arm.com>
> Subject: RE: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on
> aarch64
> 
> Hi, Slava
> 
> Thanks for your comments.
> 
> > -----Original Message-----
> > From: Slava Ovsiienko <viacheslavo@mellanox.com>
> > Sent: Thursday, September 5, 2019 8:12 PM
> > To: Phil Yang (Arm Technology China) <Phil.Yang@arm.com>;
> > yskoh@mellanox.com; Matan Azrad <matan@mellanox.com>; Nélio Laranjeiro
> > <nelio.laranjeiro@6wind.com>; dev@dpdk.org
> > Cc: thomas@monjalon.net; jerinj@marvell.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Gavin Hu (Arm Technology China)
> > <Gavin.Hu@arm.com>; nd <nd@arm.com>; stable@dpdk.org
> > Subject: RE: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on
> > aarch64
> >
> > Hi, Phil
> >
> > This point is in datapath and performance is very critical.
> > The rte_cio_wmb() may take a lot of CPU cycles, waiting till all
> > previous writes become visible for all external (relating to core) agents.
> > The Tx CQE doorbelling does not need any writes to other locations to
> > be completed,
> 
> In my understanding, the PMD needs to wait till all txq fields update is
> completed then ring the doorbell for HW.
> Before the Tx CQE doorbelling, it will update the producer index of work
> queue in Tx queue descriptor (at line 2037).

txq->wqe_pi is an exclusively software field, not directly related to HW.
We should not wait for writes to it to complete (assuming tx_burst() is
always called with strict affinity settings and the core can't change).

There may be some concern about the read of "last_cqe->wqe_counter"
at line 2037. The compiler barrier was added to guarantee that this
read is issued before the doorbell write.

As for possible reordering of these operations (the index read from the CQE
at line 2037 and the write to the CQ doorbell register at line 2046):

a) The read is performed from an already cached area (we touched this CQE
very recently while performing the ownership check), so it is quite unlikely
to complete after the doorbell write.

b) The only risk of reading wrong data is the HW overwriting the CQE on
CQ buffer overflow. We create the CQ ring buffer with some extra space,
so completions which are "in-flight" can't overwrite the CQE being read.

A new completion request may be issued by setting flags in WQE descriptors,
followed by an SQ doorbell write, which is already preceded by a wmb
(in mlx5_tx_dbrec_cond_wmb(), line 4733). So it seems there is no chance
for the CQE to be overwritten.

> The compiler barrier cannot guarantee the ordering of these operations. So
> use the explicit HW fence to achieve that.
> 
> As same as the HW Tx doorbell in vectorized Tx burst routine, it uses a write
> memory barrier to enforce the register update visible to HW immediately.
> Section 32.5.2 in
> https://doc.dpdk.org/guides/nics/mlx5.html

This is quite a different case. The PMD builds descriptors (WQEs) in memory and
must guarantee that this data is visible to external agents before ringing the
SQ (send queue, not completion queue) doorbell. There are no vectorized Tx
routines now (since 19.08), but, of course, we still have the "true" write
memory barrier (in mlx5_tx_dbrec_cond_wmb) for this case.

> 
> > the only concern is not to reorder/merge the writes to the same
> > doorbell register of the same sending queue in the tx_burst() internal
> > sending loop/subsequent calls.
> >
> > As far as I know - the writes to the same location should not be
> > reordered by any arch (may be merged if memory settings allow this, it
> > is not critical for CQE doorbell), could you, please, explain why we
> > need explicit hardware fence before CQE doorbell update? Do you think
> > doorbell write might be rearranged with previously reads from the ring
> > buffer?
> >
> > WBR,
> > Slava
> >
> > > -----Original Message-----
> > > From: Phil Yang <phil.yang@arm.com>
> > > Sent: Thursday, September 5, 2019 13:55
> > > To: Yongseok Koh <yskoh@mellanox.com>; Slava Ovsiienko
> > > <viacheslavo@mellanox.com>; Matan Azrad <matan@mellanox.com>; Nélio
> > > Laranjeiro <nelio.laranjeiro@6wind.com>; dev@dpdk.org
> > > Cc: Thomas Monjalon <thomas@monjalon.net>; jerinj@marvell.com;
> > > Honnappa.Nagarahalli@arm.com; gavin.hu@arm.com; nd@arm.com;
> > > stable@dpdk.org
> > > Subject: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on
> > > aarch64
> > >
> > > For the weaker memory model processors, the compiler barrier is not
> > > sufficient to guarantee the coherent memory update be observed by
> > > I/O device. It needs the coherent I/O memory barrier to enforce the
> > > ordering of Tx completion queue doorbell operation.
> > >
> > > Fixes: da1df1ccabad ("net/mlx5: fix completion queue drain loop")
> > > Cc: stable@dpdk.org
> > >
> > > Suggested-by: Gavin Hu <gavin.hu@arm.com>
> > > Signed-off-by: Phil Yang <phil.yang@arm.com>
> > > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > > ---
> > >  drivers/net/mlx5/mlx5_rxtx.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > > index 4c01187..c11148b 100644
> > > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > > @@ -2042,7 +2042,7 @@ mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
> > >  	} else {
> > >  		return;
> > >  	}
> > > -	rte_compiler_barrier();
> > > +	rte_cio_wmb();
> > >  	*txq->cq_db = rte_cpu_to_be_32(txq->cq_ci);
> > >  	if (likely(tail != txq->elts_tail)) {
> > >  		mlx5_tx_free_elts(txq, tail, olx);
> > > --
> > > 2.7.4


Thread overview: 10+ messages
2019-09-05 10:55 [dpdk-stable] [PATCH 1/2] net/mlx5: fix Rx " Phil Yang
2019-09-05 10:55 ` [dpdk-stable] [PATCH 2/2] net/mlx5: fix Tx " Phil Yang
2019-09-05 12:12   ` Slava Ovsiienko
2019-09-06  7:20     ` Phil Yang (Arm Technology China)
2019-09-06 12:26       ` Slava Ovsiienko [this message]
2019-09-09 10:12         ` Phil Yang (Arm Technology China)
2019-09-09 11:29           ` Slava Ovsiienko
2019-09-10  9:22             ` Phil Yang (Arm Technology China)
2019-09-10  7:22 ` [dpdk-stable] [PATCH 1/2] net/mlx5: fix Rx " Matan Azrad
2019-09-12  8:29 ` [dpdk-stable] [dpdk-dev] " Raslan Darawsheh
