DPDK patches and discussions
From: "Nélio Laranjeiro" <nelio.laranjeiro@6wind.com>
To: Matan Azrad <matan@mellanox.com>
Cc: Ophir Munk <ophirmu@mellanox.com>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Thomas Monjalon <thomas@monjalon.net>,
	Olga Shern <olgas@mellanox.com>,
	Mordechay Haimovsky <motih@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path functions
Date: Thu, 26 Oct 2017 14:12:19 +0200
Message-ID: <20171026121219.ke3dz7hv4a5zfpih@laranjeiro-vm>
In-Reply-To: <HE1PR0502MB365998C9ABE7E943F60382CBD2450@HE1PR0502MB3659.eurprd05.prod.outlook.com>

On Thu, Oct 26, 2017 at 10:31:06AM +0000, Matan Azrad wrote:
> Hi Nelio
> 
> I think the memory barrier discussion is not relevant for this patch
> (if it becomes relevant I will create a new one).
> Please see my comments inline.

It was not my only comment.  There is also useless code, such as handling
null segments in packets, which DPDK does not allow.

> Regarding this specific patch, I didn't see any comment from you. Do
> you agree with it?
>  
> > -----Original Message-----
> > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > Sent: Wednesday, October 25, 2017 10:50 AM
> > To: Ophir Munk <ophirmu@mellanox.com>
> > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org;
> > Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > <olgas@mellanox.com>; Matan Azrad <matan@mellanox.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path functions
> > 
> > On Tue, Oct 24, 2017 at 08:36:52PM +0000, Ophir Munk wrote:
> > > Hi,
> > >
> > > On Tuesday, October 24, 2017 4:52 PM, Nélio Laranjeiro wrote:
> > > >
> > > > On Mon, Oct 23, 2017 at 02:21:57PM +0000, Ophir Munk wrote:
> > > > > From: Matan Azrad <matan@mellanox.com>
> > > > >
> > > > > > Merge tx_burst and mlx4_post_send functions to avoid checking
> > > > > > the remaining WQ space twice.
> > > > >
> > > > > This should improve performance.
> > > > >
> > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > ---
> > > > >  drivers/net/mlx4/mlx4_rxtx.c | 353
> > > > > +++++++++++++++++++++----------------------
> > > > >  1 file changed, 170 insertions(+), 183 deletions(-)
> > > >
> > > > > What are the real expectations you have for the remaining patches
> > > > > of the series?
> > > >
> > > > > The commit log says "This should improve performance", but there
> > > > > are too many barriers at each packet/segment level to improve
> > > > > anything.
> > > >
> > > > > The point is, mlx4_burst_tx() should write all the WQEs without any
> > > > > barrier as it is processing a burst of packets (whereas Verbs
> > > > > functions may only process a single packet).
> > >
> > > > > The only barrier which should be present is the one to ensure that
> > > > > all the host memory is flushed before triggering the Tx doorbell.
> > > >
> > >
> > > There is a known ConnectX-3 HW limitation: the first 4 bytes of every
> > > TXWBB (64-byte chunk) should be written in reverse order (from last
> > > TXWBB to first TXWBB).
> > 
> > This means the first WQE filled by the burst function is the doorbell.
> > In such a situation, its first four bytes can be written just before
> > leaving the burst function, after a write memory barrier.
> > 
> > Until this first WQE is complete, the NIC won't start processing the
> > packets.  Per-packet memory barriers become useless.
> 
> I think this is not true, since mlx4 HW can prefetch subsequent TXbbs if their
> first 4 bytes are valid even though the first WQE is still not valid (please
> read the spec).

On x86 a compiler barrier is enough to prevent the stores from being
re-ordered; on ARM you need a memory barrier.  DPDK has a macro for
that, rte_io_wmb().

Before triggering the doorbell you must flush the cache; this is the only
place where rte_wmb() should be used.
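
To illustrate the barrier placement described above, here is a minimal
sketch in plain C11.  It is not the mlx4 driver code: the sketch_* names,
the two-word WQE layout and the ring size are all hypothetical, and the
C11 fences only stand in for rte_io_wmb() and rte_wmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 16u

/* Store ordering: compiler-only on x86, a real fence elsewhere. */
#if defined(__x86_64__) || defined(__i386__)
#define sketch_io_wmb() atomic_signal_fence(memory_order_seq_cst)
#else
#define sketch_io_wmb() atomic_thread_fence(memory_order_release)
#endif
/* Full barrier, used once per burst before the doorbell. */
#define sketch_wmb() atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical WQE: one control word the device polls, one body word. */
struct sketch_wqe {
	uint32_t ctrl;
	uint32_t body;
};

struct sketch_txq {
	struct sketch_wqe wqe[SKETCH_RING_SIZE]; /* pretend WQE ring */
	volatile uint32_t doorbell;              /* pretend doorbell register */
};

static unsigned int
sketch_tx_burst(struct sketch_txq *txq, const uint32_t *pkts, unsigned int n)
{
	unsigned int i;

	assert(n <= SKETCH_RING_SIZE);
	if (n == 0)
		return 0;
	for (i = 0; i != n; ++i) {
		txq->wqe[i].body = pkts[i];
		/* Keep the body store ahead of the control word store. */
		sketch_io_wmb();
		txq->wqe[i].ctrl = 1;
	}
	/* The single full barrier of the burst. */
	sketch_wmb();
	txq->doorbell = n;
	return n;
}
```

The cheap ordering barrier sits between the two stores of each WQE, and
the full barrier is issued only once, right before the doorbell write.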

> > It gives something like:
> > 
> >  uint32_t tx_bb_db = 0;
> >  void *first_wqe = NULL;
> > 
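> >  /*
> >   * Prepare all packets by writing the WQEs without the first 4 bytes of
> >   * the first WQE.
> >   */
> >  for (i = 0; i != n; ++i) {
> >  	/* wqe points to the WQE being filled for packet i. */
> >  	if (!first_wqe) {
> >  		first_wqe = wqe;
> >  		tx_bb_db = foo;
> >  	}
> >  }
> >  /* Leaving: flush the WQEs, then validate the first one. */
> >  rte_wmb();
> >  *(uint32_t *)first_wqe = tx_bb_db;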
> >  return n;
> >
> 
> I will check whether we can do 2 loops:
> Write the last 60B of every TXbb.
> Memory barrier.
> Write the first 4B of every TXbb.
> 
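
The two-loop proposal above could look roughly like the following sketch.
It assumes a flat byte buffer as the ring and a prepared source image of
the TXBBs; txbb_write_two_pass() and two_pass_io_wmb() are hypothetical
names, and the C11 fence only stands in for rte_io_wmb().  The second
loop runs from the last TXBB to the first, following the ConnectX-3
limitation quoted in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TXBB_SIZE 64u /* chunk size quoted in the thread */
#define TXBB_HEAD 4u  /* leading bytes that mark a TXBB as valid */

/* Hypothetical stand-in for rte_io_wmb(): a C11 release fence. */
static inline void
two_pass_io_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/*
 * Copy n TXBBs from src into ring in two passes.
 * Both buffers must hold n * TXBB_SIZE bytes.
 */
static void
txbb_write_two_pass(uint8_t *ring, const uint8_t *src, size_t n)
{
	size_t i;

	assert(ring != NULL && src != NULL);
	/* Pass 1: the trailing 60 bytes of every TXBB, in any order. */
	for (i = 0; i < n; i++)
		memcpy(ring + i * TXBB_SIZE + TXBB_HEAD,
		       src + i * TXBB_SIZE + TXBB_HEAD,
		       TXBB_SIZE - TXBB_HEAD);
	/* All bodies must be written before any TXBB is marked valid. */
	two_pass_io_wmb();
	/* Pass 2: the leading 4 bytes, from the last TXBB to the first. */
	for (i = n; i-- > 0; )
		memcpy(ring + i * TXBB_SIZE, src + i * TXBB_SIZE, TXBB_HEAD);
}
```

With this layout a burst costs one barrier regardless of how many TXBBs
it spans; whether that is acceptable for the real hardware is exactly the
question still open in this thread.
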
> > > The last 60 bytes of any TXWBB can be written in any order (before
> > > writing the first 4 bytes).
> > > Is your last statement (using a single barrier) in accordance with
> > > this limitation? Please explain.
> > >
> > > > There are also too many cases handled which are useless in a burst
> > > > situation; this function needs to be re-written for its minimal use
> > > > case, i.e. processing a valid burst of packets/segments and
> > > > triggering the Tx doorbell at the end of the burst.
> > > >
> > 
> > Regards,
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND

Regards,

-- 
Nélio Laranjeiro
6WIND

Thread overview: 84+ messages
     [not found] <1508752838-30408-1-git-send-email-ophirmu@mellanox.com>
2017-10-23 14:21 ` [dpdk-dev] [PATCH v2 0/7] net/mlx4: follow-up on new TX datapath introduced in RC1 Ophir Munk
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 1/7] net/mlx4: remove error flows from Tx fast path Ophir Munk
2017-10-25 16:49     ` Adrien Mazarguil
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 2/7] net/mlx4: inline more Tx functions Ophir Munk
2017-10-25 16:49     ` Adrien Mazarguil
2017-10-25 21:42       ` Ophir Munk
2017-10-26  7:48         ` Adrien Mazarguil
2017-10-26 14:27           ` Ophir Munk
2017-10-29 19:30             ` Ophir Munk
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 3/7] net/mlx4: save lkey in big-endian format Ophir Munk
2017-10-23 15:24     ` Nélio Laranjeiro
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path functions Ophir Munk
2017-10-24 13:51     ` Nélio Laranjeiro
2017-10-24 20:36       ` Ophir Munk
2017-10-25  7:50         ` Nélio Laranjeiro
2017-10-26 10:31           ` Matan Azrad
2017-10-26 12:12             ` Nélio Laranjeiro [this message]
2017-10-26 12:30               ` Matan Azrad
2017-10-26 13:44                 ` Nélio Laranjeiro
2017-10-26 16:21                   ` Matan Azrad
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 5/7] net/mlx4: remove unnecessary variables in Tx burst Ophir Munk
2017-10-25 16:49     ` Adrien Mazarguil
2017-10-23 14:21   ` [dpdk-dev] [PATCH v2 6/7] net/mlx4: improve performance of one Tx segment Ophir Munk
2017-10-25 16:50     ` Adrien Mazarguil
2017-10-23 14:22   ` [dpdk-dev] [PATCH v2 7/7] net/mlx4: separate Tx for multi-segments Ophir Munk
2017-10-25 16:50     ` Adrien Mazarguil
2017-10-30  8:15       ` Ophir Munk
2017-10-30 10:07   ` [dpdk-dev] [PATCH v3 0/7] Tx path improvements Matan Azrad
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 1/7] net/mlx4: remove error flows from Tx fast path Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-30 18:11         ` Matan Azrad
2017-10-31 10:16           ` Adrien Mazarguil
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 2/7] net/mlx4: associate MR to MP in a short function Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-31 13:25         ` Ophir Munk
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 3/7] net/mlx4: merge Tx path functions Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-30 18:12         ` Matan Azrad
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 4/7] net/mlx4: remove completion counter in Tx burst Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 5/7] net/mlx4: separate Tx segment cases Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-30 18:23         ` Matan Azrad
2017-10-31 10:17           ` Adrien Mazarguil
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 6/7] net/mlx4: mitigate Tx path memory barriers Matan Azrad
2017-10-30 14:23       ` Adrien Mazarguil
2017-10-30 19:47         ` Matan Azrad
2017-10-31 10:17           ` Adrien Mazarguil
2017-10-31 11:35             ` Matan Azrad
2017-10-31 13:21               ` Adrien Mazarguil
2017-10-30 10:07     ` [dpdk-dev] [PATCH v3 7/7] net/mlx4: remove empty Tx segment support Matan Azrad
2017-10-30 14:24       ` Adrien Mazarguil
2017-10-31 18:21     ` [dpdk-dev] [PATCH v4 0/8] net/mlx4: Tx path improvements Matan Azrad
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 1/8] net/mlx4: remove error flows from Tx fast path Matan Azrad
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 2/8] net/mlx4: associate MR to MP in a short function Matan Azrad
2017-11-02 13:42         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 3/8] net/mlx4: fix ring wraparound compiler hint Matan Azrad
2017-11-02 13:42         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 4/8] net/mlx4: merge Tx path functions Matan Azrad
2017-11-02 13:42         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 5/8] net/mlx4: remove duplicate handling in Tx burst Matan Azrad
2017-11-02 13:42         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 6/8] net/mlx4: separate Tx segment cases Matan Azrad
2017-11-02 13:43         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 7/8] net/mlx4: fix HW memory optimizations careless Matan Azrad
2017-11-02 13:43         ` Adrien Mazarguil
2017-10-31 18:21       ` [dpdk-dev] [PATCH v4 8/8] net/mlx4: mitigate Tx path memory barriers Matan Azrad
2017-11-02 13:43         ` Adrien Mazarguil
2017-11-02 13:41       ` [dpdk-dev] [PATCH] net/mlx4: fix missing include Adrien Mazarguil
2017-11-02 20:35         ` Ferruh Yigit
2017-11-02 16:42     ` [dpdk-dev] [PATCH v5 0/8] net/mlx4: Tx path improvements Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 1/8] net/mlx4: remove error flows from Tx fast path Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 2/8] net/mlx4: associate MR to MP in a short function Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 3/8] net/mlx4: fix ring wraparound compiler hint Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 4/8] net/mlx4: merge Tx path functions Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 5/8] net/mlx4: remove duplicate handling in Tx burst Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 6/8] net/mlx4: separate Tx segment cases Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 7/8] net/mlx4: fix HW memory optimizations careless Matan Azrad
2017-11-02 16:42       ` [dpdk-dev] [PATCH v5 8/8] net/mlx4: mitigate Tx path memory barriers Matan Azrad
2017-11-02 17:07       ` [dpdk-dev] [PATCH v5 0/8] net/mlx4: Tx path improvements Adrien Mazarguil
2017-11-02 20:35         ` Ferruh Yigit
2017-11-02 20:41       ` Ferruh Yigit
2017-11-03  9:48         ` Adrien Mazarguil
2017-11-03 19:25       ` Ferruh Yigit
