From: Matan Azrad <matan@mellanox.com>
To: "Nélio Laranjeiro" <nelio.laranjeiro@6wind.com>
Cc: Ophir Munk <ophirmu@mellanox.com>,
Adrien Mazarguil <adrien.mazarguil@6wind.com>,
"dev@dpdk.org" <dev@dpdk.org>,
Thomas Monjalon <thomas@monjalon.net>,
Olga Shern <olgas@mellanox.com>,
Mordechay Haimovsky <motih@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path functions
Date: Thu, 26 Oct 2017 16:21:37 +0000
Message-ID: <HE1PR0502MB3659806FDE4AF1D103584860D2450@HE1PR0502MB3659.eurprd05.prod.outlook.com>
In-Reply-To: <20171026134424.6hww2zyc3crbe322@laranjeiro-vm>
Hi Nelio
> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> Sent: Thursday, October 26, 2017 4:44 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: Ophir Munk <ophirmu@mellanox.com>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>; dev@dpdk.org; Thomas Monjalon
> <thomas@monjalon.net>; Olga Shern <olgas@mellanox.com>; Mordechay
> Haimovsky <motih@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path functions
>
> On Thu, Oct 26, 2017 at 12:30:54PM +0000, Matan Azrad wrote:
> > Hi Nelio
> > Please see my comments below (3).
> >
> >
> > > -----Original Message-----
> > > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > > Sent: Thursday, October 26, 2017 3:12 PM
> > > To: Matan Azrad <matan@mellanox.com>
> > > Cc: Ophir Munk <ophirmu@mellanox.com>; Adrien Mazarguil
> > > <adrien.mazarguil@6wind.com>; dev@dpdk.org; Thomas Monjalon
> > > <thomas@monjalon.net>; Olga Shern <olgas@mellanox.com>;
> Mordechay
> > > Haimovsky <motih@mellanox.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path
> > > functions
> > >
> > > On Thu, Oct 26, 2017 at 10:31:06AM +0000, Matan Azrad wrote:
> > > > Hi Nelio
> > > >
> > > > I think the memory barrier discussion is not relevant for this
> > > > patch (if it will be relevant I will create new one).
> > > > Please see my comments inline.
> > >
> > > It was not my single comment. There is also useless code like
> > > having null segments in the packets which is not allowed on DPDK.
> >
> > Sorry, but I can't find comments in the previous mails.
>
> You should search in the series,
>
> > Moreover, this comment (the first time I see it) is not relevant to this
> > patch and asks for something else.
> > All this patch does is merge 2 functions to avoid checking the
> > remaining WQ space twice...
>
> Again in the series itself.
>
> The point is, this series embeds 7 patches for "performance improvement",
> whereas the single improvement is avoiding a call to an outside function by
> copy/pasting it into the PMD.
> In fact it will save a few cycles, but the improvement could have been much
> larger if it were not a bare copy/paste.
>
This simple merge improves throughput by 0.2 Mpps in my setup.
If you have further improvements for this merge (other than reducing if statements), please suggest them.
> The real question is: what is the improvement? If the improvement is
> significant, it is worth having this series; otherwise it is not, as it may
> also bring some bugs which may be resolved in the original source whereas
> this copy will remain unfixed.
>
Each commit in this series improves performance significantly, and together they brought us to our target.
By the way, I think discussion of the series as a whole belongs in patch 0 :)
> > Removing memory/compiler barriers or dealing with null segments is not
> > in scope here.
> >
> > >
> > > > Regarding this specific patch, I didn't see any comment from you,
> > > > > Do you agree with it?
> > > >
> > > > > -----Original Message-----
> > > > > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > > > > Sent: Wednesday, October 25, 2017 10:50 AM
> > > > > To: Ophir Munk <ophirmu@mellanox.com>
> > > > > Cc: Adrien Mazarguil <adrien.mazarguil@6wind.com>; dev@dpdk.org;
> > > > > Thomas Monjalon <thomas@monjalon.net>; Olga Shern
> > > > > <olgas@mellanox.com>; Matan Azrad <matan@mellanox.com>
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 4/7] net/mlx4: merge Tx path
> > > > > functions
> > > > >
> > > > > On Tue, Oct 24, 2017 at 08:36:52PM +0000, Ophir Munk wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On Tuesday, October 24, 2017 4:52 PM, Nélio Laranjeiro wrote:
> > > > > > >
> > > > > > > On Mon, Oct 23, 2017 at 02:21:57PM +0000, Ophir Munk wrote:
> > > > > > > > From: Matan Azrad <matan@mellanox.com>
> > > > > > > >
> > > > > > > > Merge tx_burst and mlx4_post_send functions to prevent
> > > > > > > > double asking about WQ remain space.
> > > > > > > >
> > > > > > > > This should improve performance.
> > > > > > > >
> > > > > > > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > > > > > > ---
> > > > > > > > drivers/net/mlx4/mlx4_rxtx.c | 353
> > > > > > > > +++++++++++++++++++++----------------------
> > > > > > > > 1 file changed, 170 insertions(+), 183 deletions(-)
> > > > > > >
> > > > > > > What are the real expectation you have on the remaining
> > > > > > > patches of the series?
> > > > > > >
> > > > > > > According to the comment of this commit log "This should
> > > > > > > improve performance" there are too many barriers at each
> > > > > > > packet/segment level to improve something.
> > > > > > >
> > > > > > > The point is, mlx4_burst_tx() should write all the WQE
> > > > > > > without any barrier as it is processing a burst of packets
> > > > > > > > (whereas Verbs functions may only process a single packet).
> > > > > >
> > > > > > > > The only barrier which should be present is the one to
> > > > > > > > ensure that all the host memory is flushed before triggering
> > > > > > > > the Tx doorbell.
> > > > > > >
> > > > > >
> > > > > > There is a known ConnectX-3 HW limitation: the first 4 bytes
> > > > > > of every TXWBB (64 bytes chunks) should be written in a
> > > > > > reversed order (from last TXWBB to first TXWBB).
> > > > >
> > > > > This means the first WQE filled by the burst function is the doorbell.
> > > > > In such situation, the first four bytes of it can be written
> > > > > before leaving the burst function and after a write memory barrier.
> > > > >
> > > > > > Until this first WQE is complete, the NIC won't start
> > > > > > processing the packets. Per-packet memory barriers become
> > > > > > useless.
> > > >
> > > > > I think this is not true, since mlx4 HW can prefetch advanced
> > > > > TXBBs if their first 4 bytes are valid even though the first WQE
> > > > > is still not valid (please read the spec).
> > >
> > > A compiler barrier is enough on x86 to forbid the CPU to re-order
> > > the instructions, on arm you need a memory barrier, there is a macro
> > > in DPDK for that, rte_io_wmb().
> > >
> > We are also using compiler barrier here.
> >
> > > > Before triggering the doorbell you must flush the cache; this is the
> > > > only place where rte_wmb() should be used.
> > >
> >
> > We are also using memory barrier only for this reason.
> >
> > > > > It gives something like:
> > > > >
> > > > > > uint32_t tx_bb_db = 0;
> > > > > > void *first_wqe = NULL;
> > > > > >
> > > > > > /*
> > > > > >  * Prepare all packets by writing the WQEs without the first 4 bytes
> > > > > >  * of the first WQE.
> > > > > >  */
> > > > > > for () {
> > > > > >         /* Remember the first WQE and its deferred first 4 bytes. */
> > > > > >         if (!first_wqe) {
> > > > > >                 first_wqe = wqe;
> > > > > >                 tx_bb_db = foo;
> > > > > >         }
> > > > > > }
> > > > > > /* Leaving. */
> > > > > > rte_wmb();
> > > > > > *(uint32_t *)first_wqe = tx_bb_db;
> > > > > > return n;
> > > > >
> > > >
> > > > I will take care to check if we can do 2 loops:
> > > > Write all last 60B per TXbb.
> > > > Memory barrier.
> > > > Write all first 4B per TXbbs.
> > > >
> > > > > > The last 60 bytes of any TXWBB can be written in any order
> > > > > > (before writing the first 4 bytes).
> > > > > > > Is your last statement (using a single barrier) in accordance
> > > > > > > with this limitation? Please explain.
> > > > > >
> > > > > > > > There are also too many cases handled which are useless in a
> > > > > > > > burst situation; this function needs to be rewritten for its
> > > > > > > > minimal use case, i.e. processing a valid burst of
> > > > > > > > packets/segments and triggering the Tx doorbell at the end of
> > > > > > > > the burst.
> > > > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > --
> > > > > Nélio Laranjeiro
> > > > > 6WIND
> > >
> > > Regards,
> > >
> > > --
> > > Nélio Laranjeiro
> > > 6WIND
>
> --
> Nélio Laranjeiro
> 6WIND
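
For reference, the two-pass idea mentioned in the quoted text (write the
last 60 bytes of every TXBB, issue one barrier, then write the first 4 bytes
from the last TXBB to the first) could be sketched roughly as below. This is
only an illustration under stated assumptions, not the mlx4 PMD code:
struct txbb, sketch_wmb() and sketch_post_txbbs() are hypothetical names,
and the C11 release fence merely stands in for the DPDK barrier macros
mentioned in the thread. Whether a single barrier per burst is sufficient
on ConnectX-3 is exactly the point the thread leaves open.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of one 64-byte TXBB: 4-byte head + 60-byte body. */
struct txbb {
	uint32_t head;    /* first 4 bytes, must become valid last */
	uint8_t body[60]; /* remaining 60 bytes, order-insensitive */
};

_Static_assert(sizeof(struct txbb) == 64, "TXBB must be 64 bytes");

/* Assumption: stand-in for the DPDK write barrier (rte_wmb()/rte_io_wmb()). */
static inline void
sketch_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Copy n prepared TXBBs from src into ring using two passes. */
static void
sketch_post_txbbs(struct txbb *ring, const struct txbb *src, size_t n)
{
	size_t i;

	assert(ring != NULL && src != NULL);
	/* Pass 1: write the last 60 bytes of every TXBB, in any order. */
	for (i = 0; i < n; i++)
		memcpy(ring[i].body, src[i].body, sizeof(ring[i].body));
	/* One barrier for the whole burst: bodies are ordered before heads. */
	sketch_wmb();
	/*
	 * Pass 2: write the first 4 bytes from the last TXBB to the first.
	 * Volatile stores keep the compiler from reordering them.
	 */
	for (i = n; i-- > 0; )
		*(volatile uint32_t *)&ring[i].head = src[i].head;
}
```

The volatile stores only constrain the compiler; on weakly ordered CPUs a
real driver may still need additional barriers between the head writes.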