DPDK patches and discussions
 help / color / mirror / Atom feed
From: Morten Brørup <mb@smartsharesystems.com>
To: "Jerin Jacob" <jerinjacobk@gmail.com>,
	"Thomas Monjalon" <thomas@monjalon.net>
Cc: "dpdk-dev" <dev@dpdk.org>,
	"David Marchand" <david.marchand@redhat.com>,
	"Ferruh Yigit" <ferruh.yigit@intel.com>,
	"Olivier Matz" <olivier.matz@6wind.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	"Viacheslav Ovsiienko" <viacheslavo@nvidia.com>,
	"Ajit Khaparde" <ajit.khaparde@broadcom.com>,
	"Jerin Jacob" <jerinj@marvell.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Ray Kinsella" <mdr@ashroe.eu>,
	"Neil Horman" <nhorman@tuxdriver.com>,
	"Nithin Dabilpuram" <ndabilpuram@marvell.com>,
	"Kiran Kumar K" <kirankumark@marvell.com>, <techboard@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 1/1] mbuf: move pool pointer in first half
Date: Mon, 9 Nov 2020 09:16:27 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35C61405@smartserver.smartshare.dk> (raw)
In-Reply-To: <CALBAE1NN=iF7+V5xSSCrepddmxpuyr8pqZfiZcvpg1XAo8+TJg@mail.gmail.com>

+CC techboard

> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Monday, November 9, 2020 6:18 AM
> 
> On Sun, Nov 8, 2020 at 2:03 AM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> >
> > 07/11/2020 20:05, Jerin Jacob:
> > > On Sun, Nov 8, 2020 at 12:09 AM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > > 07/11/2020 18:12, Jerin Jacob:
> > > > > On Sat, Nov 7, 2020 at 10:04 PM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > > > >
> > > > > > The mempool pointer in the mbuf struct is moved
> > > > > > from the second to the first half.
> > > > > > It should increase performance on most systems having 64-byte
> cache line,
> > > > >
> > > > > > i.e. mbuf is split in two cache lines.
> > > > >
> > > > > But In any event, Tx needs to touch the pool to freeing back to
> the
> > > > > pool upon  Tx completion. Right?
> > > > > Not able to understand the motivation for moving it to the
> first 64B cache line?
> > > > > The gain varies from driver to driver. For example, a Typical
> > > > > ARM-based NPU does not need to
> > > > > touch the pool in Rx and its been filled by HW. Whereas it
> needs to
> > > > > touch in Tx if the reference count is implemented.
> > >
> > > See below.
> > >
> > > > >
> > > > > > Due to this change, tx_offload is moved, so some vector data
> paths
> > > > > > may need to be adjusted. Note: OCTEON TX2 check is removed
> temporarily!
> > > > >
> > > > > It will be breaking the Tx path, Please just don't remove the
> static
> > > > > assert without adjusting the code.
> > > >
> > > > Of course not.
> > > > I looked at the vector Tx path of OCTEON TX2,
> > > > it's close to be impossible to understand :)
> > > > Please help!
> > >
> > > Off course. Could you check the above section any share the
> rationale
> > > for this change
> > > and where it helps and how much it helps?
> >
> > It has been concluded in the techboard meeting you were part of.
> > I don't understand why we restart this discussion again.
> > I won't have the energy to restart this process myself.
> > If you don't want to apply the techboard decision, then please
> > do the necessary to request another quick decision.
> 
> Yes. Initially, I thought it is OK as we have 128B CL, After looking
> into Thomas's change, I realized
> it is not good for ARM64 64B catchlines based NPU as
> - A Typical  ARM-based NPU does not need to touch the pool in Rx and
> its been filled by HW. Whereas it needs to
> touch in Tx if the reference count is implemented.

Jerin, I don't understand what the problem is here...

Since RX doesn't touch m->pool, it shouldn't matter for RX which cache line m->pool resides in. I get that.

You are saying that TX needs to touch m->pool if the reference count is implemented. I get that. But I don't understand why it is worse having m->pool in the first cache line than in the second cache line; can you please clarify?

> - Also it will be effecting exiting vector routines

That is unavoidable if we move something from the second to the first cache line.

It may require some rework on the vector routines, but it shouldn't be too difficult for whoever wrote these vector routines.

> 
> I request to reconsider the tech board decision.

I was on the techboard meeting as an observer (or whatever the correct term would be for non-members), and this is my impression of the decision on the meeting:

The techboard clearly decided not to move any dynamic fields in the first cache line, on the grounds that if we move them away again in a later version, DPDK users utilizing a dynamic field in the first cache line might experience a performance drop at that later time. And this will be a very bad user experience, causing grief and complaints. To me, this seemed like a firm decision, based on solid arguments.

Then the techboard discussed which other field to move to the freed up space in the first cache line. There were no performance reports showing any improvements by moving the any of the suggested fields (m->pool, m->next, m->tx_offload), and there was a performance report showing no improvements by moving m->next in a test case with large segmented packets. The techboard decided to move m->pool as originally suggested. To me, this seemed like a somewhat random choice between A, B and C, on the grounds that moving one of them is probably better than moving none of them.

The techboard made its decision based on the information available at that time.

Unfortunately, I do not have the resources to test the performance improvement by moving m->next to the first cache line instead of m->pool and utilizing the DEV_TX_OFFLOAD_MBUF_FAST_FREE flag mentioned by Konstantin.

If no new information comes to light, we cannot expect the techboard to change a decision it has already made.

In any case, I am grateful for the joint effort put into nurturing the mbuf, and especially Thomas' unrelenting hard work in this area!


Med venlig hilsen / kind regards
- Morten Brørup


  parent reply	other threads:[~2020-11-09  8:16 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-07 15:53 Thomas Monjalon
2020-11-07 17:12 ` Jerin Jacob
2020-11-07 18:39   ` Thomas Monjalon
2020-11-07 19:05     ` Jerin Jacob
2020-11-07 20:33       ` Thomas Monjalon
2020-11-09  5:18         ` Jerin Jacob
2020-11-09  8:04           ` Thomas Monjalon
2020-11-09  8:27             ` Jerin Jacob
2020-11-09  9:47               ` Bruce Richardson
2020-11-09 12:01                 ` Jerin Jacob
2020-11-09 12:59                   ` Thomas Monjalon
2020-11-09 13:35                     ` Jerin Jacob
2020-11-09 14:02                       ` Thomas Monjalon
2020-11-09 14:08                         ` Jerin Jacob
2020-11-09 14:42                           ` Thomas Monjalon
2020-11-09 14:53                             ` Jerin Jacob
2020-11-09  8:16           ` Morten Brørup [this message]
2020-11-09 10:06             ` [dpdk-dev] [dpdk-techboard] " Bruce Richardson
2020-11-09 10:21               ` Morten Brørup
2020-11-09 18:04                 ` Stephen Hemminger
2020-11-10  7:15                   ` Morten Brørup
2020-11-07 18:57 ` [dpdk-dev] " Morten Brørup
2020-11-09 10:08   ` Bruce Richardson
2020-11-09 10:30     ` Morten Brørup
2020-11-09 10:33     ` Ananyev, Konstantin
2020-11-09 10:36       ` Bruce Richardson
2020-11-09 11:24       ` Ananyev, Konstantin
2020-11-09 21:29 ` [dpdk-dev] [PATCH v2 0/2] move mbuf pool pointer Thomas Monjalon
2020-11-09 21:29   ` [dpdk-dev] [PATCH v2 1/2] drivers: disable OCTEON TX2 in 32-bit build Thomas Monjalon
2020-11-10 18:05     ` Jerin Jacob
2020-11-09 21:29   ` [dpdk-dev] [PATCH v2 2/2] mbuf: move pool pointer in first half Thomas Monjalon
2020-11-10 10:05     ` Morten Brørup
2020-11-10 10:44       ` Thomas Monjalon
2020-11-10 16:25     ` Olivier Matz
2020-11-10 18:06       ` Jerin Jacob
2020-11-12 14:39         ` Thomas Monjalon
2020-11-10 18:08       ` Stephen Hemminger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35C61405@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    --cc=kirankumark@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=mdr@ashroe.eu \
    --cc=ndabilpuram@marvell.com \
    --cc=nhorman@tuxdriver.com \
    --cc=olivier.matz@6wind.com \
    --cc=techboard@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git