DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Pradeep Satyanarayana" <pradeep@us.ibm.com>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: Chao Zhu <chaozhu@linux.vnet.ibm.com>,
	Dekel Peled <dekelp@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>,
	David Christensen <drc@ibm.com>, Ori Kam <orika@mellanox.com>,
	"stable@dpdk.org" <stable@dpdk.org>,
	Thomas Monjalon <thomas@monjalon.net>,
	David Wilder <wilder@us.ibm.com>,
	Yongseok Koh <yskoh@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH] eal/ppc: remove fix of memory barrier for IBM POWER
Date: Thu, 21 Mar 2019 17:40:44 -0800	[thread overview]
Message-ID: <OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com> (raw)
In-Reply-To: <AM0PR0502MB3795291036D9019239391AF4C3420@AM0PR0502MB3795.eurprd05.prod.outlook.com>



Shahaf Shuler <shahafs@mellanox.com> wrote on 03/21/2019 01:49:39 AM:

> From: Shahaf Shuler <shahafs@mellanox.com>
> To: "pradeep@us.ibm.com" <pradeep@us.ibm.com>, Thomas Monjalon
> <thomas@monjalon.net>
> Cc: Chao Zhu <chaozhu@linux.vnet.ibm.com>, Dekel Peled
> <dekelp@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>, Ori Kam
> <orika@mellanox.com>, "stable@dpdk.org" <stable@dpdk.org>, Yongseok
> Koh <yskoh@mellanox.com>, David Christensen <drc@ibm.com>, David
> Wilder <wilder@us.ibm.com>
> Date: 03/21/2019 01:54 AM
> Subject: RE: [PATCH] eal/ppc: remove fix of memory barrier for IBM POWER
>
> Pradeep Satyanarayana <pradeep@us.ibm.com> wrote on Thu 3/21/2019 12:41
AM:
> >Thomas Monjalon <thomas@monjalon.net> wrote on 03/19/2019 01:45:01 PM:
> >
> >> From: Thomas Monjalon <thomas@monjalon.net>
> >> To: Shahaf Shuler <shahafs@mellanox.com>
> >> Cc: Dekel Peled <dekelp@mellanox.com>, Chao Zhu
> >> <chaozhu@linux.vnet.ibm.com>, Yongseok Koh <yskoh@mellanox.com>,
> >> "dev@dpdk.org" <dev@dpdk.org>, Ori Kam <orika@mellanox.com>,
> >> "stable@dpdk.org" <stable@dpdk.org>, pradeep@us.ibm.com
> >> Date: 03/19/2019 01:45 PM
> >> Subject: Re: [PATCH] eal/ppc: remove fix of memory barrier for IBM
POWER
> >>
>
> [...]
>
> >> >
> >> > So far, when not running on power, we used the rte_wmb for that.
> >> On x86 and ARM systems it provided the needed guarantees.
> >> > It is also mentioned in the barrier doxygen on ARM arch:
> >> > "
> >> > Write memory barrier.
> >> >
> >> > Guarantees that the STORE operations generated before the barrier
> >> > occur before the STORE operations generated after.
> >> > "
> >> >
> >> > It doesn't restrict to store to system memory only.
> >> > w/ power is on somewhat different and in fact rte_mb is required.
> >> It obviously miss the point of those barrier if we will need to use
> >> a different barrier based on the system arch.
> >> >
> >> > We need to align the definition of the different barriers in DPDK:
> >> > 1. need a clear documentation of each. this should be global and
> >> not part of the specific implementation on each arch.
> >
> >A single approach may not work for all architectures. Power is different
> >from others, so we need to be able to accommodate that. More comments
below.
>
> it don't get this claim.
> It is ok to have some differences between the different arch, but
> here you implement a well-defined barrier - rte_wmb.
> if you see a need we can discuss to define a **new** barrier which
> sync STORE only to system memory, and will be able to utilize the
> lwsync command.
>
> >
> >>
> >> The global definition is in lib/librte_eal/common/include/
> generic/rte_atomic.h
> >>
> >> There are some copy/paste in Arm32 and PPC that I will remove.
> >>
> >> > 2. either modify ppc rte_wmb to match ARM and x86 ones or to
> >> define a new type of barrier which will sync between both I/O and
> >> stores to systems memory.
> >>
> >> The basic memory barrier of DPDK does not mention
> >> a difference between I/O and system memory.
> >
> >In the case of Power, sync will cater to both I/O and system
> memory. However, that
> >is too big a hammer in all cases.
>
> rte_wmb requires such sync. you propose to have the wrong barrier in
> favor of performance.
> to mitigate this you can take my suggestion above and define a new,
> more lightweight one.
>
> >
> >> It is not explicit (yet) but I assume it is protecting both.
> >> So, in my opinion, we need to make it explicit in the doc,
> >> and fix the PPC implementation to comply with this definition.
> >>
> >> Anyway, I don't see any significant effort from IBM to move from
> >> the alpha support stage to a real Open Source support.
> >> PS: sending a mail every two months, to promise improvements, is
> not enough!
> >
>
> […]
>
> >
> >We should retain lwsync, should not be removed as discussed in here:
> >
> >http://mails.dpdk.org/archives/dev/2019-March/126746.html
>
> i don't agree.
> it is very clear the rte_wmb implementation in power is broken and
> we need to fix this right away before other customers will hit the
> same issue.


In the DPDK source I see a couple of different classes of memory barriers.
I am
not clear on the usage of these in the drivers, but I would think the
guidelines
to be as shown below (for Power):

- rte_[rw]mb (general memory barrier) --> should be lwsync
- rte_smp_[rw]mb (SMP memory barrier) -->should be lwsync
- rte_io_[rw]mb (I/O memory barrier)  --> should be sync
- rte_cio_[rw]mb (coherent I/O memory barrier) -->should be sync

lwsync is appropriate for cases where CPUs are accessing cacheable memory
(i.e. Memory Coherence Required) while the sync instruction should be used
in all other cases.

With the patch in:
http://mails.dpdk.org/archives/dev/2019-March/126746.html

It converts even the rte_smp_[rw]mb into sync. That is not what the
rte_smp*() should
be implementing as per the guidelines above.

static __rte_always_inline void
mlx5_tx_dbrec_cond_wmb(struct mlx5_txq_data *txq, volatile struct mlx5_wqe
*wqe,
                       int cond)
{
        uint64_t *dst = (uint64_t *)((uintptr_t)txq->bf_reg);
        volatile uint64_t *src = ((volatile uint64_t *)wqe);

        rte_cio_wmb(); --> would rte_cio_rmb() be more appropriate?
        *txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
        /* Ensure ordering between DB record and BF copy. */
        rte_wmb(); --> what has been established is that for Power we need
"sync" instead of lwsync
		We are dealing with device memory -should we be using an
rte_io_wmb() here?

        mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
        if (cond)
                rte_wmb(); --> what about here? Are there conditions when
we need sync?
}


Thanks
Pradeep
pradeep@us.ibm.com

  parent reply	other threads:[~2019-03-22  1:40 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-18 12:58 Dekel Peled
2019-03-18 12:58 ` Dekel Peled
2019-03-19  3:24 ` Chao Zhu
2019-03-19  3:24   ` Chao Zhu
2019-03-19 10:05   ` Dekel Peled
2019-03-19 10:05     ` Dekel Peled
2019-03-19 11:14     ` Thomas Monjalon
2019-03-19 11:14       ` Thomas Monjalon
2019-03-19 19:42       ` Shahaf Shuler
2019-03-19 19:42         ` Shahaf Shuler
2019-03-19 20:45         ` Thomas Monjalon
2019-03-19 20:45           ` Thomas Monjalon
2019-03-20 22:40           ` Pradeep Satyanarayana
2019-03-20 22:40             ` Pradeep Satyanarayana
2019-03-21  8:49             ` Shahaf Shuler
2019-03-21  8:49               ` Shahaf Shuler
2019-03-22  1:40               ` Pradeep Satyanarayana [this message]
2019-03-22  1:40                 ` Pradeep Satyanarayana
2019-03-22  8:49                 ` Thomas Monjalon
2019-03-22  8:49                   ` Thomas Monjalon
2019-03-22 15:30                   ` Pradeep Satyanarayana
2019-03-22 15:30                     ` Pradeep Satyanarayana
2019-03-22 17:51                     ` Thomas Monjalon
2019-03-22 17:51                       ` Thomas Monjalon
2019-03-22 22:57                       ` Pradeep Satyanarayana
2019-03-22 22:57                         ` Pradeep Satyanarayana
2019-03-24  6:37                         ` Shahaf Shuler
2019-03-24  6:37                           ` Shahaf Shuler
2019-03-24 17:37                           ` Pradeep Satyanarayana
2019-03-24 17:37                             ` Pradeep Satyanarayana
2019-03-26  9:15                             ` Dekel Peled
2019-03-26  9:15                               ` Dekel Peled
2019-03-27  9:19                               ` Thomas Monjalon
2019-03-27  9:19                                 ` Thomas Monjalon
2019-03-27 23:50                                 ` Pradeep Satyanarayana
2019-03-27 23:50                                   ` Pradeep Satyanarayana
     [not found]                                 ` <OF456B0ECC.006EF7E7-ON882583CA.00827A75-882583CA.0082F7BE@LocalDomain>
2019-03-28 17:51                                   ` Pradeep Satyanarayana
2019-03-28 17:51                                     ` Pradeep Satyanarayana
2019-03-28 17:56                                     ` Thomas Monjalon
2019-03-28 17:56                                       ` Thomas Monjalon
2019-03-28 22:50 ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
2019-03-28 22:50   ` Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=OF1F3C40D3.A212732F-ON882583C4.007F1C22-882583C5.0009390F@notes.na.collabserv.com \
    --to=pradeep@us.ibm.com \
    --cc=chaozhu@linux.vnet.ibm.com \
    --cc=dekelp@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=drc@ibm.com \
    --cc=orika@mellanox.com \
    --cc=shahafs@mellanox.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=wilder@us.ibm.com \
    --cc=yskoh@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).