DPDK patches and discussions
From: "didier.pallard" <didier.pallard@6wind.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH 1/2] mem: add write memory barrier before changing heap state
Date: Wed, 16 Apr 2014 10:55:07 +0200
Message-ID: <534E456B.8080909@6wind.com>
In-Reply-To: <59AF69C657FD0841A61C55336867B5B01A9FCCFF@IRSMSX103.ger.corp.intel.com>

On 04/15/2014 04:08 PM, Richardson, Bruce wrote:
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of David Marchand
>> Sent: Tuesday, April 15, 2014 2:51 PM
>> To: dev@dpdk.org
>> Subject: [dpdk-dev] [PATCH 1/2] mem: add write memory barrier before
>> changing heap state
>>
>> From: Didier Pallard <didier.pallard@6wind.com>
>>
>> a write memory barrier is needed before changing heap state value, else some
>> concurrent core may see state changing before all initialization values are
>> written to memory, causing unpredictable results in malloc function.
>>
>> Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
> No barrier should be necessary here. As in a number of other places, such as rings, compiler barriers can be used in place of write memory barriers, due to IA ordering rules. However, in this case, both variables referenced are volatile variables and so the assignments to them cannot be reordered by the compiler so no compiler barrier is necessary either.
>
> Regards,
> /Bruce

Hi Bruce,

Actually, a compiler barrier is definitely needed here. From the
compiler's point of view, a volatile access is not a serializing
operation: it is ordered only with respect to other volatile accesses,
so ordinary stores can still be moved across it. Only the atomic
operations are serializing, because they are implemented with asm
volatile and a "memory" clobber.
Here is the assembly generated with and without rte_wmb:


With rte_wmb

  142:   f0 45 0f b1 07          lock cmpxchg %r8d,(%r15)
  147:   0f 94 c0                sete   %al
  14a:   84 c0                   test   %al,%al
  14c:   74 ea                   je     138 <malloc_heap_alloc+0x68>
  14e:   49 c7 47 10 00 00 00    movq   $0x0,0x10(%r15)
  155:   00
  156:   41 c7 47 18 00 00 00    movl   $0x0,0x18(%r15)
  15d:   00
  15e:   41 c7 47 08 00 00 00    movl   $0x0,0x8(%r15)
  165:   00
  166:   41 c7 47 1c 00 00 00    movl   $0x0,0x1c(%r15)
  16d:   00
  16e:   49 c7 47 20 00 00 00    movq   $0x0,0x20(%r15)
  175:   00
  176:   45 89 57 04             mov    %r10d,0x4(%r15)
  17a:   0f ae f8                sfence
  17d:   41 c7 07 02 00 00 00    movl   $0x2,(%r15)       <-- heap->initialised = INITIALISED
  184:   41 8b 37                mov    (%r15),%esi       <--
  187:   83 fe 02                cmp    $0x2,%esi
  18a:   75 b4                   jne    140 <malloc_heap_alloc+0x70>
  18c:   0f 1f 40 00             nopl   0x0(%rax)
  190:   48 83 c3 3f             add    $0x3f,%rbx


Without rte_wmb

  142:   f0 45 0f b1 07          lock cmpxchg %r8d,(%r15)
  147:   0f 94 c0                sete   %al
  14a:   84 c0                   test   %al,%al
  14c:   74 ea                   je     138 <malloc_heap_alloc+0x68>
  14e:   49 c7 47 10 00 00 00    movq   $0x0,0x10(%r15)
  155:   00
  156:   41 c7 47 08 00 00 00    movl   $0x0,0x8(%r15)
  15d:   00
  15e:   41 c7 07 02 00 00 00    movl   $0x2,(%r15)       <-- heap->initialised = INITIALISED
  165:   41 8b 37                mov    (%r15),%esi       <--
  168:   41 c7 47 18 00 00 00    movl   $0x0,0x18(%r15)
  16f:   00
  170:   41 c7 47 1c 00 00 00    movl   $0x0,0x1c(%r15)
  177:   00
  178:   49 c7 47 20 00 00 00    movq   $0x0,0x20(%r15)
  17f:   00
  180:   45 89 57 04             mov    %r10d,0x4(%r15)
  184:   83 fe 02                cmp    $0x2,%esi
  187:   75 b7                   jne    140 <malloc_heap_alloc+0x70>
  189:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  190:   48 83 c3 3f             add    $0x3f,%rbx

It's clear that the *heap->initialised = INITIALISED;* store has been
reordered by the compiler: without rte_wmb it is emitted at 15e, ahead
of the stores to 0x18, 0x1c, 0x20 and 0x4(%r15).
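
To make the point concrete, here is a minimal sketch of the pattern,
assuming GCC or Clang. All names (demo_heap, demo_compiler_barrier,
demo_heap_init) are hypothetical and only mirror the shape of the heap
initialisation; this is not the actual malloc_heap code.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not the DPDK definitions. */
enum demo_state {
	DEMO_NOT_INITIALISED = 0,
	DEMO_INITIALISING = 1,
	DEMO_INITIALISED = 2
};

struct demo_heap {
	volatile uint32_t initialised; /* state flag polled by other cores */
	uint32_t mz_count;             /* ordinary, non-volatile fields */
	uint32_t alloc_count;
	uint64_t total_size;
};

/* GCC/Clang compiler barrier: emits no instruction, but the "memory"
 * clobber forbids the compiler from moving memory accesses across it. */
#define demo_compiler_barrier() __asm__ __volatile__("" ::: "memory")

static void
demo_heap_init(struct demo_heap *heap)
{
	/* Plain stores: the compiler is free to reorder these relative
	 * to the volatile store below unless something prevents it. */
	heap->mz_count = 0;
	heap->alloc_count = 0;
	heap->total_size = 0;

	/* Keeps the stores above from sinking below the state change. */
	demo_compiler_barrier();

	/* Publish: other cores polling the flag may now use the heap. */
	heap->initialised = DEMO_INITIALISED;
}
```

The volatile qualifier on the flag alone would not give this guarantee,
since the other fields are not volatile.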

About the use of rte_wmb and rte_rmb, I agree with you that on Intel
architecture those macros should do nothing more than a compiler
barrier, given the IA memory ordering rules.
But for code completeness, I think those memory barriers should remain
in place in the code, and rte_*mb should map to a compiler barrier on
Intel architecture.
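
Something along these lines is what I have in mind. It is only a sketch
under the assumption of GCC or Clang; the demo_* names are hypothetical
and these are not the current rte_* definitions.

```c
#include <stdint.h>

/* Hypothetical macros for illustration; not the DPDK headers. */
#define demo_compiler_barrier() __asm__ __volatile__("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
/* On x86, for ordinary write-back memory, stores are not reordered
 * with other stores and loads are not reordered with other loads,
 * so preventing compiler reordering is sufficient. */
#define demo_wmb() demo_compiler_barrier()
#define demo_rmb() demo_compiler_barrier()
#else
/* On other architectures, fall back to a full hardware fence
 * (GCC/Clang builtin). */
#define demo_wmb() __sync_synchronize()
#define demo_rmb() __sync_synchronize()
#endif

/* The call site stays the same on every architecture: write the
 * payload, issue the write barrier, then set the flag. */
static void
demo_publish(uint32_t *payload, volatile uint32_t *ready, uint32_t value)
{
	*payload = value;
	demo_wmb();
	*ready = 1;
}
```

This way the barrier stays visible at the call site, and each
architecture decides how much it costs.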

didier
