From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <hejianet@gmail.com>
Received: from mail-pf0-f193.google.com (mail-pf0-f193.google.com
 [209.85.192.193]) by dpdk.org (Postfix) with ESMTP id 6A7581B62A
 for <dev@dpdk.org>; Fri, 13 Oct 2017 05:23:43 +0200 (CEST)
Received: by mail-pf0-f193.google.com with SMTP id z11so7932741pfk.4
 for <dev@dpdk.org>; Thu, 12 Oct 2017 20:23:43 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
 h=subject:to:cc:references:from:message-id:date:user-agent
 :mime-version:in-reply-to;
 bh=bjruzpqCiyszecfUKW3rRsOCDdqWnWDOGJLSYvk4F3k=;
 b=BpjKByJiW1f+8qro6EpEMqZmcY3FM0bwdUhQYbmj2XSqdsU2kABd1z4g/hZG4HpR8z
 jm4KZarXdC627D2ZY4FOhJiDHnpSIKXN960rKlTl7Hpu3WH0QqQ2PNjZCS84ETmmCj8z
 8Eu2Rh+2od8+DlBohmp7sSL52yk7M+MEM+NkzEjhM40ILgxtjMvJRhauZKp4h8u4fh/u
 gG/dSTF2xbNT6foVg33NSteFw8LKr3AmxX1v/u8md0jW2WYbe1XRU9PWlSRgNgIhwnvp
 qE3NtMUtNWG5KHbdybFJyaYDwug4MROqLj8fB5owH5FFZV7MljblAXcwUdfvgMxvkxOY
 9uJg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:subject:to:cc:references:from:message-id:date
 :user-agent:mime-version:in-reply-to;
 bh=bjruzpqCiyszecfUKW3rRsOCDdqWnWDOGJLSYvk4F3k=;
 b=QKXOKpVFxW9B9eqm//wXCAX6/jqCYIlouamyJ6FQeXDGnokgLN5BnNmHLr6e4xYw2h
 9lElfYhoyBBR8dlpfyoT2Xrw8lIt6P4ffe0cJjjsqv4NO2Ngw7lEoX7Cehf6MAyV/Oia
 FJpRTucyVDTlOLQDoJdqArsbqx4DyPVOxw+K/l80oquNarCmAGytSKNRBzWCzslfVnYX
 T+Q+LYvp4spgrFj2pzqF/W7Hj7OhreLyafCtnl0dXyQaCRJFp1yU+tCITLkYRa5xRLBC
 tmfeUKEev7R/0P00E0+Ku7MnQdFYcKDdclImigkHWWvTNzkstiGKjVJ1lYqPF0V1l81t
 qUbA==
X-Gm-Message-State: AMCzsaW84sooUkBBWuHh7iDJkIBsOEbS31Q9MWeJMxiNImEkBJxEnjGz
 lQ2wNN5AnZx2wL5B//zBxns=
X-Google-Smtp-Source: AOwi7QDwXvR3NdVNzUJirYAP6Ybg3EUqpJGMlT+7cRNDAY+bq67lV3xXe1KMAArZYVHjUOTv41u6JQ==
X-Received: by 10.98.71.20 with SMTP id u20mr47375pfa.23.1507865022407;
 Thu, 12 Oct 2017 20:23:42 -0700 (PDT)
Received: from [0.0.0.0] (67.209.179.165.16clouds.com. [67.209.179.165])
 by smtp.gmail.com with ESMTPSA id p90sm30667375pfj.157.2017.10.12.20.23.37
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Thu, 12 Oct 2017 20:23:41 -0700 (PDT)
To: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 Olivier MATZ <olivier.matz@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>,
 "jia.he@hxt-semitech.com" <jia.he@hxt-semitech.com>,
 "jie2.liu@hxt-semitech.com" <jie2.liu@hxt-semitech.com>,
 "bing.zhao@hxt-semitech.com" <bing.zhao@hxt-semitech.com>
References: <20171010095636.4507-1-hejianet@gmail.com>
 <20171012155350.j34ddtivxzd27pag@platinum>
 <2601191342CEEE43887BDE71AB9772585FAA859F@IRSMSX103.ger.corp.intel.com>
 <20171012172311.GA8524@jerin>
 <c3517bf8-95f1-0aa4-fc64-47922c35ce1f@gmail.com>
 <d48351a5-2c43-4fff-0a66-fd06707a530d@gmail.com>
 <20171013014914.GA2067@jerin>
From: Jia He <hejianet@gmail.com>
Message-ID: <0c307108-be4f-42bf-f6a6-ac3099bf2985@gmail.com>
Date: Fri, 13 Oct 2017 11:23:34 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <20171013014914.GA2067@jerin>
Content-Type: multipart/mixed; boundary="------------F1F84F412802B2F161498D11"
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod
 loading when doing enqueue/dequeue
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Fri, 13 Oct 2017 03:23:43 -0000

This is a multi-part message in MIME format.
--------------F1F84F412802B2F161498D11
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

Hi Jerin


On 10/13/2017 9:49 AM, Jerin Jacob Wrote:
> -----Original Message-----
>> Date: Fri, 13 Oct 2017 09:16:31 +0800
>> From: Jia He <hejianet@gmail.com>
>> To: Jerin Jacob <jerin.jacob@caviumnetworks.com>, "Ananyev, Konstantin"
>>   <konstantin.ananyev@intel.com>
>> Cc: Olivier MATZ <olivier.matz@6wind.com>, "dev@dpdk.org" <dev@dpdk.org>,
>>   "jia.he@hxt-semitech.com" <jia.he@hxt-semitech.com>,
>>   "jie2.liu@hxt-semitech.com" <jie2.liu@hxt-semitech.com>,
>>   "bing.zhao@hxt-semitech.com" <bing.zhao@hxt-semitech.com>
>> Subject: Re: [PATCH] ring: guarantee ordering of cons/prod loading when
>>   doing enqueue/dequeue
>> User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
>>   Thunderbird/52.3.0
>>
>> Hi
>>
>>
>> On 10/13/2017 9:02 AM, Jia He Wrote:
>>> Hi Jerin
>>>
>>>
>>> On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
>>>> -----Original Message-----
>>>>> Date: Thu, 12 Oct 2017 17:05:50 +0000
>>>>>
>> [...]
>>>> On the same lines,
>>>>
>>>> Jia He, jie2.liu, bing.zhao,
>>>>
>>>> Is this patch based on code review, or did you see this issue on any
>>>> arm/ppc target? arm64 will have a performance impact with this change.
>> Sorry, I missed one important piece of information:
>> Our platform is an aarch64 server with 46 CPUs.
> Is this an OOO (out-of-order execution) aarch64 CPU implementation?
I think so; it is a server CPU (ARMv8-A). Do you know how to confirm it?
$ cat /proc/cpuinfo
processor       : 0
BogoMIPS        : 40.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid asimdrdm
CPU implementer : 0x51
CPU architecture: 8
CPU variant     : 0x0
CPU part        : 0x800
CPU revision    : 0
>> If we reduce the number of CPUs involved, the bug occurs less frequently.
>>
>> Yes, the mb barrier impacts performance, but correctness is more
>> important, isn't it ;-)
> Yes.
>
>> Maybe we can find a more lightweight barrier here?
> Yes, Regarding the lightweight barrier, arm64 has native support for acquire and release
> semantics, which is exposed through gcc as architecture agnostic
> functions.
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> http://preshing.com/20130922/acquire-and-release-fences/
>
> Good to know,
> 1) How much overhead does this patch add on your platform? Relative
> numbers are enough.
I created a *standalone* test case for test_mbuf; the debug patch is
attached.
It is hard to believe, but the run with the rmb barrier added is actually
faster than the run without it (roughly 23 s versus 39 s of wall-clock
time in the results below).

With this patch (4 runs):
time ./test_good --no-huge -l 1-20
real    0m23.311s
user    7m21.870s
sys     0m0.021s

Without this patch:
time ./test_bad --no-huge -l 1-20
real    0m38.972s
user    12m35.271s
sys     0m0.030s
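
For reference, the load ordering that the attached debug patch enforces in
__rte_ring_move_cons_head() can be sketched roughly as below. This is only
an illustration, not the DPDK code: struct sketch_ring and
sketch_dequeue_avail() are made-up names, and a C11 acquire fence is
assumed here as a stand-in for rte_smp_rmb().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct rte_ring: indices only. */
struct sketch_ring {
	_Atomic uint32_t prod_tail; /* last entry published by producers */
	_Atomic uint32_t cons_head; /* next entry a consumer will claim */
};

/*
 * Return how many entries a consumer may dequeue, using the same load
 * order as the debug patch: cons.head first, then prod.tail.
 */
static uint32_t
sketch_dequeue_avail(struct sketch_ring *r, uint32_t *old_head)
{
	*old_head = atomic_load_explicit(&r->cons_head, memory_order_relaxed);

	/*
	 * Assumed stand-in for rte_smp_rmb(): the prod_tail load below
	 * must not be satisfied before the cons_head load above.
	 */
	atomic_thread_fence(memory_order_acquire);

	uint32_t prod_tail =
		atomic_load_explicit(&r->prod_tail, memory_order_relaxed);

	/* Unsigned subtraction is modulo 2^32, so index wrap-around is fine. */
	return prod_tail - *old_head;
}
```

Without the fence, a weakly ordered CPU is allowed to read prod_tail before
cons_head, which is the reordering the patch is meant to rule out.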

Cheers,
Jia
> 2) As a prototype, does changing to acquire and release semantics
> reduce the overhead on your platform?
>
> Reference FreeBSD ring/DPDK style ring implementation through acquire
> and release semantics
> https://github.com/Linaro/odp/blob/master/platform/linux-generic/pktio/ring.c
>
> I will also spend some cycles on this.
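
To make the acquire/release idea concrete, here is a minimal sketch using
the gcc __atomic builtins from the link above. It is a single-producer,
single-consumer toy, not rte_ring and not the ODP ring; struct sketch_spsc,
SKETCH_SLOTS, sketch_enqueue() and sketch_dequeue() are hypothetical names
chosen only for illustration.

```c
#include <stdint.h>

#define SKETCH_SLOTS 8u /* hypothetical ring size, must be a power of two */

struct sketch_spsc {
	uint32_t head; /* written only by the producer */
	uint32_t tail; /* written only by the consumer */
	void *slots[SKETCH_SLOTS];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int
sketch_enqueue(struct sketch_spsc *r, void *obj)
{
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store of tail. */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);

	if (head - tail == SKETCH_SLOTS)
		return -1;

	r->slots[head & (SKETCH_SLOTS - 1)] = obj;
	/* Release: the slot write above becomes visible before the new head. */
	__atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
	return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
static int
sketch_dequeue(struct sketch_spsc *r, void **obj)
{
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store of head. */
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);

	if (head == tail)
		return -1;

	*obj = r->slots[tail & (SKETCH_SLOTS - 1)];
	/* Release: the slot read above completes before the slot is freed. */
	__atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
	return 0;
}
```

The multi-producer/multi-consumer case in rte_ring also needs a
compare-and-swap on the head index, which this sketch leaves out.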
>
>
>> Cheers,
>> Jia
>>> With mbuf_autotest, rte_panic is triggered within seconds.
>>>
>>> PANIC in test_refcnt_iter():
>>> (lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
>>> 1: [./test(rte_dump_stack+0x38) [0x58d868]]
>>> Aborted (core dumped)
>>>
>>> Cheers,
>>> Jia
>>>>
>>>>> Konstantin


--------------F1F84F412802B2F161498D11
Content-Type: text/plain; charset=UTF-8;
 name="barrier_performance_debug.patch"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename="barrier_performance_debug.patch"

ZGlmZiAtLWdpdCBhL2xpYi9saWJydGVfcmluZy9ydGVfcmluZy5oIGIvbGliL2xpYnJ0ZV9y
aW5nL3J0ZV9yaW5nLmgKaW5kZXggNWU5YjNiNy4uMjMxNjhlNyAxMDA2NDQKLS0tIGEvbGli
L2xpYnJ0ZV9yaW5nL3J0ZV9yaW5nLmgKKysrIGIvbGliL2xpYnJ0ZV9yaW5nL3J0ZV9yaW5n
LmgKQEAgLTUxNyw2ICs1MTcsNyBAQCBfX3J0ZV9yaW5nX21vdmVfY29uc19oZWFkKHN0cnVj
dCBydGVfcmluZyAqciwgaW50IGlzX3NjLAogCQluID0gbWF4OwogCiAJCSpvbGRfaGVhZCA9
IHItPmNvbnMuaGVhZDsKKwkJcnRlX3NtcF9ybWIoKTsKIAkJY29uc3QgdWludDMyX3QgcHJv
ZF90YWlsID0gci0+cHJvZC50YWlsOwogCQkvKiBUaGUgc3VidHJhY3Rpb24gaXMgZG9uZSBi
ZXR3ZWVuIHR3byB1bnNpZ25lZCAzMmJpdHMgdmFsdWUKIAkJICogKHRoZSByZXN1bHQgaXMg
YWx3YXlzIG1vZHVsbyAzMiBiaXRzIGV2ZW4gaWYgd2UgaGF2ZQpkaWZmIC0tZ2l0IGEvdGVz
dC90ZXN0L3Rlc3QuYyBiL3Rlc3QvdGVzdC90ZXN0LmMKaW5kZXggOWFjY2JkMS4uNjkxMGIw
NiAxMDA2NDQKLS0tIGEvdGVzdC90ZXN0L3Rlc3QuYworKysgYi90ZXN0L3Rlc3QvdGVzdC5j
CkBAIC02MSw2ICs2MSw3IEBAIGV4dGVybiBjbWRsaW5lX3BhcnNlX2N0eF90IG1haW5fY3R4
W107CiAKICNpbmNsdWRlICJ0ZXN0LmgiCiAKK2V4dGVybiBpbnQgdGVzdF9tYnVmKHZvaWQp
OwogI2RlZmluZSBSVEVfTE9HVFlQRV9BUFAgUlRFX0xPR1RZUEVfVVNFUjEKIAogY29uc3Qg
Y2hhciAqcHJnbmFtZTsgLyogdG8gYmUgc2V0IHRvIGFyZ3ZbMF0gKi8KQEAgLTEzNSw2ICsx
MzYsOSBAQCBtYWluKGludCBhcmdjLCBjaGFyICoqYXJndikKIAkJUlRFX0xPRyhJTkZPLCBB
UFAsCiAJCQkJIkhQRVQgaXMgbm90IGVuYWJsZWQsIHVzaW5nIFRTQyBhcyBkZWZhdWx0IHRp
bWVyXG4iKTsKIAorCXRlc3RfbWJ1ZigpOworCXByaW50ZigidGVzdF9tYnVmIGRvbmVcbiIp
OworCXJldHVybiAwOwogCiAjaWZkZWYgUlRFX0xJQlJURV9DTURMSU5FCiAJY2wgPSBjbWRs
aW5lX3N0ZGluX25ldyhtYWluX2N0eCwgIlJURT4+Iik7CmRpZmYgLS1naXQgYS90ZXN0L3Rl
c3QvdGVzdF9tYnVmLmMgYi90ZXN0L3Rlc3QvdGVzdF9tYnVmLmMKaW5kZXggMzM5NmI0YS4u
NzI3NzI2MCAxMDA2NDQKLS0tIGEvdGVzdC90ZXN0L3Rlc3RfbWJ1Zi5jCisrKyBiL3Rlc3Qv
dGVzdC90ZXN0X21idWYuYwpAQCAtNTksNyArNTksNiBAQAogI2luY2x1ZGUgPHJ0ZV9jeWNs
ZXMuaD4KIAogI2luY2x1ZGUgInRlc3QuaCIKLQogI2RlZmluZSBNQlVGX0RBVEFfU0laRSAg
ICAgICAgICAyMDQ4CiAjZGVmaW5lIE5CX01CVUYgICAgICAgICAgICAgICAgIDEyOAogI2Rl
ZmluZSBNQlVGX1RFU1RfREFUQV9MRU4gICAgICAxNDY0CkBAIC04NSw3ICs4NCw3IEBACiAK
IHN0YXRpYyB2b2xhdGlsZSB1aW50MzJfdCByZWZjbnRfc3RvcF9zbGF2ZXM7CiBzdGF0aWMg
dW5zaWduZWQgcmVmY250X2xjb3JlW1JURV9NQVhfTENPUkVdOwotCitpbnQgdGVzdF9tYnVm
KHZvaWQpOwogI2VuZGlmCiAKIC8qCkBAIC03MTgsNyArNzE3LDcgQEAgdGVzdF9yZWZjbnRf
aXRlcih1bnNpZ25lZCBpbnQgbGNvcmUsIHVuc2lnbmVkIGludCBpdGVyLAogCWZvciAoaSA9
IDAsIG4gPSBydGVfbWVtcG9vbF9hdmFpbF9jb3VudChyZWZjbnRfcG9vbCk7CiAJICAgIGkg
IT0gbiAmJiAobSA9IHJ0ZV9wa3RtYnVmX2FsbG9jKHJlZmNudF9wb29sKSkgIT0gTlVMTDsK
IAkgICAgaSsrKSB7Ci0JCXJlZiA9IFJURV9NQVgocnRlX3JhbmQoKSAlIFJFRkNOVF9NQVhf
UkVGLCAxVUwpOworCQlyZWYgPSBSRUZDTlRfTUFYX1JFRjsKIAkJdHJlZiArPSByZWY7CiAJ
CWlmICgocmVmICYgMSkgIT0gMCkgewogCQkJcnRlX3BrdG1idWZfcmVmY250X3VwZGF0ZSht
LCByZWYpOwpAQCAtNzQ1LDE0ICs3NDQsMTcgQEAgdGVzdF9yZWZjbnRfaXRlcih1bnNpZ25l
ZCBpbnQgbGNvcmUsIHVuc2lnbmVkIGludCBpdGVyLAogCWZvciAod24gPSAwOyB3biAhPSBS
RUZDTlRfTUFYX1RJTUVPVVQ7IHduKyspIHsKIAkJaWYgKChpID0gcnRlX21lbXBvb2xfYXZh
aWxfY291bnQocmVmY250X3Bvb2wpKSA9PSBuKSB7CiAJCQlyZWZjbnRfbGNvcmVbbGNvcmVd
ICs9IHRyZWY7Ci0JCQlwcmludGYoIiVzKGxjb3JlPSV1LCBpdGVyPSV1KSBjb21wbGV0ZWQs
ICIKKwkJCS8qcHJpbnRmKCIlcyhsY29yZT0ldSwgaXRlcj0ldSkgY29tcGxldGVkLCAiCiAJ
CQkgICAgIiV1IHJlZmVyZW5jZXMgcHJvY2Vzc2VkXG4iLAotCQkJICAgIF9fZnVuY19fLCBs
Y29yZSwgaXRlciwgdHJlZik7CisJCQkgICAgX19mdW5jX18sIGxjb3JlLCBpdGVyLCB0cmVm
KTsqLwogCQkJcmV0dXJuOwogCQl9CiAJCXJ0ZV9kZWxheV9tcygxMDApOwogCX0KIAorICAg
ICAgICBydGVfbWVtcG9vbF9kdW1wKHN0ZG91dCwgcmVmY250X3Bvb2wpOworCXJ0ZV9yaW5n
X2R1bXAoc3Rkb3V0LCByZWZjbnRfcG9vbC0+cG9vbF9kYXRhKTsKKyAgICAgICAgcnRlX3Jp
bmdfZHVtcChzdGRvdXQsIHJlZmNudF9tYnVmX3JpbmcpOwogCXJ0ZV9wYW5pYygiKGxjb3Jl
PSV1LCBpdGVyPSV1KTogYWZ0ZXIgJXVzIG9ubHkgIgogCSAgICAgICAgICAiJXUgb2YgJXUg
bWJ1ZnMgbGVmdCBmcmVlXG4iLCBsY29yZSwgaXRlciwgd24sIGksIG4pOwogfQpAQCAtNzY2
LDcgKzc2OCw3IEBAIHRlc3RfcmVmY250X21hc3RlcihzdHJ1Y3QgcnRlX21lbXBvb2wgKnJl
ZmNudF9wb29sLAogCWxjb3JlID0gcnRlX2xjb3JlX2lkKCk7CiAJcHJpbnRmKCIlcyBzdGFy
dGVkIGF0IGxjb3JlICV1XG4iLCBfX2Z1bmNfXywgbGNvcmUpOwogCi0JZm9yIChpID0gMDsg
aSAhPSBSRUZDTlRfTUFYX0lURVI7IGkrKykKKwlmb3IgKGkgPSAwOyBpICE9IDEwKlJFRkNO
VF9NQVhfSVRFUjsgaSsrKQogCQl0ZXN0X3JlZmNudF9pdGVyKGxjb3JlLCBpLCByZWZjbnRf
cG9vbCwgcmVmY250X21idWZfcmluZyk7CiAKIAlyZWZjbnRfc3RvcF9zbGF2ZXMgPSAxOwpA
QCAtMTA1OCw4ICsxMDYwLDcgQEAgdGVzdF9tYnVmX2xpbmVhcml6ZV9jaGVjayhzdHJ1Y3Qg
cnRlX21lbXBvb2wgKnBrdG1idWZfcG9vbCkKIAlyZXR1cm4gMDsKIH0KIAotc3RhdGljIGlu
dAotdGVzdF9tYnVmKHZvaWQpCitpbnQgdGVzdF9tYnVmKHZvaWQpCiB7CiAJaW50IHJldCA9
IC0xOwogCXN0cnVjdCBydGVfbWVtcG9vbCAqcGt0bWJ1Zl9wb29sID0gTlVMTDsK
--------------F1F84F412802B2F161498D11--