From: "Zhao, Bing"
Date: Thu, 19 Oct 2017 19:18:38 +0800
To: "Ananyev, Konstantin", Jia He, Jerin Jacob
Cc: Olivier MATZ, "dev@dpdk.org", "jia.he@hxt-semitech.com", "jie2.liu@hxt-semitech.com", "bing.zhao@hxt-semitech.com"
Subject: Re: [dpdk-dev] [PATCH] ring: guarantee ordering of cons/prod loading when doing enqueue/dequeue
Message-ID: <8806e2bd-c57b-03ff-a315-0a311690f1d9@163.com>
In-Reply-To: <2601191342CEEE43887BDE71AB9772585FAAB171@IRSMSX103.ger.corp.intel.com>
References: <20171010095636.4507-1-hejianet@gmail.com> <20171012155350.j34ddtivxzd27pag@platinum> <2601191342CEEE43887BDE71AB9772585FAA859F@IRSMSX103.ger.corp.intel.com> <20171012172311.GA8524@jerin> <2601191342CEEE43887BDE71AB9772585FAAB171@IRSMSX103.ger.corp.intel.com>
List-Id: DPDK patches and discussions

Hi,

On 2017/10/19 18:02, Ananyev, Konstantin wrote:
>
> Hi Jia,
>
>>
>> Hi
>>
>>
>> On 10/13/2017 9:02 AM, Jia He Wrote:
>>> Hi Jerin
>>>
>>>
>>> On 10/13/2017 1:23 AM, Jerin Jacob Wrote:
>>>> -----Original Message-----
>>>>> Date: Thu, 12 Oct 2017 17:05:50 +0000
>>>>>
>> [...]
>>>> On the same lines,
>>>>
>>>> Jia He, jie2.liu, bing.zhao,
>>>>
>>>> Is this patch based on code review or do you saw this issue on any of
>>>> the
>>>> arm/ppc target? arm64 will have performance impact with this change.
>> sorry, miss one important information
>> Our platform is an aarch64 server with 46 cpus.
>> If we reduced the involved cpu numbers, the bug occurred less frequently.
>>
>> Yes, mb barrier impact the performance, but correctness is more
>> important, isn't it ;-)
>> Maybe we can find any other lightweight barrier here?
>>
>> Cheers,
>> Jia
>>> Based on mbuf_autotest, the rte_panic will be invoked in seconds.
>>>
>>> PANIC in test_refcnt_iter():
>>> (lcore=0, iter=0): after 10s only 61 of 64 mbufs left free
>>> 1: [./test(rte_dump_stack+0x38) [0x58d868]]
>>> Aborted (core dumped)
>>>
>
> So is it only reproducible with mbuf refcnt test?
> Could it be reproduced with some 'pure' ring test
> (no mempools/mbufs refcnt, etc.)?
> The reason I am asking - in that test we also have mbuf refcnt updates
> (that's what for that test was created) and we are doing some optimizations here too
> to avoid excessive atomic updates.
> BTW, if the problem is not reproducible without mbuf refcnt,
> can I suggest to extend the test with:
> - add a check that enqueue() operation was successful
> - walk through the pool and check/printf refcnt of each mbuf.
> Hopefully that would give us some extra information what is going wrong here.
> Konstantin
>
>

Currently, the issue has only been found in this test case on the ARM platform; we are not sure whether it also shows up on x86_64.

In another mail in this thread, we made a simple test based on this case and captured some information, which I pasted there (together with the patch :-)).

It also seems that Juhamatti and Jacob found a related revert several months ago.

BR.
Bing
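
For readers following the thread, the ordering concern named in the subject (the load of the producer head versus the load of the consumer tail, and vice versa) can be sketched in portable C11 atomics. This is only an illustrative sketch under assumptions, not the DPDK ring code or the patch itself: `sketch_ring`, `sketch_enqueue`, `sketch_dequeue`, and `RING_SIZE` are hypothetical names, and the acquire fence stands in for whatever barrier the real code would use.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring capacity; must be a power of two. One slot stays unused. */
#define RING_SIZE 8u
#define RING_MASK (RING_SIZE - 1u)

struct sketch_headtail {
    _Atomic uint32_t head;
    _Atomic uint32_t tail;
};

/* Hypothetical ring layout, loosely modeled on a head/tail pair per side. */
struct sketch_ring {
    struct sketch_headtail prod;
    struct sketch_headtail cons;
    void *slots[RING_SIZE];
};

static void sketch_ring_init(struct sketch_ring *r)
{
    atomic_init(&r->prod.head, 0);
    atomic_init(&r->prod.tail, 0);
    atomic_init(&r->cons.head, 0);
    atomic_init(&r->cons.tail, 0);
}

/* Enqueue one object. Returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t old_head, cons_tail;

    do {
        old_head = atomic_load_explicit(&r->prod.head, memory_order_relaxed);
        /* Assumption: this fence keeps the cons.tail load below from being
         * satisfied before the prod.head load above on weakly ordered CPUs. */
        atomic_thread_fence(memory_order_acquire);
        cons_tail = atomic_load_explicit(&r->cons.tail, memory_order_acquire);
        /* Unsigned wrap-around keeps this correct when the indices overflow. */
        if (RING_MASK + cons_tail - old_head == 0)
            return -1;
    } while (!atomic_compare_exchange_weak_explicit(
                 &r->prod.head, &old_head, old_head + 1,
                 memory_order_relaxed, memory_order_relaxed));

    r->slots[old_head & RING_MASK] = obj;

    /* Wait for earlier producers to publish, then publish this slot. */
    while (atomic_load_explicit(&r->prod.tail, memory_order_relaxed) != old_head)
        ;
    atomic_store_explicit(&r->prod.tail, old_head + 1, memory_order_release);
    return 0;
}

/* Dequeue one object. Returns 0 on success, -1 if the ring is empty. */
static int sketch_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t old_head, prod_tail;

    do {
        old_head = atomic_load_explicit(&r->cons.head, memory_order_relaxed);
        /* Same ordering requirement, mirrored for the consumer side. */
        atomic_thread_fence(memory_order_acquire);
        prod_tail = atomic_load_explicit(&r->prod.tail, memory_order_acquire);
        if (prod_tail - old_head == 0)
            return -1;
    } while (!atomic_compare_exchange_weak_explicit(
                 &r->cons.head, &old_head, old_head + 1,
                 memory_order_relaxed, memory_order_relaxed));

    *obj = r->slots[old_head & RING_MASK];

    /* Wait for earlier consumers to finish, then release the slot. */
    while (atomic_load_explicit(&r->cons.tail, memory_order_relaxed) != old_head)
        ;
    atomic_store_explicit(&r->cons.tail, old_head + 1, memory_order_release);
    return 0;
}
```

Without the fence, a weakly ordered CPU may read the opposite side's tail before its own head, so a thread could pair a fresh head with a stale tail and miscount the free or used entries. Checking the return value of every enqueue, as Konstantin suggests above, is the cheapest way to notice that kind of miscount in a test.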