From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4DA41A04DB; Fri, 16 Oct 2020 14:44:31 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 840A11ED98; Fri, 16 Oct 2020 14:43:34 +0200 (CEST) Received: from mailout2.w1.samsung.com (mailout2.w1.samsung.com [210.118.77.12]) by dpdk.org (Postfix) with ESMTP id 628761D9ED for ; Fri, 16 Oct 2020 14:43:33 +0200 (CEST) Received: from eucas1p2.samsung.com (unknown [182.198.249.207]) by mailout2.w1.samsung.com (KnoxPortal) with ESMTP id 20201016124322euoutp02619acac15baea15a70c44ec52dcd0ce4~_eZXuC49P1889718897euoutp02v for ; Fri, 16 Oct 2020 12:43:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout2.w1.samsung.com 20201016124322euoutp02619acac15baea15a70c44ec52dcd0ce4~_eZXuC49P1889718897euoutp02v DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1602852202; bh=kNEd/z31jG5IDPmTPojTzHHxjdq3KYbD1EgU45sAxPU=; h=Subject:To:Cc:From:Date:In-Reply-To:References:From; b=TQk8hbXsgwUduS3IcF0bXk6WyYBUWc2agYrlQBb/4qASlvdVTeTg2ynCE8zTuPaax QxsAJ8sm80xEnLSFMfjdA3CxArYwYI2dzpMoRvw/54lZfO2WZ90nop/QeoXN+cCymM KA1x3XBFm/yukDnPKJ6uqr+NibvKCFCFvwy1Y11w= Received: from eusmges2new.samsung.com (unknown [203.254.199.244]) by eucas1p1.samsung.com (KnoxPortal) with ESMTP id 20201016124313eucas1p11acac35080d19e44dd476fc8491ee2b2~_eZPOz8ka2965629656eucas1p1u; Fri, 16 Oct 2020 12:43:13 +0000 (GMT) Received: from eucas1p1.samsung.com ( [182.198.249.206]) by eusmges2new.samsung.com (EUCPMTA) with SMTP id 61.C7.05997.065998F5; Fri, 16 Oct 2020 13:43:13 +0100 (BST) Received: from eusmtrp1.samsung.com (unknown [182.198.249.138]) by eucas1p1.samsung.com (KnoxPortal) with ESMTPA id 20201016124312eucas1p17a02f1adf0ab69a1708e6f1ecf82d35f~_eZOni6rO2954529545eucas1p16; Fri, 16 Oct 2020 12:43:12 +0000 (GMT) Received: from eusmgms1.samsung.com (unknown [182.198.249.179]) by eusmtrp1.samsung.com (KnoxPortal) with ESMTP id 20201016124312eusmtrp1aa675a35fc30732c505cc856e59d6c2a~_eZOm4iq-1545315453eusmtrp1Y; Fri, 16 Oct 2020 12:43:12 +0000 (GMT) X-AuditID: cbfec7f4-677ff7000000176d-59-5f899560f38d Received: from eusmtip1.samsung.com ( [203.254.199.221]) by eusmgms1.samsung.com (EUCPMTA) with SMTP id 48.B8.06314.065998F5; Fri, 16 Oct 2020 13:43:12 +0100 (BST) Received: from [106.210.88.70] (unknown [106.210.88.70]) by eusmtip1.samsung.com (KnoxPortal) with ESMTPA id 20201016124311eusmtip1cd9aa05736082d35f2318c17544a1ed7~_eZNvR72w1704817048eusmtip16; Fri, 16 Oct 2020 12:43:11 +0000 (GMT) To: Honnappa Nagarahalli , David Hunt , Bruce Richardson Cc: "dev@dpdk.org" , "stable@dpdk.org" , nd , David Marchand , "\"'Lukasz Wojciechowski'\"," From: Lukasz Wojciechowski Message-ID: Date: Fri, 16 Oct 2020 14:43:10 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1 MIME-Version: 1.0 In-Reply-To: Content-Transfer-Encoding: 8bit Content-Language: en-US X-Brightmail-Tracker: H4sIAAAAAAAAA01Se0hTcRTut3t3dzVnP6fiyQJjElmRWvbHwNSKrPVHUAhlhtpVLyr5YtOl EmQktnQOm4g5S0UTbb7SfJTPNs1HPkLIJ4URmhqpZRr4YNV2lfzv+75zvnPOB4cmRLV8Rzoy Jp6VxTBRYsqSbOxeGzrG5DwMdn/wHknGdT4SteYnT9JUnk5JFpebeJK8mVSB5KuqGkkGylSE xKjcFJympZUFlUi6XlTKl5a0zvOk+sVWnnSpfYS6zA+wPBXGRkUqWJmb903LiM6nFVRc97XE 4VotLwVN+KYjCxrwSRia0/PSkSUtwuUICl58RxxZQdCQ1SvgyC8Es4+/UNsW/bM2kiuUIXhT aKQ4soBA/URLpCOatsX+kNt7yKTb4QwEmsEB81wC9yHIGxpGplEU9oKuvFW+CQvxeZhaniRN ZhIfhOziEJNsj4MgqzaV5FpsoC9v2owtcCDcq/5gthLYCe435BMcdoDJ6UJzIMAGAehaphF3 9jn42F68FcEWvvXUCzi8H/qzVSRnaEQwsrGGONKBYCyzfKvLEzqNG5TpOgIfhppmN04+A4bN OXNiwNYwvmDDHWENmsbcLVkIyjQR1+0KM6octL12s2qazEJi7Y5o2h1xtDviaP/vLUKkDjmw CfLocFZ+Ioa97SpnouUJMeGuobHRdejfI/Ube1ZeoebNEAPCNBJbCZV+ymARn1HIk6INCGhC bCc8O9gfJBKGMUnJrCw2WJYQxcoNaB9Nih2EHsXzgSIczsSzt1g2jpVtV3m0hWMK2qtQOGVI 6TKv12GhHu66o+pPWVOZtXdClT5Ve5acfROpFHuyp8tj5tKuUuc/10tmVaM/Lr6763m1Q20Y Yf1fJgTKOoex4VHL8u/EAVFF8tvdKfXZ66Pjud1al7YAP+8LVle8D6zWxXeo07r1OnXh5xvN +TWzLv5jrIbJ1TyfiBKT8gjm+BFCJmf+AlxDDr1EAwAA X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrEIsWRmVeSWpSXmKPExsVy+t/xu7oJUzvjDU5eErO4screom/SRyaL 7Su62CzefdrOZDHzaQu7xbOedYwWZ5b3MFv86/jD7sDhsWbeGkaPXwuWsnos3vOSyePguz1M Hu/3XWULYI3SsynKLy1JVcjILy6xVYo2tDDSM7S00DMysdQzNDaPtTIyVdK3s0lJzcksSy3S t0vQyzg8dzVbwbHwiosbZzE1MN507WLk5JAQMJE4uGQvSxcjF4eQwFJGid+nfrN2MXIAJWQk PlwSgKgRlvhzrYsNouY1o8ThpWuYQGqEBSIkpp9QB4mLCPQySmzvmssE4jALnGSU2PaujQmi 4wuzxKtdu5hBRrEJ2EocmfmVFcTmFXCTuP/pFgvIJBYBVYnJi5JAwqICcRI/JvayQZQISpyc +YQFxOYUiJVoXHcFrJVZwExi3uaHzBC2vETz1tlQtrjErSfzmSYwCs1C0j4LScssJC2zkLQs YGRZxSiSWlqcm55bbKhXnJhbXJqXrpecn7uJERh/24793LyD8dLG4EOMAhyMSjy8DT4d8UKs iWXFlbmHGCU4mJVEeJ3Ono4T4k1JrKxKLcqPLyrNSS0+xGgK9NtEZinR5HxgasgriTc0NTS3 sDQ0NzY3NrNQEuftEDgYIySQnliSmp2aWpBaBNPHxMEp1cBYI8pVccAwInehxMcZtlm1Ojve cDMH+x5addnncMAvhtomb+3wZIGiGPPDleo/mNPa1cQvTFiZcHFtNaN01w9Lje+OnvdmslTE eH9Is04o1Oa49tLitXV+yib5dwJTdJxXFLc0TLsstfJ5qsO9pa53mTy279H4OcPN9tgClbWl F71FZhaELlRiKc5INNRiLipOBAC/XEab1QIAAA== X-CMS-MailID: 20201016124312eucas1p17a02f1adf0ab69a1708e6f1ecf82d35f X-Msg-Generator: CA Content-Type: text/plain; charset="utf-8" X-RootMTR: 20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5 X-EPHeader: CA CMS-TYPE: 201P X-CMS-RootMailID: 20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5 References: <20200923132541.21417-1-l.wojciechow@partner.samsung.com> <20200925224209.12173-1-l.wojciechow@partner.samsung.com> <20200925224209.12173-3-l.wojciechow@partner.samsung.com> Subject: Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Honnappa, Thank you for your answer. In the current v7 version I followed your advise and used RELAXED memory model. And it works without any issues. I guess after fixing other issues found since v4 the distributor works more stable. I didn't have time to rearrange all tests in the way I proposed, but I guess if they work like this it's not a top priority. Can you give an ack on the series? I believe David Marchand is waiting for your opinion to process it. Best regards Lukasz W dniu 16.10.2020 o 07:43, Honnappa Nagarahalli pisze: > > >> Hi Honnappa, >> >> Many thanks for the review! >> >> I'll write my answers here not inline as it would be easier to read them in one >> place, I think. >> So first of all I agree with you in 2 things: >> 1) all uses of statistics must be atomic and lack of that caused most of the >> problems >> 2) it would be better to replace barrier and memset in >> clear_packet_count() with atomic stores as you suggested >> >> So I will apply both of above. >> >> However I wasn't not fully convinced on changing acquire/release to relaxed. >> It wood be perfectly ok if it would look like in this Herb Sutter's example: >> https://youtu.be/KeLBd2[] EJLOU?t=4170 >> But in his case the counters are cleared before worker threads start and are >> printout after they are completed. >> >> In case of the dpdk distributor tests both worker and main cores are running >> at the same time. In the sanity_test, the statistics are cleared and verified >> few times for different hashes of packages. The worker cores are not >> stopped at this time and they continue their loops in handle procedure. >> Verification made in main core is an exchange of data as the current statistics >> indicate how the test will result. > Agree. The key point we have to note is that the data that is exchanged between the two threads is already atomic (handled_packets is atomic). > >> So as I wasn't convinced, I run some tests with both both relaxed and >> acquire/release modes and they both fail :( The failures caused by statistics >> errors to number of tests ratio for >> 200000 tests was: >> for relaxed: 0,000790562 >> for acq/rel: 0,000091321 >> >> >> That's why I'm going to modify tests in such way, that they would: >> 1) clear statistics >> 2) launch worker threads >> 3) run test >> 4) wait for workers procedures to complete >> 5) check stats, verify results and print them out >> >> This way worker main core will use (clear or verify) stats only when there are >> no worker threads. This would make things simpler and allowing to focus on >> testing the distributor not tests. And of course relaxed mode would be >> enough! > Agree, this would be the only way to ensure that the main thread sees the correct statistics (just like in the video) > >> >> Best regards >> Lukasz >> >> >> W dniu 29.09.2020 o 07:49, Honnappa Nagarahalli pisze: >>> >>> >>>> Statistics of handled packets are cleared and read on main lcore, >>>> while they are increased in workers handlers on different lcores. >>>> >>>> Without synchronization occasionally showed invalid values. >>>> This patch uses atomic acquire/release mechanisms to synchronize. >>> In general, load-acquire and store-release memory orderings are required >> while synchronizing data (that cannot be updated atomically) between >> threads. In the situation, making counters atomic is enough. >>>> Fixes: c3eabff124e6 ("distributor: add unit tests") >>>> Cc: bruce.richardson@intel.com >>>> Cc: stable@dpdk.org >>>> >>>> Signed-off-by: Lukasz Wojciechowski >>>> >>>> Acked-by: David Hunt >>>> --- >>>> app/test/test_distributor.c | 39 ++++++++++++++++++++++++----------- >> -- >>>> 1 file changed, 26 insertions(+), 13 deletions(-) >>>> >>>> diff --git a/app/test/test_distributor.c >>>> b/app/test/test_distributor.c index >>>> 35b25463a..0e49e3714 100644 >>>> --- a/app/test/test_distributor.c >>>> +++ b/app/test/test_distributor.c >>>> @@ -43,7 +43,8 @@ total_packet_count(void) { >>>> unsigned i, count = 0; >>>> for (i = 0; i < worker_idx; i++) >>>> - count += worker_stats[i].handled_packets; >>>> + count += >>>> __atomic_load_n(&worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE); >>> RELAXED memory order is sufficient. For ex: the worker threads are not >> 'releasing' any data that is not atomically updated to the main thread. >>>> return count; >>>> } >>>> >>>> @@ -52,6 +53,7 @@ static inline void >>>> clear_packet_count(void) >>>> { >>>> memset(&worker_stats, 0, sizeof(worker_stats)); >>>> + rte_atomic_thread_fence(__ATOMIC_RELEASE); >>> Ideally, the counters should be set to 0 atomically rather than using a >> memset. >>>> } >>>> >>>> /* this is the basic worker function for sanity test @@ -72,13 >>>> +74,13 @@ handle_work(void *arg) >>>> num = rte_distributor_get_pkt(db, id, buf, buf, num); >>>> while (!quit) { >>>> __atomic_fetch_add(&worker_stats[id].handled_packets, >>>> num, >>>> - __ATOMIC_RELAXED); >>>> + __ATOMIC_ACQ_REL); >>> Using the __ATOMIC_ACQ_REL order does not mean anything to the main >> thread. The main thread might still see the updates from different threads in >> different order. >>>> count += num; >>>> num = rte_distributor_get_pkt(db, id, >>>> buf, buf, num); >>>> } >>>> __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>>> - __ATOMIC_RELAXED); >>>> + __ATOMIC_ACQ_REL); >>> Same here, do not see why this change is required. >>> >>>> count += num; >>>> rte_distributor_return_pkt(db, id, buf, num); >>>> return 0; >>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct >>>> rte_mempool *p) >>>> >>>> for (i = 0; i < rte_lcore_count() - 1; i++) >>>> printf("Worker %u handled %u packets\n", i, >>>> - worker_stats[i].handled_packets); >>>> + __atomic_load_n(&worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE)); >>> __ATOMIC_RELAXED is enough. >>> >>>> printf("Sanity test with all zero hashes done.\n"); >>>> >>>> /* pick two flows and check they go correctly */ @@ -159,7 +162,9 >>>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p) >>>> >>>> for (i = 0; i < rte_lcore_count() - 1; i++) >>>> printf("Worker %u handled %u packets\n", i, >>>> - worker_stats[i].handled_packets); >>>> + __atomic_load_n( >>>> + &worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE)); >>> __ATOMIC_RELAXED is enough >>> >>>> printf("Sanity test with two hash values done\n"); >>>> } >>>> >>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct >>>> rte_mempool *p) >>>> >>>> for (i = 0; i < rte_lcore_count() - 1; i++) >>>> printf("Worker %u handled %u packets\n", i, >>>> - worker_stats[i].handled_packets); >>>> + __atomic_load_n(&worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE)); >>> __ATOMIC_RELAXED is enough >>> >>>> printf("Sanity test with non-zero hashes done\n"); >>>> >>>> rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15 >>>> +286,17 @@ handle_work_with_free_mbufs(void *arg) >>>> buf[i] = NULL; >>>> num = rte_distributor_get_pkt(d, id, buf, buf, num); >>>> while (!quit) { >>>> - worker_stats[id].handled_packets += num; >>>> count += num; >>>> + __atomic_fetch_add(&worker_stats[id].handled_packets, >>>> num, >>>> + __ATOMIC_ACQ_REL); >>> IMO, the problem would be the non-atomic update of the statistics. So, >>> __ATOMIC_RELAXED is enough >>> >>>> for (i = 0; i < num; i++) >>>> rte_pktmbuf_free(buf[i]); >>>> num = rte_distributor_get_pkt(d, >>>> id, buf, buf, num); >>>> } >>>> - worker_stats[id].handled_packets += num; >>>> count += num; >>>> + __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>>> + __ATOMIC_ACQ_REL); >>> Same here, the problem is non-atomic update of the statistics, >> __ATOMIC_RELAXED is enough. >>> Similarly, for changes below, __ATOMIC_RELAXED is enough. >>> >>>> rte_distributor_return_pkt(d, id, buf, num); >>>> return 0; >>>> } >>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg) >>>> /* wait for quit single globally, or for worker zero, wait >>>> * for zero_quit */ >>>> while (!quit && !(id == zero_id && zero_quit)) { >>>> - worker_stats[id].handled_packets += num; >>>> count += num; >>>> + __atomic_fetch_add(&worker_stats[id].handled_packets, >>>> num, >>>> + __ATOMIC_ACQ_REL); >>>> for (i = 0; i < num; i++) >>>> rte_pktmbuf_free(buf[i]); >>>> num = rte_distributor_get_pkt(d, >>>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg) >>>> >>>> total += num; >>>> } >>>> - worker_stats[id].handled_packets += num; >>>> count += num; >>>> returned = rte_distributor_return_pkt(d, id, buf, num); >>>> >>>> + __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>>> + __ATOMIC_ACQ_REL); >>>> if (id == zero_id) { >>>> /* for worker zero, allow it to restart to pick up last packet >>>> * when all workers are shutting down. >>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg) >>>> id, buf, buf, num); >>>> >>>> while (!quit) { >>>> - worker_stats[id].handled_packets += num; >>>> count += num; >>>> rte_pktmbuf_free(pkt); >>>> num = rte_distributor_get_pkt(d, id, buf, buf, num); >>>> + >>>> __atomic_fetch_add(&worker_stats[id].handled_packets, >>>> + num, __ATOMIC_ACQ_REL); >>>> } >>>> returned = rte_distributor_return_pkt(d, >>>> id, buf, num); >>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct >>>> worker_params *wp, >>>> >>>> for (i = 0; i < rte_lcore_count() - 1; i++) >>>> printf("Worker %u handled %u packets\n", i, >>>> - worker_stats[i].handled_packets); >>>> + __atomic_load_n(&worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE)); >>>> >>>> if (total_packet_count() != BURST * 2) { >>>> printf("Line %d: Error, not all packets flushed. " >>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct >>>> worker_params *wp, >>>> zero_quit = 0; >>>> for (i = 0; i < rte_lcore_count() - 1; i++) >>>> printf("Worker %u handled %u packets\n", i, >>>> - worker_stats[i].handled_packets); >>>> + __atomic_load_n(&worker_stats[i].handled_packets, >>>> + __ATOMIC_ACQUIRE)); >>>> >>>> if (total_packet_count() != BURST) { >>>> printf("Line %d: Error, not all packets flushed. " >>>> -- >>>> 2.17.1 >> -- >> Lukasz Wojciechowski >> Principal Software Engineer >> >> Samsung R&D Institute Poland >> Samsung Electronics >> Office +48 22 377 88 25 >> l.wojciechow@partner.samsung.com -- Lukasz Wojciechowski Principal Software Engineer Samsung R&D Institute Poland Samsung Electronics Office +48 22 377 88 25 l.wojciechow@partner.samsung.com