From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 205E5A04BC; Thu, 8 Oct 2020 22:48:18 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 056FC1BA45; Thu, 8 Oct 2020 22:48:17 +0200 (CEST) Received: from mailout1.w1.samsung.com (mailout1.w1.samsung.com [210.118.77.11]) by dpdk.org (Postfix) with ESMTP id 1915D1B9EC for ; Thu, 8 Oct 2020 22:48:16 +0200 (CEST) Received: from eucas1p2.samsung.com (unknown [182.198.249.207]) by mailout1.w1.samsung.com (KnoxPortal) with ESMTP id 20201008204804euoutp01e9fb247add18469e7dbf4deb5505d033~8H2SIKbOn1861518615euoutp01c for ; Thu, 8 Oct 2020 20:48:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout1.w1.samsung.com 20201008204804euoutp01e9fb247add18469e7dbf4deb5505d033~8H2SIKbOn1861518615euoutp01c DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1602190084; bh=EFCVUpX90UaaAJb2tU0TzxvOutUMoKX/tu9zBzH6WUw=; h=Subject:To:Cc:From:Date:In-Reply-To:References:From; b=YocZXKmLh59oLbOXgsOK77WhXBYX2qcGnPMaTBW69h51woryaC3ZI91V3MtKHsVUe ADFdkQRDUosLacmKAvf74G1gW//1w50XV6j9zYO++c/M/CkPdVQTPAxBlY5k8IOB/P ymZIS3Vp2dVJN45qsP6C5ScrAU2UbNnc8X3Ak+lA= Received: from eusmges2new.samsung.com (unknown [203.254.199.244]) by eucas1p1.samsung.com (KnoxPortal) with ESMTP id 20201008204758eucas1p181f00497e2af37169af12868d7f4cf52~8H2NOKTZt0347703477eucas1p1h; Thu, 8 Oct 2020 20:47:58 +0000 (GMT) Received: from eucas1p2.samsung.com ( [182.198.249.207]) by eusmges2new.samsung.com (EUCPMTA) with SMTP id AA.D7.05997.EFA7F7F5; Thu, 8 Oct 2020 21:47:58 +0100 (BST) Received: from eusmtrp1.samsung.com (unknown [182.198.249.138]) by eucas1p2.samsung.com (KnoxPortal) with ESMTPA id 20201008204757eucas1p2ad2c1fed17a322f2ab55fec5157172b1~8H2MO308G2578325783eucas1p2W; Thu, 8 Oct 2020 20:47:57 +0000 (GMT) Received: from eusmgms2.samsung.com (unknown [182.198.249.180]) by eusmtrp1.samsung.com (KnoxPortal) with ESMTP id 20201008204757eusmtrp13c93e4c5f021f21755f671d613f775c8~8H2MOQHj71393713937eusmtrp1w; Thu, 8 Oct 2020 20:47:57 +0000 (GMT) X-AuditID: cbfec7f4-65dff7000000176d-63-5f7f7afeaf8b Received: from eusmtip2.samsung.com ( [203.254.199.222]) by eusmgms2.samsung.com (EUCPMTA) with SMTP id A0.9E.06017.DFA7F7F5; Thu, 8 Oct 2020 21:47:57 +0100 (BST) Received: from [106.210.88.70] (unknown [106.210.88.70]) by eusmtip2.samsung.com (KnoxPortal) with ESMTPA id 20201008204757eusmtip25befbd31c9de3c740bb5d35160126da8~8H2LqHEZN0472104721eusmtip2T; Thu, 8 Oct 2020 20:47:57 +0000 (GMT) To: Honnappa Nagarahalli , David Hunt , Bruce Richardson Cc: "dev@dpdk.org" , "stable@dpdk.org" , nd From: Lukasz Wojciechowski Message-ID: Date: Thu, 8 Oct 2020 22:47:48 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1 MIME-Version: 1.0 In-Reply-To: Content-Transfer-Encoding: 8bit Content-Language: en-US X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFprDKsWRmVeSWpSXmKPExsWy7djP87r/qurjDdYu17G4screom/SRyaL d5+2M1nMfNrCbnFmeQ+zxb+OP+wObB5r5q1h9Pi1YCmrx+I9L5kCmKO4bFJSczLLUov07RK4 MlqXT2QrOBhYsevjVNYGxr0OXYycHBICJhIP115h6WLk4hASWMEosb9rGxOE84VR4vqt7ewQ zmdGicc75rDBtLxc/x8qsZxR4vWvTWwQzltGiVMd34CGcXAIC0RITD+hDhIXEehmlJh09gwj SDezQKTEyTOv2EFsNgFbiSMzv7KC2LwCbhLn5h8Aq2ERUJG48+48E4gtKhAnMWFjCwtEjaDE yZlPwGxOAXeJ439+s0LMlJdo3jqbGcIWl7j1ZD7YDxICi9gljt3vYQM5SELARWLP1iSID4Ql Xh3fwg5hy0icntzDAlG/jVHi6u+fjBDOfmAA9K6AqrKWOPzvN9ggZgFNifW79CHCjhKH/rxg hpjPJ3HjrSDEDXwSk7ZNhwrzSnS0CUFU60k87ZnKCLP2z9onLBMYlWYh+WwWkm9mIflmFsLe BYwsqxjFU0uLc9NTi43yUsv1ihNzi0vz0vWS83M3MQLTzOl/x7/sYNz1J+kQowAHoxIP7wr+ +ngh1sSy4srcQ4wSHMxKIrxOZ0/HCfGmJFZWpRblxxeV5qQWH2KU5mBREuc1XvQyVkggPbEk NTs1tSC1CCbLxMEp1cDobPfb4c7KECsL/xmy1juca0qjql6E1b+P/7iXmd1/auFuY6VYtYnx V53ijZY8T0iPy5JqPLNcyjz/Q9zJvVrPnpVvWCx8UjyOd/outvJ3kWtzpc0kf0tV/Z+898/M GA4e98ZTDBamF8vMnRLWBIQuuKNzolHsa+bPe9eN/yyZ8K+WLfvhi69KLMUZiYZazEXFiQDO mc1RLwMAAA== X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFlrHIsWRmVeSWpSXmKPExsVy+t/xe7p/q+rjDRav1LK4screom/SRyaL d5+2M1nMfNrCbnFmeQ+zxb+OP+wObB5r5q1h9Pi1YCmrx+I9L5kCmKP0bIryS0tSFTLyi0ts laINLYz0DC0t9IxMLPUMjc1jrYxMlfTtbFJSczLLUov07RL0MlqXT2QrOBhYsevjVNYGxr0O XYycHBICJhIv1/9n72Lk4hASWMooMft1P3MXIwdQQkbiwyUBiBphiT/Xutggal4zSqy/PYUV pEZYIEJi+gl1kLiIQC+jxPauuUwgDcwCkRJTFsxmhmhYyCzxZd0ZsASbgK3EkZlfWUFsXgE3 iXPzDzCC2CwCKhJ33p0HqxEViJP4MbGXDaJGUOLkzCcsIDangLvE8T+/WSEWmEnM2/yQGcKW l2jeOhvKFpe49WQ+0wRGoVlI2mchaZmFpGUWkpYFjCyrGEVSS4tz03OLjfSKE3OLS/PS9ZLz czcxAuNq27GfW3Ywdr0LPsQowMGoxMO7gr8+Xog1say4MvcQowQHs5IIr9PZ03FCvCmJlVWp RfnxRaU5qcWHGE2BnpvILCWanA+M+bySeENTQ3MLS0NzY3NjMwslcd4OgYMxQgLpiSWp2amp BalFMH1MHJxSDYwME1g2np8+8cnkDRvum50yVmTT/PHVvt9gs8Mz6zNzvWZeuvVJrLHKzHr/ k/wHpXb/f+SkHubZ/q3AL7Fp8VKDLTF1Nru5DXoEchNdDKbu+3jrG0di1qxQ2YkJxWev2Ew2 Puqg0lPj+nxVxOL8e71cexTWfm1ifmh/78hN825BDaOWk/aCNZFKLMUZiYZazEXFiQBd0E3u wQIAAA== X-CMS-MailID: 20201008204757eucas1p2ad2c1fed17a322f2ab55fec5157172b1 X-Msg-Generator: CA Content-Type: text/plain; charset="utf-8" X-RootMTR: 20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5 X-EPHeader: CA CMS-TYPE: 201P X-CMS-RootMailID: 20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5 References: <20200923132541.21417-1-l.wojciechow@partner.samsung.com> <20200925224209.12173-1-l.wojciechow@partner.samsung.com> <20200925224209.12173-3-l.wojciechow@partner.samsung.com> Subject: Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Honappa, I pushed v5 of the patches today. However all I fixed in this patch is replacing memset to loop of operations on atomic. I didn't move clearing and checking the statistics out of the test yet (to make sure no worker is running). After fixing few more distributor issues I wasn't able now to catch even a single failure using the ACQ/REL memory models. I'll test it maybe tomorrow with RELAXED to see if it will work. Best regards Lukasz W dniu 02.10.2020 o 13:25, Lukasz Wojciechowski pisze: > Hi Honnappa, > > Many thanks for the review! > > I'll write my answers here not inline as it would be easier to read them > in one place, I think. > So first of all I agree with you in 2 things: > 1) all uses of statistics must be atomic and lack of that caused most of > the problems > 2) it would be better to replace barrier and memset in > clear_packet_count() with atomic stores as you suggested > > So I will apply both of above. > > However I wasn't not fully convinced on changing acquire/release to > relaxed. It wood be perfectly ok > if it would look like in this Herb Sutter's example: > https://youtu.be/KeLBd2EJLOU?t=4170 > But in his case the counters are cleared before worker threads start and > are printout after they are completed. > > In case of the dpdk distributor tests both worker and main cores are > running at the same time. In the sanity_test, the statistics are cleared > and verified few times for different hashes of packages. The worker > cores are not stopped at this time and they continue their loops in > handle procedure. Verification made in main core is an exchange of data > as the current statistics indicate how the test will result. > > So as I wasn't convinced, I run some tests with both both relaxed and > acquire/release modes and they both fail :( > The failures caused by statistics errors to number of tests ratio for > 200000 tests was: > for relaxed: 0,000790562 > for acq/rel: 0,000091321 > > > That's why I'm going to modify tests in such way, that they would: > 1) clear statistics > 2) launch worker threads > 3) run test > 4) wait for workers procedures to complete > 5) check stats, verify results and print them out > > This way worker main core will use (clear or verify) stats only when > there are no worker threads. This would make things simpler and allowing > to focus on testing the distributor not tests. And of course relaxed > mode would be enough! > > > Best regards > Lukasz > > > W dniu 29.09.2020 o 07:49, Honnappa Nagarahalli pisze: >> >> >>> Statistics of handled packets are cleared and read on main lcore, while they >>> are increased in workers handlers on different lcores. >>> >>> Without synchronization occasionally showed invalid values. >>> This patch uses atomic acquire/release mechanisms to synchronize. >> In general, load-acquire and store-release memory orderings are required while synchronizing data (that cannot be updated atomically) between threads. In the situation, making counters atomic is enough. >> >>> Fixes: c3eabff124e6 ("distributor: add unit tests") >>> Cc: bruce.richardson@intel.com >>> Cc: stable@dpdk.org >>> >>> Signed-off-by: Lukasz Wojciechowski >>> Acked-by: David Hunt >>> --- >>> app/test/test_distributor.c | 39 ++++++++++++++++++++++++------------- >>> 1 file changed, 26 insertions(+), 13 deletions(-) >>> >>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index >>> 35b25463a..0e49e3714 100644 >>> --- a/app/test/test_distributor.c >>> +++ b/app/test/test_distributor.c >>> @@ -43,7 +43,8 @@ total_packet_count(void) { >>> unsigned i, count = 0; >>> for (i = 0; i < worker_idx; i++) >>> - count += worker_stats[i].handled_packets; >>> + count += >>> __atomic_load_n(&worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE); >> RELAXED memory order is sufficient. For ex: the worker threads are not 'releasing' any data that is not atomically updated to the main thread. >> >>> return count; >>> } >>> >>> @@ -52,6 +53,7 @@ static inline void >>> clear_packet_count(void) >>> { >>> memset(&worker_stats, 0, sizeof(worker_stats)); >>> + rte_atomic_thread_fence(__ATOMIC_RELEASE); >> Ideally, the counters should be set to 0 atomically rather than using a memset. >> >>> } >>> >>> /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@ >>> handle_work(void *arg) >>> num = rte_distributor_get_pkt(db, id, buf, buf, num); >>> while (!quit) { >>> __atomic_fetch_add(&worker_stats[id].handled_packets, >>> num, >>> - __ATOMIC_RELAXED); >>> + __ATOMIC_ACQ_REL); >> Using the __ATOMIC_ACQ_REL order does not mean anything to the main thread. The main thread might still see the updates from different threads in different order. >> >>> count += num; >>> num = rte_distributor_get_pkt(db, id, >>> buf, buf, num); >>> } >>> __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>> - __ATOMIC_RELAXED); >>> + __ATOMIC_ACQ_REL); >> Same here, do not see why this change is required. >> >>> count += num; >>> rte_distributor_return_pkt(db, id, buf, num); >>> return 0; >>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct >>> rte_mempool *p) >>> >>> for (i = 0; i < rte_lcore_count() - 1; i++) >>> printf("Worker %u handled %u packets\n", i, >>> - worker_stats[i].handled_packets); >>> + __atomic_load_n(&worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE)); >> __ATOMIC_RELAXED is enough. >> >>> printf("Sanity test with all zero hashes done.\n"); >>> >>> /* pick two flows and check they go correctly */ @@ -159,7 +162,9 >>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p) >>> >>> for (i = 0; i < rte_lcore_count() - 1; i++) >>> printf("Worker %u handled %u packets\n", i, >>> - worker_stats[i].handled_packets); >>> + __atomic_load_n( >>> + &worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE)); >> __ATOMIC_RELAXED is enough >> >>> printf("Sanity test with two hash values done\n"); >>> } >>> >>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct >>> rte_mempool *p) >>> >>> for (i = 0; i < rte_lcore_count() - 1; i++) >>> printf("Worker %u handled %u packets\n", i, >>> - worker_stats[i].handled_packets); >>> + __atomic_load_n(&worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE)); >> __ATOMIC_RELAXED is enough >> >>> printf("Sanity test with non-zero hashes done\n"); >>> >>> rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15 >>> +286,17 @@ handle_work_with_free_mbufs(void *arg) >>> buf[i] = NULL; >>> num = rte_distributor_get_pkt(d, id, buf, buf, num); >>> while (!quit) { >>> - worker_stats[id].handled_packets += num; >>> count += num; >>> + __atomic_fetch_add(&worker_stats[id].handled_packets, >>> num, >>> + __ATOMIC_ACQ_REL); >> IMO, the problem would be the non-atomic update of the statistics. So, __ATOMIC_RELAXED is enough >> >>> for (i = 0; i < num; i++) >>> rte_pktmbuf_free(buf[i]); >>> num = rte_distributor_get_pkt(d, >>> id, buf, buf, num); >>> } >>> - worker_stats[id].handled_packets += num; >>> count += num; >>> + __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>> + __ATOMIC_ACQ_REL); >> Same here, the problem is non-atomic update of the statistics, __ATOMIC_RELAXED is enough. >> Similarly, for changes below, __ATOMIC_RELAXED is enough. >> >>> rte_distributor_return_pkt(d, id, buf, num); >>> return 0; >>> } >>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg) >>> /* wait for quit single globally, or for worker zero, wait >>> * for zero_quit */ >>> while (!quit && !(id == zero_id && zero_quit)) { >>> - worker_stats[id].handled_packets += num; >>> count += num; >>> + __atomic_fetch_add(&worker_stats[id].handled_packets, >>> num, >>> + __ATOMIC_ACQ_REL); >>> for (i = 0; i < num; i++) >>> rte_pktmbuf_free(buf[i]); >>> num = rte_distributor_get_pkt(d, >>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg) >>> >>> total += num; >>> } >>> - worker_stats[id].handled_packets += num; >>> count += num; >>> returned = rte_distributor_return_pkt(d, id, buf, num); >>> >>> + __atomic_fetch_add(&worker_stats[id].handled_packets, num, >>> + __ATOMIC_ACQ_REL); >>> if (id == zero_id) { >>> /* for worker zero, allow it to restart to pick up last packet >>> * when all workers are shutting down. >>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg) >>> id, buf, buf, num); >>> >>> while (!quit) { >>> - worker_stats[id].handled_packets += num; >>> count += num; >>> rte_pktmbuf_free(pkt); >>> num = rte_distributor_get_pkt(d, id, buf, buf, num); >>> + >>> __atomic_fetch_add(&worker_stats[id].handled_packets, >>> + num, __ATOMIC_ACQ_REL); >>> } >>> returned = rte_distributor_return_pkt(d, >>> id, buf, num); >>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct >>> worker_params *wp, >>> >>> for (i = 0; i < rte_lcore_count() - 1; i++) >>> printf("Worker %u handled %u packets\n", i, >>> - worker_stats[i].handled_packets); >>> + __atomic_load_n(&worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE)); >>> >>> if (total_packet_count() != BURST * 2) { >>> printf("Line %d: Error, not all packets flushed. " >>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct >>> worker_params *wp, >>> zero_quit = 0; >>> for (i = 0; i < rte_lcore_count() - 1; i++) >>> printf("Worker %u handled %u packets\n", i, >>> - worker_stats[i].handled_packets); >>> + __atomic_load_n(&worker_stats[i].handled_packets, >>> + __ATOMIC_ACQUIRE)); >>> >>> if (total_packet_count() != BURST) { >>> printf("Line %d: Error, not all packets flushed. " >>> -- >>> 2.17.1 -- Lukasz Wojciechowski Principal Software Engineer Samsung R&D Institute Poland Samsung Electronics Office +48 22 377 88 25 l.wojciechow@partner.samsung.com