To: David Hunt, Bruce Richardson
Cc: dev@dpdk.org, stable@dpdk.org, Lukasz Wojciechowski
From: Lukasz Wojciechowski
Message-ID: <8b4bb66c-f280-2bf8-74a2-99c017d90a78@partner.samsung.com>
Date: Thu, 17 Sep 2020 16:01:24 +0200
References: <20200915193449.13310-1-l.wojciechow@partner.samsung.com> <20200915193449.13310-2-l.wojciechow@partner.samsung.com>
Subject: Re: [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test

Hi David,

On 17.09.2020 at 13:21, David Hunt wrote:
> Hi Lukasz,
>
> On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
>> The sanity test with worker shutdown delegates all bufs
>> to be processed by a single lcore worker, then it freezes
>> one of the lcore workers and continues to send
>> more bufs.
>>
>> The problem occurred if the frozen lcore is the same as the one
>> that is processing the mbufs. The lcore processing mbufs
>> might be different every time the test is launched.
>> This is caused by keeping the value of the wkr static variable
>> in the rte_distributor_process function between test case runs.
>>
>> The test always froze the lcore with id 0. The patch avoids the
>> possible collision by freezing the lcore with zero_idx. The lcore
>> that receives the data updates zero_idx, so it is not frozen
>> itself.
>>
>> To reproduce the issue fixed by this patch, please run the
>> distributor_autotest command in the test app several times in a row.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski
>> ---
>>  app/test/test_distributor.c | 22 ++++++++++++++++++++--
>>  1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
>> index ba1f81cf8..35b25463a 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -28,6 +28,7 @@ struct worker_params worker_params;
>>  static volatile int quit;      /**< general quit variable for all threads */
>>  static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
>>  static volatile unsigned worker_idx;
>> +static volatile unsigned zero_idx;
>>
>>  struct worker_stats {
>>      volatile unsigned handled_packets;
>> @@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
>>      unsigned int total = 0;
>>      unsigned int i;
>>      unsigned int returned = 0;
>> +    unsigned int zero_id = 0;
>>      const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>>              __ATOMIC_RELAXED);
>>
>>      num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>
>> +    zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>> +    if (id == zero_id && num > 0) {
>> +        zero_id = (zero_id + 1) % __atomic_load_n(&worker_idx,
>> +            __ATOMIC_ACQUIRE);
>> +        __atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>> +    }
>> +
>>      /* wait for quit single globally, or for worker zero, wait
>>       * for zero_quit */
>> -    while (!quit && !(id == 0 && zero_quit)) {
>> +    while (!quit && !(id == zero_id && zero_quit)) {
>>          worker_stats[id].handled_packets += num;
>>          count += num;
>>          for (i = 0; i < num; i++)
>>              rte_pktmbuf_free(buf[i]);
>>          num = rte_distributor_get_pkt(d,
>>                  id, buf, buf, num);
>> +
>> +        zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>> +        if (id == zero_id && num > 0) {
>> +            zero_id = (zero_id + 1) % __atomic_load_n(&worker_idx,
>> +                __ATOMIC_ACQUIRE);
>> +            __atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>> +        }
>> +
>>          total += num;
>>      }
>>      worker_stats[id].handled_packets += num;
>>      count += num;
>>      returned = rte_distributor_return_pkt(d, id, buf, num);
>>
>> -    if (id == 0) {
>> +    if (id == zero_id) {
>>          /* for worker zero, allow it to restart to pick up last packet
>>           * when all workers are shutting down.
>>           */
>> @@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
>>      rte_eal_mp_wait_lcore();
>>      quit = 0;
>>      worker_idx = 0;
>> +    zero_idx = 0;
>>  }
>>
>>  static int
>
>
> Lockup reproducible if you run the distributor_autotest 19 times in
> succession. I was able to run the test many times more than that with
> the patch applied. Thanks.

The number depends on the number of lcores in your test environment.

>
> Tested-by: David Hunt

Thank you very much for reviewing and testing the whole series.
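
For anyone reading the thread later, the zero_idx handoff from the quoted
commit message can be sketched in isolation as below. This is only an
illustration under stated assumptions: update_zero_idx and its nworkers
parameter are hypothetical names that are not part of the patch. The patch
performs the same steps inline in handle_work_for_shutdown_test and takes
the worker count from worker_idx. The sketch relies on the GCC/Clang
__atomic builtins, as the patch does.

```c
#include <assert.h>

/* Stand-in for the test's zero_idx: index of the worker that may be frozen. */
static volatile unsigned int zero_idx;

/*
 * Hypothetical helper, not part of the patch. Worker `id` calls it after
 * receiving `num` packets. If this worker is the current freeze candidate
 * and it is actually receiving traffic, the candidate role moves to the
 * next worker (modulo `nworkers`). Returns the candidate index as seen by
 * the caller after the update.
 */
static unsigned int
update_zero_idx(unsigned int id, unsigned int num, unsigned int nworkers)
{
	unsigned int zero_id;

	assert(nworkers > 0);

	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
	if (id == zero_id && num > 0) {
		/* The busy worker must not be frozen: pass the role on. */
		zero_id = (zero_id + 1) % nworkers;
		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
	}
	return zero_id;
}
```

A worker that is not the candidate, or that received no packets, leaves
zero_idx unchanged. Only the candidate that is actually processing mbufs
moves the index forward, so the worker chosen for freezing is never the one
handling the traffic.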

--
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com