From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 60496A0540;
	Mon, 20 Jul 2020 14:52:02 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id D026529CB;
	Mon, 20 Jul 2020 14:52:01 +0200 (CEST)
Received: from mailout1.w1.samsung.com (mailout1.w1.samsung.com
 [210.118.77.11]) by dpdk.org (Postfix) with ESMTP id EA12B1023
 for <dev@dpdk.org>; Mon, 20 Jul 2020 14:52:00 +0200 (CEST)
Received: from eucas1p1.samsung.com (unknown [182.198.249.206])
 by mailout1.w1.samsung.com (KnoxPortal) with ESMTP id
 20200720125200euoutp012ab46d540e22bb46d5fae285475282cb~jdvyjB3aD2075320753euoutp01E
 for <dev@dpdk.org>; Mon, 20 Jul 2020 12:52:00 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 mailout1.w1.samsung.com
 20200720125200euoutp012ab46d540e22bb46d5fae285475282cb~jdvyjB3aD2075320753euoutp01E
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com;
 s=mail20170921; t=1595249520;
 bh=dih+5fa+O0dvGCMx3EbmUpAmJAOHKZNi1naA5+bIUrM=;
 h=Subject:To:Cc:From:Date:In-Reply-To:References:From;
 b=pqTegOfq72wA0X33vXTdQWhV9BL9lCNLdOVbVwpGESAChRwp/vzQh16UBO+hyeIru
 +6K5feZX2KtOaozp0w7vz6X0kM8x/7y5+GH0zClKPOb2NQtwQJyyCU0RoXzvGkBOJP
 j2Kfc5sgQsSOCEs90e0Kdxb87SOZgrF5lc/JHUQo=
Received: from eusmges1new.samsung.com (unknown [203.254.199.242]) by
 eucas1p1.samsung.com (KnoxPortal) with ESMTP id
 20200720125200eucas1p1d7f4b59a9380f226814251baf7ffcc03~jdvyWB4633096630966eucas1p1S;
 Mon, 20 Jul 2020 12:52:00 +0000 (GMT)
Received: from eucas1p2.samsung.com ( [182.198.249.207]) by
 eusmges1new.samsung.com (EUCPMTA) with SMTP id E1.AE.06456.073951F5; Mon, 20
 Jul 2020 13:52:00 +0100 (BST)
Received: from eusmtrp1.samsung.com (unknown [182.198.249.138]) by
 eucas1p2.samsung.com (KnoxPortal) with ESMTPA id
 20200720125159eucas1p217be2baca613a022ae05d6ba59c6e1eb~jdvx0rmJ81998619986eucas1p2V;
 Mon, 20 Jul 2020 12:51:59 +0000 (GMT)
Received: from eusmgms2.samsung.com (unknown [182.198.249.180]) by
 eusmtrp1.samsung.com (KnoxPortal) with ESMTP id
 20200720125159eusmtrp1e2fe1cf6d523448d7178def59c44e054~jdvxz6F1l1012410124eusmtrp1a;
 Mon, 20 Jul 2020 12:51:59 +0000 (GMT)
X-AuditID: cbfec7f2-7efff70000001938-b5-5f159370951c
Received: from eusmtip1.samsung.com ( [203.254.199.221]) by
 eusmgms2.samsung.com (EUCPMTA) with SMTP id 09.13.06017.F63951F5; Mon, 20
 Jul 2020 13:51:59 +0100 (BST)
Received: from [106.210.88.70] (unknown [106.210.88.70]) by
 eusmtip1.samsung.com (KnoxPortal) with ESMTPA id
 20200720125158eusmtip1f731f147c7798be296ef6b04a2cc59ae~jdvw6RZSG0037600376eusmtip1e;
 Mon, 20 Jul 2020 12:51:58 +0000 (GMT)
To: Harry van Haaren <harry.van.haaren@intel.com>, dev@dpdk.org
Cc: david.marchand@redhat.com, igor.romanov@oktetlabs.ru,
 honnappa.nagarahalli@arm.com, ferruh.yigit@intel.com, nd@arm.com,
 aconole@redhat.com
From: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Message-ID: <fbd9c539-24c1-8cd8-988f-56dd423f892d@partner.samsung.com>
Date: Mon, 20 Jul 2020 14:51:56 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <20200720120938.34660-1-harry.van.haaren@intel.com>
Content-Transfer-Encoding: 8bit
Content-Language: en-US
X-Brightmail-Tracker: H4sIAAAAAAAAA01SfyyUcRzue+97d6+bs6/DfKbGXPlDFln+eDcRLbota/6prV+uw7tjzrE7
 RLPF6AdpCWXOhukHMxxNzuTIXZLdtVYRE2WNK78ionFS7l6W/57P83k+3+d5ti9FiFq4HlSi
 Mo1RKWUKMU9Atr9ae3sotdRNerjV6Eivz03waV19IY/+saTj0GN6E5/OvbtK0hVT+XzaUJpE
 m+uKiDBK0ljViCTrNY+5kodd0xzJ8MoMV7LQPcSL5p4XHI1nFIkZjCog9LIgocXYRqZ2e2be
 3HjBy0FlUIgcKMBBYF7PJ21YhOsRNPXGFiLBFv6F4Of1Dzx2WEagX7Tydy6sA018dlGHINc4
 RLDDPILO1V7CpnLBEfD6j5ljw674GCwPmjg2EYFLEFimGu1P8XAIvKxY4dqwEEdCnb5wS0RR
 JPYB/TdvG+2GY0A7reOwEmcYqJi0Z3XAYaDLH7OfEtgL8p5VEix2h9HJarsXYAMfvjY0c9jY
 J6BjtgWx2AVm+tu26+wDU2kRyR60IxiyriF26EEwfKd+WxUMxk0rz5aOwL6g7Qxg6XCYM7/j
 22jATjAy78yGcIKS9nKCpYVw64aIVfvDVNF9tGO70TRJFiOxZlc1za46ml11NP99axDZgNyZ
 dHWynFEHKpkr/mpZsjpdKfePS0l+irb+kWmzf6kDrbyPNSBMIbGjcPG2m1TElWWos5INCChC
 7Co8/sYUIxLGy7KuMqoUqSpdwagNaC9Fit2FR2qnL4mwXJbGJDFMKqPa2XIoB48cdI774JMv
 ZzZi/XnuhQyL/nOAJ/P3iXZu0CL1k8dmf5cox/NA0ayPLlB1lnm3xhkfaQ6UjbfRDX3XMoO/
 eBV4nao9HR7ycWSiazDKtWetSru/Mk9nqd3Tp0krHq/UbQbRUc73ZHP4wFlrts/o74Xyk2mR
 ocVav4s5bo4L1TVnxsSkOkEWeJBQqWX/AMnqXmxDAwAA
X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrEIsWRmVeSWpSXmKPExsVy+t/xu7r5k0XjDXruKVn8evOA3WL7ii42
 i3eftjNZ3Nl7mt2isf8bi8XMpy3sFocmZ1ucWd7D7MDhsWbeGkaPXwuWsnos3vOSyeP611es
 Hu/3XWULYI3SsynKLy1JVcjILy6xVYo2tDDSM7S00DMysdQzNDaPtTIyVdK3s0lJzcksSy3S
 t0vQy9hweAtLwT65ivY/B9gaGKdIdDFyckgImEj8PrmWvYuRi0NIYCmjxOVTKxm7GDmAEjIS
 Hy4JQNQIS/y51sUGUfOaUeL+4aVMIAlhAVeJE3/PgNkiAvYSn6+cZgIpYhaYxCjxr3MBK0TH
 ZEaJk2/Ws4NUsQnYShyZ+ZUVxOYVcJNYvreLCWQbi4CqxN7niiBhUYE4ieVb5rNDlAhKnJz5
 hAXE5hRwkNjecgeslVnATGLe5ofMELa8RPPW2VC2uMStJ/OZJjAKzULSPgtJyywkLbOQtCxg
 ZFnFKJJaWpybnltspFecmFtcmpeul5yfu4kRGH/bjv3csoOx613wIUYBDkYlHt4P3aLxQqyJ
 ZcWVuYcYJTiYlUR4nc6ejhPiTUmsrEotyo8vKs1JLT7EaAr020RmKdHkfGBqyCuJNzQ1NLew
 NDQ3Njc2s1AS5+0QOBgjJJCeWJKanZpakFoE08fEwSnVwHh1Z+W1oFPK77n597wL/xT96Zj1
 zpvVR/sWPfz3p21r+/k/jTpsr2KmNJ989eZ/r4zHJ4VFGxz+lpxbcPzx6hKpE6t/uLzMendj
 6/lq5pUWDuk35KK+/s+qjFs34eaXvb9aXtYLa95yOvP98CPzK0YXr06UMg6O6Ev5IuJwJW+P
 ssuxA0LtjSu9lFiKMxINtZiLihMBmu1j9NUCAAA=
X-CMS-MailID: 20200720125159eucas1p217be2baca613a022ae05d6ba59c6e1eb
X-Msg-Generator: CA
Content-Type: text/plain; charset="utf-8"
X-RootMTR: 20200720120850eucas1p10debcafa273d244a7e63d225c50cc9df
X-EPHeader: CA
CMS-TYPE: 201P
X-CMS-RootMailID: 20200720120850eucas1p10debcafa273d244a7e63d225c50cc9df
References: <CGME20200720120850eucas1p10debcafa273d244a7e63d225c50cc9df@eucas1p1.samsung.com>
 <20200720120938.34660-1-harry.van.haaren@intel.com>
Subject: Re: [dpdk-dev] [PATCH] service: fix stop API to wait for service
	thread
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>


W dniu 20.07.2020 o 14:09, Harry van Haaren pisze:
> This commit improves the service_lcore_stop() implementation,
> waiting for the service core in question to return. The service
> thread itself now has a variable to indicate if its thread is
> active. When zero the service thread has completed its service,
> and has returned from the service_runner_func() function.
>
> This fixes a race condition observed in the DPDK CI, where the
> statistics of the service were not consistent with the expectation
> due to the service thread still running, and incrementing a stat
> after stop was called.
>
> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
>
> ---
>
> This is one possible solution, that avoids a class of race-conditions
> based on stop() api and following behaviours. Without a change in
> implementation of the service core thread, we could not detect when
> the thread was actually finished. This is now possible, and the stop
> api makes use of it to wait for 1000x one millisecond, or log a warning
> that a service core didn't return quickly.
>
> Thanks for the discussion/debug on list - I'm not sure how to add
> reported-by/suggested-by etc tags: but I'll resend a V2 (or can add
> on apply).
>
> ---
>   lib/librte_eal/common/rte_service.c | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
>
> diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
> index 6a0e0ff65..d2255587d 100644
> --- a/lib/librte_eal/common/rte_service.c
> +++ b/lib/librte_eal/common/rte_service.c
> @@ -65,6 +65,7 @@ struct core_state {
>   	/* map of services IDs are run on this core */
>   	uint64_t service_mask;
>   	uint8_t runstate; /* running or stopped */
> +	uint8_t thread_active; /* indicates when the thread is in service_run() */
>   	uint8_t is_service_core; /* set if core is currently a service core */
>   	uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
>   	uint64_t loops;
> @@ -457,6 +458,8 @@ service_runner_func(void *arg)
>   	const int lcore = rte_lcore_id();
>   	struct core_state *cs = &lcore_states[lcore];
>   
> +	__atomic_store_n(&cs->thread_active, 1, __ATOMIC_RELAXED);
> +
>   	/* runstate act as the guard variable. Use load-acquire
>   	 * memory order here to synchronize with store-release
>   	 * in runstate update functions.
> @@ -475,6 +478,7 @@ service_runner_func(void *arg)
>   		cs->loops++;
>   	}
>   
> +	__atomic_store_n(&cs->thread_active, 0, __ATOMIC_RELAXED);
>   	return 0;
>   }
>   
> @@ -765,6 +769,26 @@ rte_service_lcore_stop(uint32_t lcore)
>   	__atomic_store_n(&lcore_states[lcore].runstate, RUNSTATE_STOPPED,
>   		__ATOMIC_RELEASE);
>   
> +	/* wait for service lcore to return */
> +	i = 0;
> +	uint8_t active;
> +	uint64_t start = rte_rdtsc();
> +	do {
> +		active = __atomic_load_n(&lcore_states[lcore].thread_active,
> +					 __ATOMIC_RELAXED);
> +		if (active == 0)
> +			break;
> +		rte_delay_ms(1);
> +		i++;
> +	} while (i < 1000);
> +
> +	if (active != 0) {
> +		uint64_t end = rte_rdtsc();
> +		RTE_LOG(WARNING, EAL,
> +			"service lcore stop() failed, waited for %ld cycles\n",
> +			end - start);
> +	}
> +
>   	return 0;
>   }
>   
I don't like the idea of inserting this polling loop inside API call. 
And I don't like setting up a 1000 iterations constraint.
How about keeping the thread_active flag, but moving checking state of 
this flag to separate function. This way the user of the API would be 
able to write own loop.
Maybe he/she would like a custom loop, because:
* waiting for more cores
* would like to wait longer
* would like to check if service is finished less often...

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com