From: Aaron Conole
To: "Van Haaren, Harry"
Cc: Thomas Monjalon, "Amber, Kumar", "dev@dpdk.org", "Wang, Yipeng1", "Yigit, Ferruh", "Thakur, Sham Singh", David Marchand
References: <20191122182100.15631-1-kumar.amber@intel.com> <2900799.QLPOietlla@xps>
Date: Tue, 26 Nov 2019 10:58:36 -0500
In-Reply-To: (Van Haaren's message of "Tue, 26 Nov 2019 13:19:02 +0000")
Subject: Re: [dpdk-dev] [PATCH v3] hash: added a new API to hash to query key id

"Van Haaren, Harry" writes:

> Hi Aaron,
>
>> -----Original Message-----
>> From: Aaron Conole
>> Sent: Monday, November 25, 2019 10:54 PM
>> To: Thomas Monjalon
>> Cc: Van Haaren, Harry; Amber, Kumar; dev@dpdk.org; Wang, Yipeng1;
>> Yigit, Ferruh; Thakur, Sham Singh; David Marchand
>> Subject: Re: [dpdk-dev] [PATCH v3] hash: added a new API to hash to query
>> key id
>>
>> Aaron Conole writes:
>>
>> > Thomas Monjalon writes:
>> >
>> >>> From: Aaron Conole
>> >>> > -	if (!service_valid(id))
>> >>> > +	if (id >= RTE_SERVICE_NUM_MAX || !service_valid(id))
>> >>
>> >> Why not adding this check in service_valid()?
>> >
>> > I think the best fix is to use SERVICE_VALID_GET_OR_ERR_RET() in these
>> > places.  For this, I at least want to try and show that there aren't any
>> > further errors.  And my test loop has been running for a while now
>> > without any more errors or segfaults, so I guess it's okay to build a
>> > proper patch.
>>
>> This popped up:
>>
>> EAL: Test assert service_lcore_en_dis_able line 487 failed: Ex-service core
>> function call had no effect.
>>
>> So I'll spend some time in this area, it seems.
>
>
> The below diff makes it 100% reproducible here, failing every time.
>
> It seems like the main thread is returning, before the service thread has returned.
>
> The rte_eal_mp_wait_lcore() call seems to not wait on the service-core, which allows
> the main thread to read the "service_remote_launch_flag" value as 0 (before the service-thread writes it to 1).
>
> Adding the delay between the service launch and service write being performed makes this issue much much more likely to occur - so the above description I have confidence in.
>
> What I'm not clear on (yet) is why the eal_mp_wait_lcore() isn't waiting...

As I wrote in the other thread, it's because eal_mp_wait_lcore won't
look at lcores with ROLE_SERVICE.

> -H

I've been running something similar to the suggested patch for 24
minutes now with no failure.  I've also removed the eal_mp_wait_lcore()
call in other areas throughout the test and switched to individual core
waiting "just in case."

I don't think it's the right fix, though.
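
To make the failure mode described above concrete, here is a minimal single-threaded model in plain C. It is only a sketch, not DPDK code: the `model_*` types and functions are hypothetical stand-ins invented for illustration, and "waiting" on an lcore is modelled as letting its pending function run to completion. It assumes, as stated above, that the wait-all helper skips lcores whose role is the service role.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for lcore roles and states; NOT the DPDK types. */
enum model_role  { MODEL_ROLE_RTE, MODEL_ROLE_SERVICE };
enum model_state { MODEL_RUNNING, MODEL_FINISHED };

struct model_lcore {
	enum model_role role;
	enum model_state state;
	int *flag; /* set to 1 when the lcore's function completes, or NULL */
};

/* Waiting on one lcore: in this model, run its pending function to the end. */
static void model_wait_lcore(struct model_lcore *lc)
{
	assert(lc != NULL);
	if (lc->state == MODEL_RUNNING) {
		if (lc->flag != NULL)
			*lc->flag = 1;
		lc->state = MODEL_FINISHED;
	}
}

/* Wait-all helper that only considers MODEL_ROLE_RTE lcores (assumption
 * taken from the description in this thread). Service lcores are skipped. */
static void model_mp_wait_lcore(struct model_lcore *lcores, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (lcores[i].role == MODEL_ROLE_RTE)
			model_wait_lcore(&lcores[i]);
}

/* Returns the flag value the main thread would read after waiting. */
static int model_observe_flag(int wait_individually)
{
	int flag = 0;
	struct model_lcore lcores[2] = {
		{ MODEL_ROLE_RTE,     MODEL_RUNNING, NULL  },
		{ MODEL_ROLE_SERVICE, MODEL_RUNNING, &flag },
	};

	if (wait_individually)
		model_wait_lcore(&lcores[1]);   /* wait on the service lcore itself */
	else
		model_mp_wait_lcore(lcores, 2); /* role filter skips the service lcore */

	return flag;
}
```

In this model, `model_observe_flag(0)` reads the flag before the service lcore has written it, while `model_observe_flag(1)` reads it afterwards, which is the difference between the wait-all call and individual core waiting. The model says nothing about whether individual waiting is the right fix in the real test.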