From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Burakov, Anatoly"
To: Thomas Monjalon
Cc: dev@dpdk.org, james.r.harris@intel.com
Date: Fri, 29 Jan 2021 16:07:27 +0000
Subject: Re: [dpdk-dev] [PATCH] mem: fix deadlock on secondary allocation
References: <4e0688f841f6ba2408fde949aabce8e36c0d46f0.1611934186.git.anatoly.burakov@intel.com> <16449461.VCXEoCD3cp@thomas>
In-Reply-To: <16449461.VCXEoCD3cp@thomas>
List-Id: DPDK patches and discussions

On 29-Jan-21 3:40 PM, Thomas Monjalon wrote:
> 29/01/2021 16:29, Anatoly Burakov:
>> Previous fix used `rte_malloc_heap_socket_is_external()` to check if the
>> heap was an external heap. However, that API is thread-safe, and when
>> we're inside the allocation process, we're already write-locked, so
>> calling `rte_malloc_heap_socket_is_external()` will result in a
>> deadlock followed by a timeout.
>>
>> Fix it by replacing the API call with a check against maximum number of
>> NUMA nodes, because external heaps always have higher socket ID's.
>
> Is there some unit tests for such thing?

I couldn't reproduce this using autotests, but Jim has SPDK tests which
triggered this error. Since this depends on a secondary process, any
test would necessarily have to be manual here, I think.

>
>>
>> Fixes: 7ac31e82bc8f ("mem: improve parameter checking on memory hotplug")
>>
>> Reported-by: Jim Harris
>>
>
> No need of blank line here.

Need to update my scripts :P

>
>> Signed-off-by: Anatoly Burakov
>> ---
>>  lib/librte_eal/common/malloc_mp.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_eal/common/malloc_mp.c b/lib/librte_eal/common/malloc_mp.c
>> index 0b19d4d5fb..b1f7f7824b 100644
>> --- a/lib/librte_eal/common/malloc_mp.c
>> +++ b/lib/librte_eal/common/malloc_mp.c
>> -	/* for allocations, we must only use internal heaps */
>> -	if (rte_malloc_heap_socket_is_external(heap->socket_id)) {
>> +	/*
>> +	 * for allocations, we must only use internal heaps, but since the
>> +	 * rte_malloc_heap_socket_is_external() is thread-safe and we're already
>> +	 * read-locked, we'll have to take advantage of the fac that internal
>
> fac -> fact?
>

Yes.

>> +	 * socket ID's are always lower than RTE_MAX_NUMA_NODES.
>> +	 */
>> +	if (heap->socket_id >= RTE_MAX_NUMA_NODES) {
>
>
>

-- 
Thanks,
Anatoly
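
P.S. For anyone following the thread, here is a rough standalone sketch of the idea behind the fix. It is not the actual malloc_mp.c code. Every `sketch_` name is hypothetical, `SKETCH_MAX_NUMA_NODES` is an assumed stand-in for `RTE_MAX_NUMA_NODES` (the value 32 is only an example), and a C11 `atomic_flag` stands in for the real hotplug lock. The point it illustrates: once the handler holds a non-recursive lock, a helper that takes the same lock can never get it, so the external-heap check has to be done without locking, by comparing the socket ID against the NUMA node limit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: stand-in for DPDK's build-time RTE_MAX_NUMA_NODES. */
#define SKETCH_MAX_NUMA_NODES 32

static_assert(SKETCH_MAX_NUMA_NODES > 0, "need at least one NUMA node");

/* Hypothetical stand-in for the (non-recursive) memory hotplug lock. */
static atomic_flag sketch_hotplug_lock = ATOMIC_FLAG_INIT;

/* Try to take the lock once; returns false if it is already held. */
static bool
sketch_trylock(void)
{
	return !atomic_flag_test_and_set(&sketch_hotplug_lock);
}

static void
sketch_unlock(void)
{
	atomic_flag_clear(&sketch_hotplug_lock);
}

/* Hypothetical, reduced heap descriptor: only the field the check needs. */
struct sketch_heap {
	int socket_id;
};

/*
 * Lock-free check. It relies on the premise stated in the patch: external
 * heaps always get socket IDs at or above the NUMA node limit.
 */
static bool
sketch_heap_is_external(const struct sketch_heap *heap)
{
	return heap->socket_id >= SKETCH_MAX_NUMA_NODES;
}

/*
 * Models the allocation request handler.
 * Returns 0 if the heap may serve the allocation, -1 otherwise.
 */
static int
sketch_handle_alloc(const struct sketch_heap *heap)
{
	int ret;

	if (!sketch_trylock())
		return -1; /* lock is already held elsewhere */

	/*
	 * The lock is held from here on. A helper that takes the same lock
	 * would wait forever, so only the lock-free check is used.
	 */
	ret = sketch_heap_is_external(heap) ? -1 : 0;

	sketch_unlock();
	return ret;
}
```

With these assumptions, `sketch_handle_alloc()` accepts socket IDs 0 through 31 and rejects 32 and above, and a second `sketch_trylock()` while the lock is held fails, which is the toy equivalent of the deadlock described above.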