From mboxrd@z Thu Jan 1 00:00:00 1970
To: Thomas Monjalon, Alejandro Lucero
Cc: lei.a.yao@intel.com, dev, "Xu, Qian Q", xueqin.lin@intel.com, Ferruh Yigit
References: <1538743527-8285-1-git-send-email-alejandro.lucero@netronome.com> <2DBBFF226F7CF64BAFCA79B681719D954502B94F@shsmsx102.ccr.corp.intel.com> <3483377.PMXnpSGLS9@xps>
From: "Burakov, Anatoly"
Message-ID: <30339c03-6ec2-f72a-d113-5b150f441bf9@intel.com>
Date: Tue, 30 Oct 2018 10:11:55 +0000
In-Reply-To: <3483377.PMXnpSGLS9@xps>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask
List-Id: DPDK patches and discussions

On 29-Oct-18 2:18 PM, Thomas Monjalon wrote:
> 29/10/2018 14:40, Alejandro Lucero:
>> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A wrote:
>>> *From:* Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
>>> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon wrote:
>>>
>>> 29/10/2018 12:39, Alejandro Lucero:
>>>> I got a patch that
>>>> solves a bug when calling rte_eal_dma_mask using the
>>>> mask instead of the maskbits. However, this does not solve the
>>> deadlock.
>>>
>>> The deadlock is a bigger concern I think.
>>>
>>> I think once the call to rte_eal_check_dma_mask uses the maskbits instead
>>> of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
>>>
>>> Yao, can you try with the attached patch?
>>>
>>> Hi, Lucero
>>>
>>> This patch can fix the issue at my side. Thanks a lot
>>> for your quick action.
>>
>> Great!
>>
>> I will send an official patch with the changes.
>
> Please, do not forget my other request to better comment functions.
>
>
>> I have to say that I tested the patchset, but I think it was where
>> legacy_mem was still there and therefore dynamic memory allocation code not
>> used during memory initialization.
>>
>> There is something that concerns me though. Using
>> rte_memseg_walk_thread_unsafe could be a problem under some situations,
>> although those situations are unlikely.
>>
>> Usually, calling rte_eal_check_dma_mask happens during initialization. Then
>> it is safe to use the unsafe function for walking memsegs, but with device
>> hotplug and dynamic memory allocation, there exists a potential race
>> condition when the primary process is allocating more memory and
>> concurrently a device is hotplugged and a secondary process does the device
>> initialization. For now, this is just a problem with the NFP, and the
>> potential race condition window is really unlikely, but I will work on this
>> asap.
>
> Yes, this is what concerns me.
> You can add a comment explaining the unsafe which is not handled.

The issue here is that this code is called from both memory-locked and
memory-unlocked contexts. Virtio had a similar issue with their mem table
update code - they solved it by manually locking the memory before doing
everything else, and using the thread_unsafe version of the walk. Could
something like that be done here?
>
>
>>>> Interestingly, the problem looks like a compiler one. Calling
>>>> rte_memseg_walk does not return when calling inside rt_eal_dma_mask,
>>> but if
>>>> you modify the call like this:
>>>>
>>>> - if (rte_memseg_walk(check_iova, &mask))
>>>> + if (!rte_memseg_walk(check_iova, &mask))
>>>>
>>>> it works, although the value returned to the invoker changes, of course.
>>>> But the point here is it should be the same behaviour when calling
>>>> rte_memseg_walk than before and it is not.
>>>
>>> Anyway, the coding style requires to save the return value in a variable,
>>> instead of nesting the call in an "if" condition.
>>> And the "if" check should be explicitly != 0 because it is not a real
>>> boolean.
>>>
>>> PS: please do not top post and avoid HTML emails, thanks
>>>
>>>
>>
>

-- 
Thanks,
Anatoly