From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-f67.google.com (mail-ed1-f67.google.com [209.85.208.67]) by dpdk.org (Postfix) with ESMTP id 546522C54 for ; Tue, 30 Oct 2018 11:19:47 +0100 (CET) Received: by mail-ed1-f67.google.com with SMTP id b7-v6so9945362edd.9 for ; Tue, 30 Oct 2018 03:19:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=netronome-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=4i5jgZMfwA51j/uym1/ijiruJDRKBnmheWmiKZ/ZhmE=; b=E0ERz6T1bt9nrYBlRxabovDjjLY2awP/rgpCfQggGDNZ0Pl8f5e2j6TKPH0X50q+fD n6gXRBL8OrVf13Q3WtbLr+HecpNbN9ncM4k86aTJAPtwGStp27lHEDs1ON7AnCbHFOzT 0NTeG2sYKAsDHBbU3In2keUnlzLAn3SkqnFtFi2lf1OFynBNGi2xKc7b4lhLYzvGSED/ 4WUtMyrsRPcTXz9ikkFFx2ENqe5FGyC8WoPbfiCVyiDbUC+/1Rg2h1A+OVXQCvr9Y0C4 6bzfRgm3j1VLX3iIvDUvddhr4XJAFHxknPkiHF3/nkwrH1pJZrPTrCScuRGF3vDAwffq LZOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=4i5jgZMfwA51j/uym1/ijiruJDRKBnmheWmiKZ/ZhmE=; b=Xh6JztvAiFjbq5Q0xbee+m/hbcxbHWzQjru0YyEl+X0ew5BIxxFfXtr+GngxE+3rcc CFq8Y3sqNMnKn6FnclBUOCSNjXV79vpaQad5ZE0lWx0Y6G4PktXPS39bs7wIHU2LjbgY MyHATU2jLBpYRk62Lt71n3DgGOmREIIwZZxyipy4Kb9lL9H5EdkwptBftUSPiee8FJ0Z 06FI6FjgCBckyeypLTW3ukakaGbqB4m55yFdC8tSDBRcrzXpHUYEyzEvpgugCAa6SCmf Pe2CPi+MuBM1D53PW7GyxCxGGnGSIyX4YpeFfxKbrx3Icgy2LHYkf5RiUk+2UzEmtxAH xwaQ== X-Gm-Message-State: AGRZ1gKxeAAiy64swmVM9DnlGDlb/TRLJUnqH+T+vmnEsxYRRBZV6Pv9 DaTZV0ezvUr8wkUMR4c1gov5ujzLSSnLEIdTDheIjg== X-Google-Smtp-Source: AJdET5cRsy8qVC73koNhsAR0GuOcs8IS07ssfsWMC6oKfvRif8vIbar5ymDXfzTEZTZEyDrAM373Az+UJObuqtU1XE0= X-Received: by 2002:aa7:c0c4:: with SMTP id j4-v6mr13779872edp.173.1540894787032; Tue, 30 Oct 2018 03:19:47 -0700 (PDT) MIME-Version: 1.0 References: <1538743527-8285-1-git-send-email-alejandro.lucero@netronome.com> <2DBBFF226F7CF64BAFCA79B681719D954502B94F@shsmsx102.ccr.corp.intel.com> <3483377.PMXnpSGLS9@xps> <30339c03-6ec2-f72a-d113-5b150f441bf9@intel.com> In-Reply-To: <30339c03-6ec2-f72a-d113-5b150f441bf9@intel.com> From: Alejandro Lucero Date: Tue, 30 Oct 2018 10:19:35 +0000 Message-ID: To: "Burakov, Anatoly" Cc: Thomas Monjalon , lei.a.yao@intel.com, dev , "Xu, Qian Q" , xueqin.lin@intel.com, Ferruh Yigit Content-Type: text/plain; charset="UTF-8" X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 30 Oct 2018 10:19:47 -0000 On Tue, Oct 30, 2018 at 10:11 AM Burakov, Anatoly wrote: > On 29-Oct-18 2:18 PM, Thomas Monjalon wrote: > > 29/10/2018 14:40, Alejandro Lucero: > >> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A wrote: > >>> *From:* Alejandro Lucero [mailto:alejandro.lucero@netronome.com] > >>> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon > >>> wrote: > >>> > >>> 29/10/2018 12:39, Alejandro Lucero: > >>>> I got a patch that solves a bug when calling rte_eal_dma_mask using > the > >>>> mask instead of the maskbits. However, this does not solves the > >>> deadlock. > >>> > >>> The deadlock is a bigger concern I think. > >>> > >>> I think once the call to rte_eal_check_dma_mask uses the maskbits > instead > >>> of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock. > >>> > >>> Yao, can you try with the attached patch? > >>> > >>> Hi, Lucero > >>> > >>> This patch can fix the issue at my side. Thanks a lot > >>> for you quick action. > >> > >> Great! > >> > >> I will send an official patch with the changes. > > > > Please, do not forget my other request to better comment functions. > > > > > >> I have to say that I tested the patchset, but I think it was where > >> legacy_mem was still there and therefore dynamic memory allocation code > not > >> used during memory initialization. > >> > >> There is something that concerns me though. Using > >> rte_memseg_walk_thread_unsafe could be a problem under some situations > >> although those situations being unlikely. > >> > >> Usually, calling rte_eal_check_dma_mask happens during initialization. > Then > >> it is safe to use the unsafe function for walking memsegs, but with > device > >> hotplug and dynamic memory allocation, there exists a potential race > >> condition when the primary process is allocating more memory and > >> concurrently a device is hotplugged and a secondary process does the > device > >> initialization. By now, this is just a problem with the NFP, and the > >> potential race condition window really unlikely, but I will work on this > >> asap. > > > > Yes, this is what concerns me. > > You can add a comment explaining the unsafe which is not handled. > > The issue here is that this code is called from both memory-locked and > memory-unlocked context. Virtio had a similar issue with their mem table > update code - they solved it by manually locking the memory before doing > everything else, and using thread_unsafe version of the walk. > > Could something like that be done here? > > I have a patch adding a safe and an unsafe dma mask check versions. However, because the multiprocess problem reported, I think the fixing requires other type of work. The problem I see now is calling rte_eal_check_dma_mask from set_iova_mode code path is wrong. This can not be done at that point because the memory has not been initialized yet. > > > > > >>>> Interestingly, the problem looks like a compiler one. Calling > >>>> rte_memseg_walk does not return when calling inside rt_eal_dma_mask, > >>> but if > >>>> you modify the call like this: > >>>> > >>>> - if (rte_memseg_walk(check_iova, &mask)) > >>>> + if (!rte_memseg_walk(check_iova, &mask)) > >>>> > >>>> it works, although the value returned to the invoker changes, of > course. > >>>> But the point here is it should be the same behaviour when calling > >>>> rte_memseg_walk than before and it is not. > >>> > >>> Anyway, the coding style requires to save the return value in a > variable, > >>> instead of nesting the call in an "if" condition. > >>> And the "if" check should be explicitly != 0 because it is not a real > >>> boolean. > >>> > >>> PS: please do not top post and avoid HTML emails, thanks > >>> > >>> > >> > > > > > > > > > > > > > > > -- > Thanks, > Anatoly >