From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <alejandro.lucero@netronome.com>
Received: from mail-ed1-f66.google.com (mail-ed1-f66.google.com
 [209.85.208.66]) by dpdk.org (Postfix) with ESMTP id ED8511C01
 for <dev@dpdk.org>; Mon, 29 Oct 2018 20:38:06 +0100 (CET)
Received: by mail-ed1-f66.google.com with SMTP id t10-v6so8374444eds.12
 for <dev@dpdk.org>; Mon, 29 Oct 2018 12:38:06 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=netronome-com.20150623.gappssmtp.com; s=20150623;
 h=mime-version:references:in-reply-to:from:date:message-id:subject:to
 :cc; bh=152wW8FZXWOScTj75xnfIIdoBg0/+VMs69RbmdEqNX0=;
 b=wAA11604dkf5fNR7w+wUbhlyORCP1mobld1kR6zSFFmuGYeqK//WAyuM53ehKUrh7g
 2XDzA9inx1JjCaY4c6FV2NB733CG2mAicoExTm0y0Mev+IoHCiuF8MHEUb5M8EoNJUK7
 pdnxDcLZHCQ2HwuYkUizmsCwLKgrBZBcPFpLR+NdFseyGRzjdawgAEGUFNF0iyBzQft0
 GTES1bQgdIENMZKzhzyMc7L+I6N05O6cXTYbQSvejS2uW6Vi/4CRa5wHqaiKPKhFp4Ak
 5/LT4YJFDcdwXwo9zRoeO/dGpb37BIQtFHMpd46eHmV1lx/4yLhyUgwwL9dWNPYBVICn
 Tt/g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=152wW8FZXWOScTj75xnfIIdoBg0/+VMs69RbmdEqNX0=;
 b=bZ7/nnPIWB6XAXGqvNDlqakPZVOefDtQqP/so3FdPfZ7ZL7liCmmMRZ4lEUxUj+NDj
 kBkxgZo32ZlNjefWEEcDB+3p6SRUa/t382aXDMAT+intTk2nDKdXbyou/QKnkvJnrJp5
 R0pgaJM2fBurcgplH76fthR2Y+NfNZyB4jmlEcJVtqaCTyDnof/dluAUWy3Ioi2dH7WC
 k+6LBiL7n2enWSG4BIDiZUVpBQdo0nSw+0CUrrocZtYqDyyhxQY7yn1awv9sk4RyBMAX
 kkwArOPqYwtBQ1QS4cBmlShTR6/hwDHWyf3nZbq7UMHOunjdBWbA5EantjTCpE6Jbssf
 qVPQ==
X-Gm-Message-State: AGRZ1gJX0wszQmvE/Hd/HTTuaD7cZsWgUmOBqRQZJ7kVkj/ciAbr/SjZ
 UiR7QeLUwHWM7XrpN8xIcim5+U6HGVF307jbxN8aaA==
X-Google-Smtp-Source: AJdET5eJ0aY0N6eW2fR6LDsu/dZYu5QE/g93ojppex/yx9hSVpB2xYjova8DcJK8ssBcO6UIUW52piDjiJFggl/ybA8=
X-Received: by 2002:aa7:c0c4:: with SMTP id
 j4-v6mr11981843edp.173.1540841886589; 
 Mon, 29 Oct 2018 12:38:06 -0700 (PDT)
MIME-Version: 1.0
References: <1538743527-8285-1-git-send-email-alejandro.lucero@netronome.com>
 <2DBBFF226F7CF64BAFCA79B681719D954502B94F@shsmsx102.ccr.corp.intel.com>
 <CAD+H990gvgYU8UPhEMeY3gmDqW-LXM+FZaZSWVDGttu4V3J2DQ@mail.gmail.com>
 <3483377.PMXnpSGLS9@xps> <621BE501-6B10-4053-AC33-50ABE0231A44@mellanox.com>
In-Reply-To: <621BE501-6B10-4053-AC33-50ABE0231A44@mellanox.com>
From: Alejandro Lucero <alejandro.lucero@netronome.com>
Date: Mon, 29 Oct 2018 19:37:56 +0000
Message-ID: <CAD+H99105xwBk4+BRq7nfQ4ak83tXiohVkn0vN_Ft00GGZ7soA@mail.gmail.com>
To: Yongseok Koh <yskoh@mellanox.com>
Cc: lei.a.yao@intel.com, Thomas Monjalon <thomas@monjalon.net>,
 dev <dev@dpdk.org>, 
 "Xu, Qian Q" <qian.q.xu@intel.com>, xueqin.lin@intel.com, 
 "Burakov, Anatoly" <anatoly.burakov@intel.com>,
 Ferruh Yigit <ferruh.yigit@intel.com>
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 29 Oct 2018 19:38:07 -0000

On Mon, Oct 29, 2018 at 6:54 PM Yongseok Koh <yskoh@mellanox.com> wrote:

>
> > On Oct 29, 2018, at 7:18 AM, Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 29/10/2018 14:40, Alejandro Lucero:
> >> On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A <lei.a.yao@intel.com> wrote:
> >>> *From:* Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
> >>> On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon <thomas@monjalon.net>
> >>> wrote:
> >>>
> >>> 29/10/2018 12:39, Alejandro Lucero:
> >>>> I have a patch that fixes a bug when calling rte_eal_check_dma_mask
> >>>> using the mask instead of the maskbits. However, this does not solve
> >>>> the deadlock.
> >>>
> >>> The deadlock is a bigger concern I think.
> >>>
> >>> I think once the call to rte_eal_check_dma_mask uses the maskbits
> >>> instead of the mask, calling rte_memseg_walk_thread_unsafe avoids the
> >>> deadlock.
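
To make the mask versus maskbits distinction concrete, here is a minimal
standalone sketch. It is not the DPDK code: forbidden_bits and iova_fits
are hypothetical helper names used only for illustration, and the sketch
assumes the check takes the number of addressable bits and builds the
mask internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a DPDK API: build a mask with every bit at or
 * above position maskbits set. Only valid for 0 < maskbits < 64. */
uint64_t
forbidden_bits(uint8_t maskbits)
{
	assert(maskbits > 0 && maskbits < 64);
	return ~((UINT64_C(1) << maskbits) - 1);
}

/* Hypothetical helper: return 1 if the last byte of [iova, iova + len)
 * can be addressed with maskbits address bits, 0 otherwise. */
int
iova_fits(uint64_t iova, uint64_t len, uint8_t maskbits)
{
	uint64_t last = iova + len - 1;

	return (last & forbidden_bits(maskbits)) == 0;
}
```

With this shape, a caller passes the bit count (for example 40) and never
the mask value itself. Passing an already built mask where a bit count is
expected would be read as a meaningless bit count, which is the kind of
mix-up described above.
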
> >>>
> >>> Yao, can you try with the attached patch?
> >>>
> >>> Hi, Lucero
> >>>
> >>> This patch fixes the issue on my side. Thanks a lot
> >>> for your quick action.
> >>
> >> Great!
> >>
> >> I will send an official patch with the changes.
> >
> > Please, do not forget my other request to better comment functions.
>
> Alejandro,
>
> This patchset has been merged to stable/17.11 per your request for the
> last release.
> You must send a fix to stable/17.11 as well, if you think the same
> issue is present there.
>
>
The patchset for 17.11 was much simpler. There have been a lot of
changes to the memory code since 17.11, and this problem should not be
present in stable 17.11.

That said, if there are any reports of a problem with this patchset in
17.11, I will work on it as a priority.

Thanks.


> Thanks,
> Yongseok
>
> >> I have to say that I tested the patchset, but I think that was when
> >> legacy_mem was still in place, so the dynamic memory allocation code
> >> was not used during memory initialization.
> >>
> >> There is something that concerns me though. Using
> >> rte_memseg_walk_thread_unsafe could be a problem in some situations,
> >> although those situations are unlikely.
> >>
> >> Usually, rte_eal_check_dma_mask is called during initialization, and
> >> then it is safe to use the unsafe function for walking memsegs. But
> >> with device hotplug and dynamic memory allocation, there is a potential
> >> race condition when the primary process is allocating more memory while
> >> a device is hotplugged and a secondary process does the device
> >> initialization. For now, this is just a problem with the NFP, and
> >> hitting the potential race window is really unlikely, but I will work
> >> on this asap.
> >
> > Yes, this is what concerns me.
> > You can add a comment explaining the unsafe which is not handled.
> >
> >
> >>>> Interestingly, the problem looks like a compiler one. The call to
> >>>> rte_memseg_walk does not return when made inside
> >>>> rte_eal_check_dma_mask, but if you modify the call like this:
> >>>>
> >>>> -       if (rte_memseg_walk(check_iova, &mask))
> >>>> +       if (!rte_memseg_walk(check_iova, &mask))
> >>>>
> >>>> it works, although the value returned to the caller changes, of
> >>>> course. But the point is that calling rte_memseg_walk should behave
> >>>> the same as before, and it does not.
> >>>
> >>> Anyway, the coding style requires saving the return value in a
> >>> variable, instead of nesting the call in an "if" condition.
> >>> And the "if" check should be explicitly != 0 because it is not a real
> >>> boolean.
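
The style point above can be sketched as follows. This is only an
illustration under assumptions: walk_values, above_limit and check_values
are hypothetical names standing in for a walk-style API with a callback,
not the actual EAL functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a walk-style API: calls cb on each element and
 * stops at the first non-zero return, which it passes back to the caller. */
typedef int (*walk_cb_t)(int value, void *arg);

int
walk_values(const int *values, size_t n, walk_cb_t cb, void *arg)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = cb(values[i], arg);
		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Callback: report 1 when a value exceeds the limit passed through arg. */
int
above_limit(int value, void *arg)
{
	const int *limit = arg;

	return value > *limit ? 1 : 0;
}

/* Return -1 if any value exceeds limit, 0 otherwise. */
int
check_values(const int *values, size_t n, int limit)
{
	int ret;

	/* Store the result first, then compare it explicitly against 0. */
	ret = walk_values(values, n, above_limit, &limit);
	if (ret != 0)
		return -1;
	return 0;
}
```

Keeping the result in ret also makes it easy to log or inspect the value
returned by the walk before deciding what to report to the caller.
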
> >>>
> >>> PS: please do not top post and avoid HTML emails, thanks
> >>>
> >>>
> >>
> >
> >
> >
> >
> >
>
>