DPDK patches and discussions
From: "Lin, Xueqin" <xueqin.lin@intel.com>
To: Alejandro Lucero <alejandro.lucero@netronome.com>
Cc: "Yao, Lei A" <lei.a.yao@intel.com>,
	Thomas Monjalon <thomas@monjalon.net>,  dev <dev@dpdk.org>,
	"Xu, Qian Q" <qian.q.xu@intel.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"Zhang, Qi Z" <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask
Date: Tue, 30 Oct 2018 15:09:22 +0000
Message-ID: <0D300480287911409D9FF92C1FA2A3355B443176@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <CAD+H993WCW10==N_DiiNepXBtJ=QnKzrM4CgUOqySaJucW6s4w@mail.gmail.com>

Hi Lucero,

From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Tuesday, October 30, 2018 10:57 PM
To: Lin, Xueqin <xueqin.lin@intel.com>
Cc: Yao, Lei A <lei.a.yao@intel.com>; Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Tue, Oct 30, 2018 at 2:45 PM Lin, Xueqin <xueqin.lin@intel.com> wrote:
Hi Lucero,

The patch fixes both the testpmd and the multi-process setup failures in my environment.
I hope you can submit the fix to the community patches page.
Thanks a lot.


Great.

I need to format the patchset properly and clean things up but I hope I can send a patchset this week.

Thanks for testing!

By the way, is this testing something you are doing by yourself, or is it part of Intel DPDK work?


We are from the Intel DPDK validation team ☺
This is the 18.11 rc1 cycle, and the issue blocks most of our test cases, including NIC, NIC VF, vhost/virtio, samples…
It is very urgent for us, since we have very limited time to complete DPDK QA.
I hope you can send the fix patch officially soon, so that it can be merged to the master branch after review.
Thanks.
Best regards,
Xueqin

From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Tuesday, October 30, 2018 10:05 PM
To: Lin, Xueqin <xueqin.lin@intel.com>
Cc: Yao, Lei A <lei.a.yao@intel.com>; Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Tue, Oct 30, 2018 at 12:37 PM Alejandro Lucero <alejandro.lucero@netronome.com> wrote:

On Tue, Oct 30, 2018 at 12:22 PM Lin, Xueqin <xueqin.lin@intel.com> wrote:
Some findings from some of our servers:
Without "intel_iommu=on iommu=pt" in /boot/grub2/grub.cfg (after a reboot to make the change effective):
18.11 rc1: testpmd and the secondary process set up successfully.

With "intel_iommu=on iommu=pt" in /boot/grub2/grub.cfg (after a reboot to make the change effective):
18.11 rc1: testpmd and the secondary process fail to set up.
18.11 rc1 + dma_mask_fix patch: testpmd sets up successfully, but the secondary process fails to set up.

So whether "intel_iommu=on iommu=pt" is enabled may explain the gap between our test results.
Most of our team's servers need the IOMMU enabled for VT-d and vfio tests.


That makes sense: the problem occurs when the IOVA mode is set inside drivers/bus/pci/linux/pci.c, and if there is no IOMMU, rte_eal_check_dma_mask is not called at all.


Best regards,
Xueqin

From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Tuesday, October 30, 2018 6:38 PM
To: Lin, Xueqin <xueqin.lin@intel.com>
Cc: Yao, Lei A <lei.a.yao@intel.com>; Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Tue, Oct 30, 2018 at 10:34 AM Lin, Xueqin <xueqin.lin@intel.com> wrote:
Hi Lucero,

No, we have reproduced the multi-process issues (including symmetric_mp, simple_mp, hotplug_mp, the multi-process unit test…) on most of our servers.
Strangely, one or two servers don't have the issue.


Yes, you are right. I could execute it, but that was due to how this problem is triggered.
I think I can fix this and, at the same time, solve the initial issue properly, without any limitation such as the potential race condition I mentioned.
I can give you a patch to try in a couple of hours.


Hi Lin,

Can you try the patch attached?

Thanks

Thanks

Bind two NNT ports or FVL ports

./build/symmetric_mp -c 4 --proc-type=auto -- -p 3 --num-procs=4 --proc-id=1

EAL: Detected 88 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
[New Thread 0x7ffff6eda700 (LWP 90103)]
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_90099_2f1b553882b62
[New Thread 0x7ffff66d9700 (LWP 90104)]

Thread 1 "symmetric_mp" received signal SIGSEGV, Segmentation fault.
0x00000000005566b5 in rte_fbarray_find_next_used ()
(gdb) bt
#0  0x00000000005566b5 in rte_fbarray_find_next_used ()
#1  0x000000000054da9c in rte_eal_check_dma_mask ()
#2  0x0000000000572ae7 in pci_one_device_iommu_support_va ()
#3  0x0000000000573988 in rte_pci_get_iommu_class ()
#4  0x000000000054f743 in rte_bus_get_iommu_class ()
#5  0x000000000053c123 in rte_eal_init ()
#6  0x000000000046be2b in main ()

Best regards,
Xueqin

From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Tuesday, October 30, 2018 5:41 PM
To: Lin, Xueqin <xueqin.lin@intel.com>
Cc: Yao, Lei A <lei.a.yao@intel.com>; Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Tue, Oct 30, 2018 at 3:20 AM Lin, Xueqin <xueqin.lin@intel.com> wrote:
Hi Lucero&Thomas,

We found that the patch does not fix the multi-process cases.

Hi,

I think it is not specifically about multiprocess but about hotplug with multiprocess because I can execute the symmetric_mp successfully with a secondary process.

Working on this as a priority.

Thanks.

Steps:

1. The primary process sets up successfully

./hotplug_mp --proc-type=auto

2. The secondary process fails to set up

./hotplug_mp --proc-type=auto

EAL: Detected 88 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Auto-detected process type: SECONDARY
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket_147212_2bfe08ee88d23
Segmentation fault (core dumped)


More information below:

Thread 1 "hotplug_mp" received signal SIGSEGV, Segmentation fault.
0x0000000000597cfb in find_next (arr=0x7ffff7ff20a4, start=0, used=true)
    at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:264
264             for (idx = first; idx < msk->n_masks; idx++) {
#0  0x0000000000597cfb in find_next (arr=0x7ffff7ff20a4, start=0, used=true)
    at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:264
#1  0x0000000000598573 in fbarray_find (arr=0x7ffff7ff20a4, start=0, next=true,
    used=true) at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:1001
#2  0x000000000059929b in rte_fbarray_find_next_used (arr=0x7ffff7ff20a4, start=0)
    at /root/dpdk/lib/librte_eal/common/eal_common_fbarray.c:1018
#3  0x000000000058c877 in rte_memseg_walk_thread_unsafe (func=0x58c401 <check_iova>,
    arg=0x7fffffffcc38) at /root/dpdk/lib/librte_eal/common/eal_common_memory.c:589
#4  0x000000000058ce08 in rte_eal_check_dma_mask (maskbits=48 '0')
    at /root/dpdk/lib/librte_eal/common/eal_common_memory.c:465
#5  0x00000000005b96c4 in pci_one_device_iommu_support_va (dev=0x11b3d90)
    at /root/dpdk/drivers/bus/pci/linux/pci.c:593
#6  0x00000000005b9738 in pci_devices_iommu_support_va ()
    at /root/dpdk/drivers/bus/pci/linux/pci.c:626
#7  0x00000000005b97a7 in rte_pci_get_iommu_class ()
    at /root/dpdk/drivers/bus/pci/linux/pci.c:650
#8  0x000000000058f1ce in rte_bus_get_iommu_class ()
    at /root/dpdk/lib/librte_eal/common/eal_common_bus.c:237
#9  0x0000000000577c7a in rte_eal_init (argc=2, argv=0x7fffffffdf98)
    at /root/dpdk/lib/librte_eal/linuxapp/eal/eal.c:919
#10 0x000000000045dd56 in main (argc=2, argv=0x7fffffffdf98)
    at /root/dpdk/examples/multi_process/hotplug_mp/main.c:28


Best regards,
Xueqin

From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Monday, October 29, 2018 9:41 PM
To: Yao, Lei A <lei.a.yao@intel.com>
Cc: Thomas Monjalon <thomas@monjalon.net>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Mon, Oct 29, 2018 at 1:18 PM Yao, Lei A <lei.a.yao@intel.com> wrote:


From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
Sent: Monday, October 29, 2018 8:56 PM
To: Thomas Monjalon <thomas@monjalon.net>
Cc: Yao, Lei A <lei.a.yao@intel.com>; dev <dev@dpdk.org>; Xu, Qian Q <qian.q.xu@intel.com>; Lin, Xueqin <xueqin.lin@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit, Ferruh <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask


On Mon, Oct 29, 2018 at 11:46 AM Thomas Monjalon <thomas@monjalon.net> wrote:
29/10/2018 12:39, Alejandro Lucero:
> I got a patch that solves a bug when calling rte_eal_check_dma_mask using the
> mask instead of the maskbits. However, this does not solve the deadlock.

The deadlock is a bigger concern I think.

I think once the call to rte_eal_check_dma_mask uses the maskbits instead of the mask, calling rte_memseg_walk_thread_unsafe avoids the deadlock.
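
To illustrate the mask versus maskbits distinction, here is a minimal standalone C sketch. It is not the DPDK implementation: dma_mask_from_bits and iova_fits are hypothetical helper names, and the only detail taken from this thread is that rte_eal_check_dma_mask receives a bit count (gdb prints maskbits=48 '0' elsewhere in the thread, which suggests an 8-bit parameter).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK function): expand a bit count into a
 * mask with the low `maskbits` bits set, e.g. 48 -> 0x0000ffffffffffff.
 */
static uint64_t dma_mask_from_bits(uint8_t maskbits)
{
	assert(maskbits >= 1 && maskbits <= 64);
	/* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
	return maskbits == 64 ? UINT64_MAX : (UINT64_C(1) << maskbits) - 1;
}

/*
 * Hypothetical helper: true if the IOVA has no bits set above the
 * range the device can address.
 */
static bool iova_fits(uint64_t iova, uint8_t maskbits)
{
	return (iova & ~dma_mask_from_bits(maskbits)) == 0;
}
```

With these helpers, iova_fits(0x0000ffffffffffff, 48) is true and iova_fits(0x0001000000000000, 48) is false. Passing an already expanded mask where a bit count is expected cannot work: converted to an 8-bit value, the 48-bit mask above becomes 255.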

Yao, can you try with the attached patch?

Hi Lucero,

This patch fixes the issue on my side. Thanks a lot
for your quick action.


Great!

I will send an official patch with the changes.

I have to say that I did test the patchset, but I think it was at a point where legacy_mem was still in place, and therefore the dynamic memory allocation code was not used during memory initialization.

There is something that concerns me though. Using rte_memseg_walk_thread_unsafe could be a problem in some situations, although those situations are unlikely.

Usually, rte_eal_check_dma_mask is called during initialization, and then it is safe to use the unsafe function for walking memsegs. With device hotplug and dynamic memory allocation, however, there is a potential race condition when the primary process is allocating more memory while a device is hotplugged and a secondary process does the device initialization. For now, this is only a problem with the NFP, and the race window is very unlikely to be hit, but I will work on this asap.
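
The trade-off between the locked and the unsafe walk can be shown with a toy model in standalone C. Everything here is an assumption for illustration: mem_lock is a simple non-recursive try-lock standing in for whatever lock protects the memseg lists, and walk_locked / walk_unsafe are hypothetical names, not the DPDK functions. The sketch assumes the deadlock comes from the walk trying to take a lock that the calling path already holds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the lock protecting the segment list (assumption). */
static atomic_flag mem_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool mem_trylock(void)
{
	return !atomic_flag_test_and_set(&mem_lock);
}

static void mem_unlock(void)
{
	atomic_flag_clear(&mem_lock);
}

typedef int (*seg_cb)(size_t idx, void *arg);

/* No locking: the caller must guarantee the list cannot change. */
static int walk_unsafe(size_t n_segs, seg_cb cb, void *arg)
{
	for (size_t i = 0; i < n_segs; i++) {
		int ret = cb(i, arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/*
 * Locked walk. A real implementation would block here; the toy version
 * returns -1 instead, so the self-deadlock becomes visible.
 */
static int walk_locked(size_t n_segs, seg_cb cb, void *arg)
{
	int ret;

	if (!mem_trylock())
		return -1;
	ret = walk_unsafe(n_segs, cb, arg);
	mem_unlock();
	return ret;
}

/* Example callback: count the visited segments. */
static int count_seg(size_t idx, void *arg)
{
	(void)idx;
	(*(size_t *)arg)++;
	return 0;
}
```

In this model, a caller that already holds mem_lock gets -1 from walk_locked (the point where a blocking lock would hang), while walk_unsafe completes. The price is that walk_unsafe offers no protection if another thread or process changes the list during the walk, which is the race window described above.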

BRs
Lei

> Interestingly, the problem looks like a compiler one. Calling
> rte_memseg_walk does not return when called inside rte_eal_check_dma_mask,
> but if you modify the call like this:
>
> -       if (rte_memseg_walk(check_iova, &mask))
> +       if (!rte_memseg_walk(check_iova, &mask))
>
> it works, although the value returned to the invoker changes, of course.
> But the point here is that the behaviour when calling rte_memseg_walk
> should be the same as before, and it is not.

Anyway, the coding style requires saving the return value in a variable,
instead of nesting the call in an "if" condition.
And the "if" check should be explicitly != 0 because it is not a real boolean.
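
As a sketch of that style, in standalone C: the return value is stored in a variable first and then compared explicitly against 0. walk_values, above_limit and check_values are hypothetical names made up for this example; they only mimic the shape of a walk function with a callback and are not DPDK code.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*walk_cb)(int value, void *arg);

/*
 * Hypothetical walker: calls cb for each value and stops at the first
 * non-zero return, which it passes back to the caller.
 */
static int walk_values(const int *vals, size_t n, walk_cb cb, void *arg)
{
	for (size_t i = 0; i < n; i++) {
		int ret = cb(vals[i], arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Callback: returns 1 when the value exceeds the limit passed in arg. */
static int above_limit(int value, void *arg)
{
	return value > *(const int *)arg ? 1 : 0;
}

/* Returns -1 if any value exceeds the limit, 0 otherwise. */
static int check_values(const int *vals, size_t n, int limit)
{
	int ret;

	/* Save the result first, then test it explicitly against 0. */
	ret = walk_values(vals, n, above_limit, &limit);
	if (ret != 0)
		return -1;

	return 0;
}
```

Written this way, the meaning of a non-zero return is visible at the call site, and a later change to the check cannot silently invert it the way adding or dropping a "!" can.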

PS: please do not top post and avoid HTML emails, thanks
