From: "Wiles, Keith" <keith.wiles@intel.com>
To: Kai Zhang <kay21s@gmail.com>
Cc: "users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Issue with more Cores assigned: Cannot mmap device resource file
Date: Sun, 12 Mar 2017 19:24:56 +0000
Message-ID: <305F974F-C9A3-4D54-85C2-6F503A98D890@intel.com>
In-Reply-To: <CACKSPmfuzU0LeQ6zTJZxcnO14_K3eYyhPfNq_v86thnZSRSo6A@mail.gmail.com>
> On Mar 12, 2017, at 6:39 PM, Kai Zhang <kay21s@gmail.com> wrote:
>
>
> Your application may be attaching to the same port for each core. Normally this means each core could be allocating memory, and the 4th core just goes over the amount of memory you have reserved.
>
> I don't think so, because the error is in rte_eal_init(), which is executed in the first line of the main() function. At that time, the other threads are not even launched.
>
> Is it possible to consider this as a bug in DPDK?
One more thing: I run Pktgen as two processes all of the time. The big difference is that I do not run them in primary and secondary modes; I run two independent instances of pktgen at the same time without seeing this type of problem. If the failure is associated with the primary/secondary application model, then it could be a bug in that code, as a lot of syncing up between the two processes needs to be done because of memory/device sharing. One problem with P/S applications is that memory needs to be mapped at the same address in both processes, and Linux has random memory mapping built in for security reasons. I forget the name of the mode in Linux to turn off the random page mapping, and Google is not working for me ATM.
Does your application require running as a primary/secondary application?
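
The random mapping mentioned above is most likely address space layout randomization (ASLR), which Linux exposes through /proc/sys/kernel/randomize_va_space. Below is a minimal sketch, in Python, for checking the current setting before starting the primary and secondary processes. The helper names (describe_aslr, read_aslr) are hypothetical and are not part of DPDK; the sketch only reports the setting and does not show that ASLR causes the mmap failure in this thread.

```python
from pathlib import Path

# Standard Linux procfs entry that controls ASLR (values 0, 1 or 2).
ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "disabled",
    1: "conservative (stack, mmap base and VDSO randomized)",
    2: "full (also randomizes the heap/brk)",
}


def describe_aslr(raw: str) -> str:
    """Hypothetical helper: turn the raw sysctl value into a description."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


def read_aslr(path: Path = ASLR_PATH) -> str:
    """Describe the current ASLR mode, or say why it could not be read."""
    try:
        return describe_aslr(path.read_text())
    except OSError as exc:
        return f"unavailable ({exc.strerror})"


if __name__ == "__main__":
    print("ASLR:", read_aslr())
```

If the mode is 1 or 2, a secondary process may find that the address the primary used is already occupied in its own address space. Writing 0 to that file as root (for example with `sysctl -w kernel.randomize_va_space=0`) turns the randomization off; since that weakens a security feature, it is best treated as a diagnostic step rather than a permanent setting.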
>
> Regards,
> Kai
>
>
> >
> > EAL: Cannot mmap device resource file /sys/bus/pci/devices/0000:02:00.0/resource0 to address: 0x7fff65bfc000
> > EAL: Error - exiting with code: 1
> > Cause: Requested device 0000:02:00.0 cannot be used
> >
> > Regards,
> > Kai
> >
> > On Sun, Mar 12, 2017 at 11:21 AM, Kai Zhang <kay21s@gmail.com> wrote:
> >
> > Command line:
> > primary: sudo ./primary -l 0,1,2,3 -n 4 --proc-type=primary
> > secondary: sudo ./secondary -l 4,5,6,7,8 -n 4 --proc-type=secondary
> >
> > The configurations are as follows:
> > A) 1 x Intel E5-2650 v4, 12 cores [UMA], XL710 40GbE, bind 02:00.0, 2048 x 4k huge page
> > 02:00.0 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02) [<<- Only bind this one]
> > 02:00.1 Ethernet controller: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 02)
> > 05:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
> > 06:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
> > Socket 0
> > --------
> > Core 0 [0, 12]
> > Core 1 [1, 13]
> > Core 2 [2, 14]
> > Core 3 [3, 15]
> > Core 4 [4, 16]
> > Core 5 [5, 17]
> > Core 8 [6, 18]
> > Core 9 [7, 19]
> > Core 10 [8, 20]
> > Core 11 [9, 21]
> > Core 12 [10, 22]
> > Core 13 [11, 23]
> >
> > B) 2 x Intel E5-2640 v4, 10 cores [NUMA], No Port Bind, 2048 x 4k huge page
> > 05:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
> > 06:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
> > Socket 0 Socket 1
> > -------- --------
> > Core 0 [0, 20] [10, 30]
> > Core 1 [1, 21] [11, 31]
> > Core 2 [2, 22] [12, 32]
> > Core 3 [3, 23] [13, 33]
> > Core 4 [4, 24] [14, 34]
> > Core 8 [5, 25] [15, 35]
> > Core 9 [6, 26] [16, 36]
> > Core 10 [7, 27] [17, 37]
> > Core 11 [8, 28] [18, 38]
> > Core 12 [9, 29] [19, 39]
> >
> > Ah, as machine B does not have a 40GbE NIC, I did not bind any NIC and ran my program with locally generated packets. But I am using other DPDK features, such as memory sharing and message passing. Maybe that is the reason it works correctly? I can only access machine B remotely, so I am unable to install a NIC on it. I have another PC that is used as a client, but it only has four cores, so it cannot be used for verification either...
> >
> > Regards,
> > Kai
> >
> >
> > On Sun, Mar 12, 2017 at 2:59 AM, Wiles, Keith <keith.wiles@intel.com> wrote:
> >
> > > On Mar 11, 2017, at 9:45 AM, Kai Zhang <kay21s@gmail.com> wrote:
> > >
> > > Hi Keith,
> > >
> > > Thank you for your reply.
> > >
> > > I have tested my program on two machines
> > > A) 1 x Intel E5-2650 v4, 12 cores [UMA]
> > > B) 2 x Intel E5-2640 v4, 10 cores [NUMA]
> > >
> > > I am very sure that the primary process uses different cores from the secondary process. The strange thing is that my program works correctly on machine B. But on machine A, the above issue happens with more than 4 cores assigned to the secondary process.
> > >
> > > I have tried to assign cores 1-5 to the secondary process and also tried other core assignment policies, but the error still happens in rte_eal_init() with more than 4 cores.
> >
> > It would be nice to see both command lines. I am not sure I can help more; all I can do is suggest some ideas to look at.
> >
> > Does machine B have the same number and type of NICs? Use ‘lspci | grep Ethernet’ to get a list of all Ethernet devices on both machines.
> >
> > What is the number of hugepages you have allocated on each machine?
> >
> > Also look at the cpu_layout.py script to see why adding the 5th core would be different on the two machines and try to make them the same.
> >
> > >
> > > Regards,
> > > Kai
> > >
> > > On Sat, Mar 11, 2017 at 10:52 PM, Wiles, Keith <keith.wiles@intel.com> wrote:
> > >
> > > > On Mar 10, 2017, at 9:35 PM, Kai Zhang <kay21s@gmail.com> wrote:
> > > >
> > > > Hi, there
> > > >
> > > > I am using DPDK-16.11 on XL710 40GbE NIC. OS: CentOS 7.3.1611 with Linux
> > > > kernel version 3.8.0-30.
> > > >
> > > > I have a master process and a secondary process. When I run the secondary
> > > > process with 4 or fewer cores, it works correctly, for example:
> > > > sudo ./program -l 4,5,6,7 -n 4 --proc-type=secondary
> > > > sudo ./program -c 0x0f -n 4 --proc-type=secondary
> > > >
> > > > However, there is an error in rte_eal_init() if I assign more than 4
> > > > cores.
> > > > sudo ./program -l 0,1,2,3,4 -n 4 --proc-type=secondary
> > > > sudo ./program -c 0x1f -n 4 --proc-type=secondary
> > > >
> > > > EAL: Cannot mmap device resource file
> > > > /sys/bus/pci/devices/0000:02:00.0/resource0 to address: 0x7fff65bfc000
> > > > EAL: Error - exiting with code: 1
> > > > Cause: Requested device 0000:02:00.0 cannot be used
> > >
> > > I assume you have at least 8 cores. Have you tried -l 1-5 on the secondary process?
> > >
> > > You did not show the primary process command line, but if you use 1-5 then you can only give the primary process -l 6-7, or two cores. It is always reasonable to leave core zero for Linux to use.
> > >
> > > Also it could be you ran out of memory or hugepages you allocated to the system.
> > >
> > > >
> > > > Does anyone know why this happens?
> > > >
> > > > Thanks a lot,
> > > > Kai Zhang
> > >
> > > Regards,
> > > Keith
> > >
> > >
> >
> > Regards,
> > Keith
> >
> >
> >
>
> Regards,
> Keith
Regards,
Keith