From: Sushil Adhikari <sushil446@gmail.com>
To: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Cc: "Wiles, Keith" <keith.wiles@intel.com>,
"users@dpdk.org" <users@dpdk.org>
Subject: Re: [dpdk-users] Larger number of hugepages causes bus error.
Date: Thu, 23 Feb 2017 11:02:59 -0600
Message-ID: <CAPO9LfR8AgbJDDXiiY-KhJ6aFO9xS2RY3mRyfvJ1M7mgWhEnig@mail.gmail.com>
In-Reply-To: <dc2add3e-96f6-a14b-1240-c950f6f28135@intel.com>
Thank you Keith and Monroy. With your help I was able to track down the
problem: my /var/run was too small to hold the hugepage information, so when
I increased its size, it worked. Thank you so much.
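
For anyone hitting the same bus error, here is a rough sketch of the check that would have caught it. It is not taken from DPDK: the helper names are hypothetical, and the 4144-byte entry size is simply the sizeof(struct hugepage_file) value reported earlier in this thread, so it may differ on other versions or builds. It compares the size of the hugepage information (nr_hugefiles * entry size) with the free space on the filesystem that holds it, assumed here to be /var/run.

```python
import os

# Assumption: sizeof(struct hugepage_file) as measured earlier in this thread.
HUGEPAGE_FILE_ENTRY_BYTES = 4144


def hugepage_info_bytes(nr_hugefiles, entry_bytes=HUGEPAGE_FILE_ENTRY_BYTES):
    """Bytes needed for the hugepage information array that gets memset to 0."""
    return nr_hugefiles * entry_bytes


def free_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def fits(nr_hugefiles, path="/var/run"):
    """True if the hugepage information would fit in the free space at path."""
    return hugepage_info_bytes(nr_hugefiles) <= free_bytes(path)
```

With these numbers, 256 hugepages need 1060864 bytes and 512 hugepages need 2121728 bytes, so a /var/run with room for the first but not the second would match what I saw. Calling fits(512) on the affected machine before starting the application shows whether /var/run needs to be enlarged.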
On Thu, Feb 23, 2017 at 10:35 AM, Sergio Gonzalez Monroy <
sergio.gonzalez.monroy@intel.com> wrote:
> As Keith suggested, gdb is probably your best bet now.
> You could also do 'strace' to see if something shows up there.
>
> If you are running as root, the application is opening a file in /var/run
> to store some hugepage information, then it memsets to 0.
>
> What distro and kernel are you running on?
>
>
>
> On 23/02/2017 16:19, Sushil Adhikari wrote:
>
>> I didn't understand what you mean by hugepage value, if you mean number of
>> hugepages here's what it looks like
>> [~]$ grep -ri hugepages /proc/meminfo
>> AnonHugePages: 0 kB
>> HugePages_Total: 512
>> HugePages_Free: 512
>> HugePages_Rsvd: 0
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>>
>> And the linux version is 4.4.20.
>>
>> On Thu, Feb 23, 2017 at 9:17 AM, Wiles, Keith <keith.wiles@intel.com>
>> wrote:
>>
>> On Feb 22, 2017, at 7:18 PM, Sushil Adhikari <sushil446@gmail.com>
>>>>
>>> wrote:
>>>
>>>> Thank you Keith for the response,
>>>>
>>>> Yes, it should be line 1142, not 1405. I was using 16.11; now I'm using
>>>> 17.02 and am still getting the same error.
>>>
>>> Not sure what to say here, it looks like some type of system configuration
>>> issue as I do not see it on my machine.
>>>
>>> Can you tell if the hugepage has a value and is it sane? The next thing is
>>> to see where in that memory it is failing: start, end, or someplace in the
>>> middle. Use GDB and compile the code with 'make install
>>> T=x86_64-native-linuxapp-gcc EXTRA_CFLAGS="-g -O0"', then set a breakpoint
>>> with 'b eal_memory.c:1142' and inspect the memory pointer hugepage. I do not
>>> think it is an overrun error, meaning the size for memset is different than
>>> what was allocated and it is just stepping off the end.
>>>
>>> Also you did not tell me the linux version you are using?
>>>
>>> On Wed, Feb 22, 2017 at 8:46 PM, Wiles, Keith <keith.wiles@intel.com>
>>>>
>>> wrote:
>>>
>>>> On Feb 22, 2017, at 6:43 PM, Wiles, Keith <keith.wiles@intel.com>
>>>>>
>>>> wrote:
>>>
>>>> On Feb 22, 2017, at 6:30 PM, Sushil Adhikari <sushil446@gmail.com>
>>>>>>
>>>>> wrote:
>>>
>>>> I used the basic command line option "dpdkTimer -c 0xf -n 4"
>>>>>> And to update on my findings so far I have narrowed down to this
>>>>>>
>>>>> line(1405)
>>>
>>>> memset(hugepage, 0, nr_hugefiles * sizeof(struct hugepage_file));
>>>>>> of function rte_eal_hugepage_init() in file
>>>>>>
>>>>> dpdk\lib\librte_eal\linuxapp\eal\eal_memory.c
>>>
>>>> What version of DPDK are you using? I was looking at the file at 1405
>>>>>
>>>> and I do not see a memset() call.
>>>
>>>> I found the memset call at 1142 in my 17.05-rc0 code. Please try the
>>>>
>>> latest version and see if you get the same problem.
>>>
>>>>>> Yes, I have hugepages of size 2MB (2048 kB), and when I calculate the
>>>>>> memory this memset function is trying to set, it comes out to
>>>>>> 512 (nr_hugefiles) * 4144 (sizeof(struct hugepage_file)) = 2121728 bytes,
>>>>>> which is larger than 2MB. So my doubt is that the hugepages I have
>>>>>> allocated (512 * 2MB) are not a contiguous 1GB of memory and it is trying
>>>>>> to access memory that's not part of a hugepage. Is that a possibility, even
>>>>>> though I am setting up hugepages at boot time through a kernel option?
>>>
>>>>
>>>>>> On Wed, Feb 22, 2017 at 8:05 PM, Wiles, Keith <keith.wiles@intel.com>
>>>>>>
>>>>> wrote:
>>>
>>>> On Feb 22, 2017, at 3:05 PM, Sushil Adhikari <sushil446@gmail.com>
>>>>>>>
>>>>>> wrote:
>>>
>>>> Hi,
>>>>>>>
>>>>>>> I was trying to run dpdk timer app by setting 512 2MB hugepages but
>>>>>>>
>>>>>> the
>>>
>>>> application crashed with following error
>>>>>>> EAL: Detected 4 lcore(s)
>>>>>>> EAL: Probing VFIO support...
>>>>>>> Bus error (core dumped)
>>>>>>>
>>>>>>> If I reduce the number of hugepages to 256 it works fine. I
>>>>>>>
>>>>>> wondering what
>>>
>>>> could be the problem here. Here's my cpu info
>>>>>>>
>>>>>> I normally run with 2048 x 2 or 2048 per socket on my machine. What
>>>>>>
>>>>> is the command line you are using to start the application?
>>>
>>>> processor : 0
>>>>>>> vendor_id : GenuineIntel
>>>>>>> cpu family : 6
>>>>>>> model : 26
>>>>>>> model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
>>>>>>> stepping : 5
>>>>>>> microcode : 0x11
>>>>>>> cpu MHz : 2794.000
>>>>>>> cache size : 8192 KB
>>>>>>> physical id : 0
>>>>>>> siblings : 4
>>>>>>> core id : 0
>>>>>>> cpu cores : 4
>>>>>>> apicid : 0
>>>>>>> initial apicid : 0
>>>>>>> fpu : yes
>>>>>>> fpu_exception : yes
>>>>>>> cpuid level : 11
>>>>>>> wp : yes
>>>>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>>>>>>>
>>>>>> pge mca
>>>
>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>>>>>>>
>>>>>> syscall nx
>>>
>>>> rdtscp lm constant_tsc arch_
>>>>>>> perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni
>>>>>>>
>>>>>> dtes64
>>>
>>>> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt
>>>>>>> lahf_lm ida dtherm tpr_shadow vnm
>>>>>>> i flexpriority ept vpid
>>>>>>> bugs :
>>>>>>> bogomips : 5600.00
>>>>>>> clflush size : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes : 36 bits physical, 48 bits virtual
>>>>>>> power management:
>>>>>>>
>>>>>>> processor : 1
>>>>>>> vendor_id : GenuineIntel
>>>>>>> cpu family : 6
>>>>>>> model : 26
>>>>>>> model name : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
>>>>>>> stepping : 5
>>>>>>> microcode : 0x11
>>>>>>> cpu MHz : 2794.000
>>>>>>> cache size : 8192 KB
>>>>>>> physical id : 0
>>>>>>> siblings : 4
>>>>>>> core id : 1
>>>>>>> cpu cores : 4
>>>>>>> apicid : 2
>>>>>>> initial apicid : 2
>>>>>>> fpu : yes
>>>>>>> fpu_exception : yes
>>>>>>> cpuid level : 11
>>>>>>> wp : yes
>>>>>>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>>>>>>>
>>>>>> pge mca
>>>
>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>>>>>>>
>>>>>> syscall nx
>>>
>>>> rdtscp lm constant_tsc arch_
>>>>>>> perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni
>>>>>>>
>>>>>> dtes64
>>>
>>>> monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt
>>>>>>> lahf_lm ida dtherm tpr_shadow vnm
>>>>>>> i flexpriority ept vpid
>>>>>>> bugs :
>>>>>>> bogomips : 5600.00
>>>>>>> clflush size : 64
>>>>>>> cache_alignment : 64
>>>>>>> address sizes : 36 bits physical, 48 bits virtual
>>>>>>> power management:......
>>>>>>>
>>>>>>> And Here's my meminfo
>>>>>>>
>>>>>>> MemTotal: 24679608 kB
>>>>>>> MemFree: 24014156 kB
>>>>>>> MemAvailable: 23950600 kB
>>>>>>> Buffers: 3540 kB
>>>>>>> Cached: 31436 kB
>>>>>>> SwapCached: 0 kB
>>>>>>> Active: 21980 kB
>>>>>>> Inactive: 22256 kB
>>>>>>> Active(anon): 10760 kB
>>>>>>> Inactive(anon): 2940 kB
>>>>>>> Active(file): 11220 kB
>>>>>>> Inactive(file): 19316 kB
>>>>>>> Unevictable: 0 kB
>>>>>>> Mlocked: 0 kB
>>>>>>> SwapTotal: 0 kB
>>>>>>> SwapFree: 0 kB
>>>>>>> Dirty: 32 kB
>>>>>>> Writeback: 0 kB
>>>>>>> AnonPages: 9252 kB
>>>>>>> Mapped: 11912 kB
>>>>>>> Shmem: 4448 kB
>>>>>>> Slab: 27712 kB
>>>>>>> SReclaimable: 11276 kB
>>>>>>> SUnreclaim: 16436 kB
>>>>>>> KernelStack: 2672 kB
>>>>>>> PageTables: 1000 kB
>>>>>>> NFS_Unstable: 0 kB
>>>>>>> Bounce: 0 kB
>>>>>>> WritebackTmp: 0 kB
>>>>>>> CommitLimit: 12077660 kB
>>>>>>> Committed_AS: 137792 kB
>>>>>>> VmallocTotal: 34359738367 kB
>>>>>>> VmallocUsed: 0 kB
>>>>>>> VmallocChunk: 0 kB
>>>>>>> HardwareCorrupted: 0 kB
>>>>>>> AnonHugePages: 2048 kB
>>>>>>> CmaTotal: 0 kB
>>>>>>> CmaFree: 0 kB
>>>>>>> HugePages_Total: 256
>>>>>>> HugePages_Free: 0
>>>>>>> HugePages_Rsvd: 0
>>>>>>> HugePages_Surp: 0
>>>>>>> Hugepagesize: 2048 kB
>>>>>>> DirectMap4k: 22000 kB
>>>>>>> DirectMap2M: 25133056 kB
>>>>>>>
>>>>>> Regards,
>>>>>> Keith
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>> Keith
>>>>>
>>>> Regards,
>>>> Keith
>>>>
>>>>
>>>> Regards,
>>> Keith
>>>
>>>
>>>
>