From: Ilya Matveychikov <matvejchikov@gmail.com>
To: Olivier Matz <olivier.matz@6wind.com>
Cc: dev@dpdk.org,
"adrien.mazarguil@6wind.com" <adrien.mazarguil@6wind.com>,
Jan Blunck <jblunck@infradead.org>,
Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Subject: Re: [dpdk-dev] A (possible) problem with `--no-huge` option
Date: Fri, 9 Jun 2017 16:08:24 +0400
Message-ID: <80B1FDDC-4B0B-4B12-843F-B3AB1587CD08@gmail.com>
In-Reply-To: <20170609102727.0eb7f39d@platinum>

Hi Olivier,

Your patch solves the problem for me.
Thank you.

> On Jun 9, 2017, at 12:27 PM, Olivier Matz <olivier.matz@6wind.com> wrote:
>
> Hi Ilya,
>
> On Sun, 14 May 2017 14:34:14 +0400, Ilya Matveychikov <matvejchikov@gmail.com> wrote:
>> Hi guys,
>>
>> I have a problem running DPDK with the `--no-huge` option. The problem seems to have appeared with commit cdc242f260e766bd95a658b5e0686a62ec04f5b0, and this is the change that affects me:
>>
>> +	if ((page & 0x7fffffffffffffULL) == 0)
>> +		return RTE_BAD_PHYS_ADDR;
>> +
>>
>> I tried to create a memory pool using rte_pktmbuf_pool_create(). I dug into the issue and found that in my case the "page" value is 0x0080000000000000, which means that the page is not present and is "soft-dirty" (according to the kernel's documentation):
>>
>> * Bits 0-54 page frame number (PFN) if present
>> * Bits 0-4 swap type if swapped
>> * Bits 5-54 swap offset if swapped
>> * Bit 55 pte is soft-dirty (see Documentation/vm/soft-dirty.txt)
>> * Bit 56 page exclusively mapped (since 4.2)
>> * Bits 57-60 zero
>> * Bit 61 page is file-page or shared-anon (since 3.5)
>> * Bit 62 page swapped
>> * Bit 63 page present
>>
>> So, before the change mentioned above everything "worked" fine and such pages were not handled. But now the check causes rte_mempool_populate_default() to fail with -EINVAL...
>> Can anyone familiar with memory pool allocation help with the issue?
>>
>> Thanks in advance,
>> Ilya Matveychikov.
>>
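
For reference, the pagemap layout quoted above can be decoded roughly as follows. This is only a minimal sketch: the names `pm_entry`, `pm_decode()` and the `PM_*` macros are made up for illustration and are not DPDK or kernel API. It just shows why 0x0080000000000000 reads as "soft-dirty, not present, PFN 0".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions follow the kernel pagemap documentation quoted above.
 * All names here are hypothetical, chosen for this sketch only. */
#define PM_PFN_MASK   ((1ULL << 55) - 1)   /* bits 0-54 */
#define PM_SOFT_DIRTY (1ULL << 55)
#define PM_SWAPPED    (1ULL << 62)
#define PM_PRESENT    (1ULL << 63)

struct pm_entry {
	uint64_t pfn;        /* meaningful only when present is true */
	bool soft_dirty;
	bool swapped;
	bool present;
};

/* Split one raw 64-bit pagemap entry into its fields. */
struct pm_entry pm_decode(uint64_t raw)
{
	struct pm_entry e;

	e.present = (raw & PM_PRESENT) != 0;
	e.swapped = (raw & PM_SWAPPED) != 0;
	e.soft_dirty = (raw & PM_SOFT_DIRTY) != 0;
	/* Bits 0-54 hold the PFN only when the page is present. */
	e.pfn = e.present ? (raw & PM_PFN_MASK) : 0;
	return e;
}

/* Usage: the value reported in this thread. */
void pm_selfcheck(void)
{
	struct pm_entry e = pm_decode(0x0080000000000000ULL);

	assert(!e.present && !e.swapped);
	assert(e.soft_dirty);
	assert(e.pfn == 0);
}
```

With such an entry the low 55 bits are all zero, so the `(page & 0x7fffffffffffffULL) == 0` check in the commit matches and RTE_BAD_PHYS_ADDR is returned.
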
>
> I can reproduce the issue:
>
> make config T=x86_64-native-linuxapp-gcc
> make -j32 EXTRA_CFLAGS="-O0 -g"
> mkdir -p /mnt/huge
> mount -t hugetlbfs nodev /mnt/huge
> echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>
> # ok
> ./build/app/testpmd -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained
>
> # fail
> ./build/app/testpmd --no-huge -l 2,4 --log-level 8 --vdev=eth_null0 -- --no-numa --total-num-mbufs=4096 -i --port-topology=chained
>
>
> I confirm that rte_mem_virt2phy() returns RTE_BAD_PHYS_ADDR,
> which makes rte_mempool_populate_virt() fail.
>
> Reverting cdc242f260e7 ("eal/linux: support running as unprivileged user")
> fixes the problem. Actually, it makes rte_mem_virt2phy() return 0 instead
> of RTE_BAD_PHYS_ADDR, which is seen as a valid address.
>
> I think querying the physical address when using --no-huge does not make
> sense because the memory is not locked, and could be swapped.
>
> Another strange thing: when using --no-huge, the physical address returned
> when allocating a memzone is the virtual address.
>
> I see several solutions to fix the issue:
>
> 1/ Always set physical addresses to RTE_BAD_PHYS_ADDR when started
> with --no-huge. We consider that the physical address is invalid
> in that case and must not be used.
>
> This impacts rte_mem_virt2phy() and memzone_reserve*() functions.
>
> In rte_mempool_populate_virt(), don't expect a physical address
> if the application is started with --no-huge.
>
> 2/ Change rte_mem_virt2phy() to return the virtual address in place of
> the physical address when started with --no-huge. This is
> wrong, but consistent with what is done in memzones today.
>
> In rte_mem_virt2phy(), add at the beginning:
>
> 	if (!rte_eal_has_hugepages())
> 		return (intptr_t)virtaddr;
>
> 3/ lock pages in memory by reverting
> 729f17a932dd ("mem: revert page locking when not using hugepages")
>
> This would make the physical address available.
> As explained in the commit log, this would also break the ability to
> start dpdk with --no-huge for non-root users.
>
>
> I think 1/ is better. I'm sending a patch in reply to this mail.
> Ilya, please let me know if it fixes your issue.
>
> Regards,
> Olivier
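
To spell out how I read option 1/ above, here is a rough, self-contained sketch. It is not the actual patch: `sketch_virt2phy()`, `sketch_populate_check()` and `SKETCH_BAD_PHYS_ADDR` are hypothetical stand-ins for rte_mem_virt2phy(), the address check in rte_mempool_populate_virt() and RTE_BAD_PHYS_ADDR, and the `has_hugepages` flag stands in for rte_eal_has_hugepages().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for RTE_BAD_PHYS_ADDR (assumed here to be "all ones"). */
#define SKETCH_BAD_PHYS_ADDR UINT64_MAX

/* Stand-in for rte_mem_virt2phy(): without hugepages the memory is not
 * locked, so never advertise a physical address at all. */
uint64_t sketch_virt2phy(bool has_hugepages, uint64_t pagemap_entry,
			 uint64_t page_offset, uint64_t page_size)
{
	uint64_t pfn;

	if (!has_hugepages)
		return SKETCH_BAD_PHYS_ADDR;

	/* Same PFN test as the check added by cdc242f260e7. */
	pfn = pagemap_entry & 0x7fffffffffffffULL;
	if (pfn == 0)
		return SKETCH_BAD_PHYS_ADDR;

	return pfn * page_size + page_offset;
}

/* Stand-in for the mempool side: a bad physical address is an error
 * only when hugepages are in use. */
int sketch_populate_check(bool has_hugepages, uint64_t paddr)
{
	if (has_hugepages && paddr == SKETCH_BAD_PHYS_ADDR)
		return -EINVAL;
	return 0;
}

/* Usage: the --no-huge case from this thread no longer fails. */
void sketch_selfcheck(void)
{
	uint64_t pa = sketch_virt2phy(false, 0x0080000000000000ULL, 0, 4096);

	assert(pa == SKETCH_BAD_PHYS_ADDR);
	assert(sketch_populate_check(false, pa) == 0);
}
```

The point of this variant is that callers must treat the physical address as unusable with --no-huge, instead of receiving 0 or the virtual address and taking it for a valid one.
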
Thread overview: 12+ messages
2017-05-14 10:34 Ilya Matveychikov
2017-06-09 8:27 ` Olivier Matz
2017-06-09 8:29 ` [dpdk-dev] [PATCH] eal: don't advertise a physical address when no hugepages Olivier Matz
2017-06-10 8:31 ` Jan Blunck
2017-06-23 8:11 ` Olivier Matz
2017-06-23 17:08 ` Jan Blunck
2017-06-26 7:11 ` santosh
2017-06-12 13:58 ` Adrien Mazarguil
2017-07-03 10:04 ` [dpdk-dev] [PATCH v2] " Olivier Matz
2017-07-03 10:17 ` Jan Blunck
2017-07-04 15:53 ` [dpdk-dev] [dpdk-stable] " Thomas Monjalon
2017-06-09 12:08 ` Ilya Matveychikov [this message]