From: Imre Pinter <imre.pinter@ericsson.com>
To: "users@dpdk.org" <users@dpdk.org>
Cc: "Gabor Halász" <gabor.halasz@ericsson.com>,
"Péter Suskovics" <peter.suskovics@ericsson.com>
Subject: [dpdk-users] Slow DPDK startup with many 1G hugepages
Date: Thu, 1 Jun 2017 07:55:20 +0000
Message-ID: <VI1PR07MB13578627486437F8339E99FE80F60@VI1PR07MB1357.eurprd07.prod.outlook.com>
In-Reply-To: <VI1PR07MB1357C989E2F7092D9A31ED9F80F30@VI1PR07MB1357.eurprd07.prod.outlook.com>
Hi,
We experience slow startup in DPDK OVS when backing its memory with 1G hugepages instead of 2M hugepages.
Currently we map 2M hugepages as the memory backend for DPDK OVS. In the future we would like to allocate this memory from the 1G hugepage pool. In our current deployments we have a significant amount of 1G hugepages allocated (min. 54G) for VMs, and only 2G of memory on 2M hugepages.
Typical setup for 2M hugepages:
GRUB:
hugepagesz=2M hugepages=1024 hugepagesz=1G hugepages=54 default_hugepagesz=1G
$ grep hugetlbfs /proc/mounts
nodev /mnt/huge_ovs_2M hugetlbfs rw,relatime,pagesize=2M 0 0
nodev /mnt/huge_qemu_1G hugetlbfs rw,relatime,pagesize=1G 0 0
Typical setup for 1GB hugepages:
GRUB:
hugepagesz=1G hugepages=56 default_hugepagesz=1G
$ grep hugetlbfs /proc/mounts
nodev /mnt/huge_qemu_1G hugetlbfs rw,relatime,pagesize=1G 0 0
DPDK OVS startup times based on the ovs-vswitchd.log logs:
* 2M (2G memory allocated) - startup time ~3 sec:
2017-05-03T08:13:50.177Z|00009|dpdk|INFO|EAL ARGS: ovs-vswitchd -c 0x1 --huge-dir /mnt/huge_ovs_2M --socket-mem 1024,1024
2017-05-03T08:13:50.708Z|00010|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
* 1G (56G memory allocated) - startup time ~13 sec:
2017-05-03T08:09:22.114Z|00009|dpdk|INFO|EAL ARGS: ovs-vswitchd -c 0x1 --huge-dir /mnt/huge_qemu_1G --socket-mem 1024,1024
2017-05-03T08:09:32.706Z|00010|ofproto_dpif|INFO|netdev@ovs-netdev: Datapath supports recirculation
I used DPDK 16.11 for OVS and testpmd, and tested on Ubuntu 14.04 with kernels 3.13.0-117-generic and 4.4.0-78-generic.
We had a discussion with Mark Gray (from Intel), and he came up with the following items:
* The ~10 sec time difference is there with testpmd as well.
* They believe it is kernel overhead (mmap is slow, probably because the kernel zeroes the pages). The following instrumented snippet from eal_memory.c produces a timing printout for each page mapping during EAL startup:
        uint64_t start = rte_rdtsc();
        /* map the segment, and populate page tables,
         * the kernel fills this segment with zeros */
        virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_POPULATE, fd, 0);
        if (virtaddr == MAP_FAILED) {
                RTE_LOG(DEBUG, EAL, "%s(): mmap failed: %s\n", __func__,
                                strerror(errno));
                close(fd);
                return i;
        }

        if (orig) {
                hugepg_tbl[i].orig_va = virtaddr;
                printf("Original mapping of page %u took: %"PRIu64" ticks, %"PRIu64" ms\n",
                        i, rte_rdtsc() - start,
                        (rte_rdtsc() - start) * 1000 /
                        rte_get_timer_hz());
        }
A solution could be to mount 1G hugepages to 2 separate directories (2G for OVS and the remainder for the VMs), but the NUMA placement of these hugepages would be non-deterministic, since mount cannot take NUMA-related parameters when mounting hugetlbfs, and fstab performs the mounts during boot.
Do you have a solution on how to use 1G hugepages for VMs and have reasonable DPDK EAL startup time?
Thanks,
Imre