* [dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?
From: Stefan Puiu @ 2015-02-04 15:24 UTC
To: dev

Hi,

I'm trying to alter an existing program to use the Intel DPDK. I'm
using 1.8.0, compiled by me as a shared library
(CONFIG_RTE_BUILD_COMBINE_LIBS=y and CONFIG_RTE_BUILD_SHARED_LIB=y in
.config) on Ubuntu 12.04. The program needs to allocate large blocks
of memory (between 1 and 4 chunks of 4.5GB, also 1-4 chunks of 2.5
GB). I tried changing my C++ code to use an array allocated using
rte_malloc() instead of the std::vector I was using beforehand, but it
seems the call to rte_malloc() fails. I then made a simple test
program using the DPDK that takes a size to allocate and, if that
fails, tries again with sizes of 100MB less, basically the code below.
This is C++ code (well, now that I look it could've been plain C, but
I need C++) compiled with g++-4.6 with '-std=gnu++0x':

    int main(int argc, char **argv)
    {
        int ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
        argc -= ret;
        argv += ret;

        [... check argc >= 2]
        size_t size = strtoul(argv[1], NULL, 10);
        size_t s = size;

        for (size_t i = 0; i < 30; ++i) {
            printf("Trying to allocate %'zu bytes\n", s);
            /* buf was not declared in the posted snippet */
            void *buf = rte_malloc("test", s, 0);
            if (!buf)
                printf("Failed!\n");
            else {
                printf("Success!\n");
                rte_free(buf);
                break;
            }

            /* retry with 100MB less */
            s = s - (100 * 1024ULL * 1024ULL);
        }

        return 0;
    }

I'm getting:

    Trying to allocate 4,832,038,656 bytes
    Failed!
    Trying to allocate 4,727,181,056 bytes
    Failed!
    [...]
    Trying to allocate 2,944,601,856 bytes
    Success!

It's not always the same value, but rte_malloc() usually succeeds
somewhere around 3GB. I'm running on a physical (non-VM) NUMA machine
with 2 physical CPUs, each having 64GB of local memory. The machine
also runs Ubuntu 12.04 server. I've created 16384 hugepages of 2MB:

    echo 16384 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages

I'm running the basic app like this:

    sudo numactl --membind=0 ~/src/test/dpdk_test/alloc -c 1f -n 4 -w 04:00.0 --socket-mem=16384,0 -- 4832038656

I'm trying to run only on NUMA node 0 and only allocate memory from
there - that's what the app I'm moving to the DPDK works like (using
numactl --membind=x and --cpunodebind=x).

Is there an upper limit on the amount of memory rte_malloc() will try
to allocate? I tried both after a reboot and when the machine had been
running for a while, with not much success. Am I missing something?
It's a bit weird to be only able to allocate 3GB out of the 32GB
assigned to the app...

On a related note, what would be a good way to compile the DPDK with
debug info (and preferably -O0)? There's quite a web of .mk files used
and I haven't figured out where the optimization level / debug options
are set.

Thanks in advance,
Stefan.
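
As a side calculation on the numbers in this message, the sketch below
works out how many 2MB hugepages each request needs and how large the
reserved pool is. The helper names (pages_needed, pool_bytes) and the
HUGEPAGE_2MB macro are made up for illustration and are not part of the
DPDK API. With these figures, the first failing request (4,832,038,656
bytes) needs 2305 pages, the request that succeeded (2,944,601,856
bytes) needs 1405, and 16384 pages of 2MB add up to 32 GiB. Note also
that, assuming --socket-mem is expressed in megabytes as the EAL
documentation describes, --socket-mem=16384,0 asks for 16 GiB on
socket 0 rather than the full 32 GiB.

```c
#include <stdint.h>

/* Illustrative constant: size of one 2MB hugepage in bytes. */
#define HUGEPAGE_2MB ((uint64_t)2 * 1024 * 1024)

/* Number of pages of page_size bytes needed to back `bytes`, rounded up.
 * Hypothetical helper, not a DPDK function. */
uint64_t pages_needed(uint64_t bytes, uint64_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Total number of bytes provided by nr_pages pages of page_size bytes.
 * Hypothetical helper, not a DPDK function. */
uint64_t pool_bytes(uint64_t nr_pages, uint64_t page_size)
{
    return nr_pages * page_size;
}
```

Calling pages_needed(4832038656ULL, HUGEPAGE_2MB) gives the page count
for the largest request; the same call with a 1GB page size shows why
far fewer pages would have to line up if 1G pages were used.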
* Re: [dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?
From: Bruce Richardson @ 2015-02-06 11:00 UTC
To: Stefan Puiu; +Cc: dev

On Wed, Feb 04, 2015 at 05:24:58PM +0200, Stefan Puiu wrote:
> [...]
> Is there an upper limit on the amount of memory rte_malloc() will try
> to allocate? I tried both after a reboot and when the machine had been
> running for a while with not much success. Am I missing something?
> It's a bit weird to be only able to allocate 3GB out of the 32GB
> assigned to the app...
> [...]

Does your system support 1G pages? I would recommend using a smaller number of
1G pages vs the huge number of 2MB pages that you are currently using. There
may be issues with the allocations failing due to a lack of contiguous blocks
of memory due to the 2MB pages being spread across memory.

Regards,
/Bruce
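
Bruce's point about contiguity can be illustrated with a small sketch.
It assumes, as a simplification, that a single allocation can be no
larger than the longest run of physically adjacent hugepages; the
function name largest_contiguous_run and the idea of passing in a
sorted array of physical addresses are hypothetical and do not reflect
how DPDK tracks its memory segments internally.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Given the physical addresses of n hugepages, sorted in ascending order,
 * return the size in bytes of the largest physically contiguous run.
 * Two pages are contiguous when the second starts exactly page_size
 * bytes after the first. Returns 0 when n is 0.
 */
uint64_t largest_contiguous_run(const uint64_t *phys_addrs, size_t n,
                                uint64_t page_size)
{
    uint64_t best = 0;
    uint64_t cur = 0;

    for (size_t i = 0; i < n; i++) {
        if (i > 0 && phys_addrs[i] == phys_addrs[i - 1] + page_size)
            cur += page_size;   /* extends the current run */
        else
            cur = page_size;    /* gap (or first page): start a new run */

        if (cur > best)
            best = cur;
    }
    return best;
}
```

With 2MB pages, a 4.5GB request needs more than two thousand pages in
one unbroken run, so a single gap in the wrong place is enough to make
it fail even though the pool as a whole is much larger. With 1GB pages
only a handful of pages have to be adjacent.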
* Re: [dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?
From: Olivier MATZ @ 2015-02-06 11:25 UTC
To: Bruce Richardson, Stefan Puiu; +Cc: dev

Hi,

On 02/06/2015 12:00 PM, Bruce Richardson wrote:
> On Wed, Feb 04, 2015 at 05:24:58PM +0200, Stefan Puiu wrote:
>> [...]
>
> Does your system support 1G pages? I would recommend using a smaller number of
> 1G pages vs the huge number of 2MB pages that you are currently using. There
> may be issues with the allocations failing due to a lack of contiguous blocks
> of memory due to the 2MB pages being spread across memory.

Indeed, rte_malloc() tries to allocate memory which is physically
contiguous. Using 1G pages instead of 2MB pages will probably help
as Bruce suggests. Another idea is to use another allocation method.
It depends on what you want to do with the allocated data (accessed in
dataplane or not), and when you allocate it (in dataplane or not).

For instance, if you want to allocate a large zone at init, you can just
mmap() an anonymous zone in hugetlbfs (your dpdk config needs to keep
unused huge pages for this usage).

By the way, I recently noticed that rte_malloc() does not work well for
data larger than 4GB but I had no time to dig into this issue. There is
probably somewhere in the rte_malloc code where 32 bit addresses are
used.

Regards,
Olivier
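
The mmap() route Olivier mentions could look roughly like the sketch
below. It is one possible reading of his suggestion, not his code: it
uses the Linux-specific MAP_HUGETLB flag on an anonymous mapping rather
than a file on a hugetlbfs mount, and the names map_region and
unmap_region are invented for this example. Memory obtained this way is
not guaranteed to be physically contiguous and is unknown to the DPDK
allocator, so the sketch assumes the buffer is only read and written by
the CPU.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS and MAP_HUGETLB on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of private anonymous memory, readable and writable.
 * If use_hugetlb is non-zero, ask the kernel to back the mapping with
 * huge pages from the system pool; len should then be a multiple of the
 * huge page size, and enough free huge pages must have been left
 * unused by the DPDK application. Returns NULL on failure.
 */
void *map_region(size_t len, int use_hugetlb)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

    if (use_hugetlb)
        flags |= MAP_HUGETLB;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a region obtained from map_region(). Returns 0 on success. */
int unmap_region(void *p, size_t len)
{
    return munmap(p, len);
}
```

A caller would try map_region(size, 1) first and treat a NULL return as
"not enough free huge pages", either falling back to map_region(size, 0)
or reporting the error. Whether such a mapping follows the NUMA policy
set with numactl --membind is worth verifying on the target system.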
* Re: [dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?
From: Stefan Puiu @ 2015-02-10 16:53 UTC
To: Olivier MATZ; +Cc: dev

Hi and thanks for replying,

On Fri, Feb 6, 2015 at 1:25 PM, Olivier MATZ <olivier.matz@6wind.com> wrote:
> [...]
> Indeed, rte_malloc() tries to allocate memory which is physically
> contiguous. Using 1G pages instead of 2MB pages will probably help
> as Bruce suggests. Another idea is to use another allocation method.
> It depends on what you want to do with the allocated data (accessed in
> dataplane or not), and when you allocate it (in dataplane or not).

I wanted to use the memory on the dataplane, yes. I'm not allocating
it on startup, but when I receive a certain external command. The
workers won't be processing packets at that point, though - there's
another command for starting Rx.

>
> For instance, if you want to allocate a large zone at init, you can just
> mmap() an anonymous zone in hugetlbfs (your dpdk config needs to keep
> unused huge pages for this usage).

Yep, I think I'm going to use hugetlbfs. I've tried a simple test
program and it was successful in allocating the amount I wanted.
Hopefully get_hugepage_region() honors the mempolicy.

Thanks,
Stefan.