Message-ID: <54D4A4C1.4010109@6wind.com>
Date: Fri, 06 Feb 2015 12:25:53 +0100
From: Olivier MATZ
To: Bruce Richardson, Stefan Puiu
Cc: dev@dpdk.org
In-Reply-To: <20150206110014.GA16144@bricha3-MOBL3>
Subject: Re: [dpdk-dev] upper limit on the size of allocation through rte_malloc in dpdk-1.8.0?

Hi,

On 02/06/2015 12:00 PM, Bruce Richardson wrote:
> On Wed, Feb 04, 2015 at 05:24:58PM +0200, Stefan Puiu wrote:
>> Hi,
>>
>> I'm trying to alter an existing program to use the Intel DPDK. I'm
>> using 1.8.0, compiled by me as a shared library
>> (CONFIG_RTE_BUILD_COMBINE_LIBS=y and CONFIG_RTE_BUILD_SHARED_LIB=y in
>> .config) on Ubuntu 12.04. The program needs to allocate large blocks
>> of memory (between 1 and 4 chunks of 4.5GB, also 1-4 chunks of 2.5
>> GB). I tried changing my C++ code to use an array allocated using
>> rte_malloc() instead of the std::vector I was using beforehand, but it
>> seems the call to rte_malloc() fails.
>> I then made a simple test
>> program using the DPDK that takes a size to allocate and if that
>> fails, tries again with sizes of 100MB less, basically the code below.
>> This is C++ code (well, now that I look it could've been plain C, but
>> I need C++) compiled with g++-4.6 with '-std=gnu++0x':
>>
>> int main(int argc, char **argv)
>> {
>>     int ret = rte_eal_init(argc, argv);
>>     if (ret < 0)
>>         rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
>>     argc -= ret;
>>     argv += ret;
>>
>>     [... check argc >= 2]
>>     size_t size = strtoul(argv[1], NULL, 10);
>>     size_t s = size;
>>
>>     for (size_t i = 0; i < 30; ++i) {
>>         printf("Trying to allocate %'zu bytes\n", s);
>>         buf = rte_malloc("test", s, 0);
>>         if (!buf)
>>             printf ("Failed!\n");
>>         else {
>>             printf ("Success!\n");
>>             rte_free(buf);
>>             break;
>>         }
>>
>>         s = s - (100 * 1024ULL * 1024ULL);
>>     }
>>
>>     return 0;
>> }
>>
>> I'm getting:
>> Trying to allocate 4,832,038,656 bytes
>> Failed!
>> Trying to allocate 4,727,181,056 bytes
>> Failed!
>> [...]
>> Trying to allocate 2,944,601,856 bytes
>> Success!
>>
>> It's not always the same value, but usually somewhere around 3GB
>> rte_malloc() succeeds. I'm running on a physical (non-VM) NUMA machine
>> with 2 physical CPUs, each having 64GBs of local memory. The machine
>> also runs Ubuntu 12.04 server. I've created 16384 hugepages of 2MB:
>>
>> echo 16384 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
>>
>> I'm running the basic app like this:
>>
>> sudo numactl --membind=0 ~/src/test/dpdk_test/alloc -c 1f -n 4 -w
>> 04:00.0 --socket-mem=16384,0 -- 4832038656
>>
>> I'm trying to run only on NUMA node 0 and only allocate memory from
>> there - that's what the app I'm moving to the DPDK works like (using
>> numactl --membind=x and --cpunodebind=x).
>>
>> Is there an upper limit on the amount of memory rte_malloc() will try
>> to allocate? I tried both after a reboot and when the machine had been
>> running for a while with not much success.
>> Am I missing something?
>> It's a bit weird to be only able to allocate 3GB out of the 32GB
>> assigned to the app...
>>
>> On a related note, what would be a good way to compile the DPDK with
>> debug info (and preferably -O0)? There's quite a web of .mk files used
>> and I haven't figured out where the optimization level / debug options
>> are set.
>>
>> Thanks in advance,
>> Stefan.
>
> Does your system support 1G pages? I would recommend using a smaller number of
> 1G pages vs the huge number of 2MB pages that you are currently using. There
> may be issues with the allocations failing due to a lack of contiguous blocks
> of memory due to the 2MB pages being spread across memory.

Indeed, rte_malloc() tries to allocate memory which is physically
contiguous. Using 1G pages instead of 2MB pages will probably help,
as Bruce suggests.

Another idea is to use another allocation method. It depends on what
you want to do with the allocated data (accessed in dataplane or not),
and when you allocate it (in dataplane or not).

For instance, if you want to allocate a large zone at init, you can
just mmap() an anonymous zone in hugetlbfs (your dpdk config needs to
keep unused huge pages for this usage).

By the way, I recently noticed that rte_malloc() does not work well
for data larger than 4GB, but I had no time to dig into this issue.
There is probably a place in the rte_malloc code where 32-bit
addresses are used.

Regards,
Olivier
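
The mmap() idea above could be sketched roughly as follows. This is
only an illustration under assumptions, not code from the thread or
from DPDK: it assumes Linux, reads "anonymous zone in hugetlbfs" as an
anonymous mapping with MAP_HUGETLB, and assumes 2MB huge pages as in
the quoted setup. The names round_up_huge(), alloc_large_zone() and
free_large_zone() are hypothetical helpers. The zone is only virtually
contiguous and is not known to the DPDK allocator.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS and MAP_HUGETLB on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2MB huge pages, as in the setup quoted above. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Round len up to a multiple of the (assumed) huge page size. */
size_t round_up_huge(size_t len)
{
    return (len + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: map an anonymous zone of at least len bytes.
 * Huge pages are tried first; if none are free (or MAP_HUGETLB is not
 * available), fall back to normal pages so the caller still gets
 * memory. *mapped_len receives the real mapping size and *used_huge
 * tells which kind of pages back it. Returns NULL on failure.
 */
void *alloc_large_zone(size_t len, size_t *mapped_len, int *used_huge)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    assert(len > 0);
    *mapped_len = round_up_huge(len);
    *used_huge = 0;

#ifdef MAP_HUGETLB
    p = mmap(NULL, *mapped_len, PROT_READ | PROT_WRITE,
             flags | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }
#endif

    p = mmap(NULL, *mapped_len, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a zone obtained from alloc_large_zone(). */
void free_large_zone(void *p, size_t mapped_len)
{
    if (p != NULL)
        munmap(p, mapped_len);
}
```

A caller would do something like
alloc_large_zone(4832038656UL, &mapped, &huge) once at init and check
huge afterwards: the huge page path can only succeed if enough huge
pages are still free, i.e. not all of them have been taken by the
DPDK application.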