From: Sergio Gonzalez Monroy
To: tom.barbette@ulg.ac.be, Andriy Berestovskyy
Cc: Renata Saiakhova, users
Date: Tue, 4 Oct 2016 15:09:29 +0100
Subject: Re: [dpdk-users] rte_segments: hugepages are not in contiguous memory
Message-ID: <3f747784-4468-87bd-389c-9ed2d51e7c03@intel.com>
In-Reply-To: <512920892.31118614.1475582525691.JavaMail.zimbra@ulg.ac.be>
References: <57F36199.5020100@oneaccess-net.com>
 <57F3787A.6060105@oneaccess-net.com>
 <57F388E5.3010405@oneaccess-net.com>
 <512920892.31118614.1475582525691.JavaMail.zimbra@ulg.ac.be>

Hi folks,

In theory, there shouldn't be any performance difference between having a
mempool allocated from a single memseg versus multiple memsegs (given they
use the same number of hugepages), as it is all done at mempool
creation/setup and each mbuf has its own phys address.

Tom, I cannot think of a reason why you would see a higher memory access
cost with scattered hugepages vs contiguous hugepages. Any details on the
test you were running?

Sergio

On 04/10/2016 13:02, tom.barbette@ulg.ac.be wrote:
> There is a noticeable performance drop with more scattering of the huge
> pages.
>
> I did not measure any difference accurately, but I ended up rebooting my
> DUT between each performance test: the pages get scattered over time and
> with each re-launch of the DPDK application (as opposed to rebooting the
> whole machine), and the tests showed a higher memory access cost each time
> I re-launched the application.
>
> Tom
>
> ----- Original Message -----
> From: "Andriy Berestovskyy"
> To: "Renata Saiakhova"
> Cc: "Sergio Gonzalez Monroy", "users"
> Sent: Tuesday, 4 October 2016 13:27:23
> Subject: Re: [dpdk-users] rte_segments: hugepages are not in contiguous memory
>
> Renata,
> In theory 512 contiguous 2MB huge pages might get transparently promoted
> to a single 1GB "superpage" and a single TLB entry, but I am not even sure
> that is implemented in Linux...
>
> So, I do not think there will be any noticeable performance difference
> between contiguous and non-contiguous 2MB huge pages. But you'd better
> measure it to make sure ;)
>
> Regards,
> Andriy
>
> On Tue, Oct 4, 2016 at 12:48 PM, Renata Saiakhova wrote:
>> Hi Andriy,
>>
>> thanks for your reply. I guess that contiguous memory is requested for
>> performance reasons. Do you know if I can expect a noticeable
>> performance drop using non-contiguous memory?
>>
>> Renata
>>
>>
>> On 10/04/2016 12:13 PM, Andriy Berestovskyy wrote:
>>> Hi Renata,
>>> DPDK supports non-contiguous memory pools, but
>>> rte_pktmbuf_pool_create() uses rte_mempool_create_empty() with flags
>>> set to zero, i.e. it requests contiguous memory.
>>>
>>> As a workaround, in rte_pktmbuf_pool_create() try passing the
>>> MEMPOOL_F_NO_PHYS_CONTIG flag as the last argument to
>>> rte_mempool_create_empty().
>>>
>>> Note that KNI and some PMDs in 16.07 still require contiguous memory
>>> pools, so the trick might not work for your setup. For KNI, try
>>> DPDK's master branch, which includes the commit by Ferruh Yigit:
>>>
>>> 8451269 kni: remove continuous memory restriction
>>>
>>> Regards,
>>> Andriy
>>>
>>>
>>> On Tue, Oct 4, 2016 at 11:38 AM, Renata Saiakhova wrote:
>>>> Hi Sergio,
>>>>
>>>> thank you for your quick answer. I also tried to allocate a 1GB
>>>> hugepage, but it seems the kernel fails to allocate it: previously I
>>>> saw that HugePages_Total in /proc/meminfo was set to 0, and now the
>>>> kernel hangs at boot time (I don't know why).
>>>> But anyway, if there is no way to control hugepage allocation so that
>>>> the pages are in contiguous memory, the only option is to accept it
>>>> and adapt the code so that it creates several pools which in total
>>>> satisfy the requested size.
>>>>
>>>> Renata
>>>>
>>>>
>>>> On 10/04/2016 10:27 AM, Sergio Gonzalez Monroy wrote:
>>>>> On 04/10/2016 09:00, Renata Saiakhova wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I'm using dpdk 16.04 (I tried 16.07 with the same results) and linux
>>>>>> kernel 4.4.20 in a virtual machine (I'm using the libvirt framework).
>>>>>> I pass a parameter on the kernel command line to allocate 512
>>>>>> hugepages of 2 MB at boot time. They are successfully allocated.
>>>>>> When an application with dpdk starts, it calls
>>>>>> rte_pktmbuf_pool_create(), which in turn internally requests
>>>>>> 649363712 bytes. Those bytes should be allocated from one of the
>>>>>> rte_memsegs. rte_memsegs describe contiguous portions of memory
>>>>>> (both physical and virtual) built on hugepages. This allocation
>>>>>> fails, because there is no rte_memseg of that size (or bigger).
>>>>>> Further debugging shows that the hugepages are allocated in
>>>>>> non-contiguous physical memory and therefore the rte_memsegs are
>>>>>> built respecting the gaps in physical memory.
>>>>>> Below are the sizes of the segments built on hugepages (in bytes):
>>>>>> 2097152
>>>>>> 6291456
>>>>>> 2097152
>>>>>> 524288000
>>>>>> 2097152
>>>>>> 532676608
>>>>>> 2097152
>>>>>> 2097152
>>>>>> So there are 5 segments which include only one hugepage!
>>>>>> This behavior is completely different from what I observe with linux
>>>>>> kernel 3.8 (used with the same application with dpdk), where all
>>>>>> hugepages are allocated in contiguous memory.
>>>>>> Does anyone experience the same issue? Could it be some kernel
>>>>>> option which can do the magic? If not, and the kernel can allocate
>>>>>> hugepages in non-contiguous memory, how is dpdk going to resolve it?
>>>>>>
>>>>> I don't think there is anything we can do to force the kernel to
>>>>> pre-allocate contig hugepages on boot. If there was, we wouldn't need
>>>>> to do all this mapping sorting and grouping we do in DPDK, as we
>>>>> would rely on the kernel giving us pre-allocated contig hugepages.
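
(A quick way to see how EAL ended up grouping the hugepages into memsegs,
basically the list of segment sizes Renata posted above, is to dump the
memseg table at runtime. The snippet below is only a rough sketch, assuming
the 16.04/16.07 EAL API, i.e. rte_eal_get_physmem_layout(), RTE_MAX_MEMSEG
and the struct rte_memseg fields; rte_dump_physmem_layout(stdout) prints
much the same information with no extra code.)

    #include <stdio.h>
    #include <inttypes.h>
    #include <rte_memory.h>

    /* Rough sketch (DPDK 16.04/16.07): print each memseg after
     * rte_eal_init() has run, so the physical gaps between the
     * hugepage groups become visible. */
    static void
    dump_memsegs(void)
    {
        const struct rte_memseg *ms = rte_eal_get_physmem_layout();
        unsigned int i;

        for (i = 0; i < RTE_MAX_MEMSEG; i++) {
            if (ms[i].addr == NULL)
                break; /* no more used entries */
            printf("memseg %u: phys=0x%" PRIx64 " virt=%p len=%zu "
                   "hugepage_sz=%" PRIu64 " socket=%d\n",
                   i, ms[i].phys_addr, ms[i].addr, ms[i].len,
                   ms[i].hugepage_sz, (int)ms[i].socket_id);
        }
        /* or simply: rte_dump_physmem_layout(stdout); */
    }
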
>>>>>
>>>>> If you have plenty of memory, one possible workaround would be to
>>>>> increase the number of default hugepages so we are likely to find
>>>>> more contiguous ones.
>>>>>
>>>>> Is using 1GB hugepages a possibility in your case?
>>>>>
>>>>> Sergio
>>>>>
>>>>>> Thanks in advance,
>>>>>> Renata
>>>>>>
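
To make Andriy's MEMPOOL_F_NO_PHYS_CONTIG suggestion above concrete:
instead of patching rte_pktmbuf_pool_create() itself, you can open-code the
same steps and only change the flags argument. The function below is just a
sketch against the 16.07 mempool API (rte_mempool_create_empty(),
rte_mempool_set_ops_byname(), rte_mempool_populate_default()); the name
pktmbuf_pool_create_noncontig() is made up for illustration, it is untested,
and Andriy's caveat still applies: KNI and some PMDs in 16.07 assume
physically contiguous mbuf memory.

    #include <rte_mbuf.h>
    #include <rte_mempool.h>
    #include <rte_errno.h>

    /* Hypothetical variant of rte_pktmbuf_pool_create() (DPDK 16.07) that
     * passes MEMPOOL_F_NO_PHYS_CONTIG instead of 0, so the pool may span
     * several non-contiguous memsegs. Untested sketch. */
    static struct rte_mempool *
    pktmbuf_pool_create_noncontig(const char *name, unsigned n,
            unsigned cache_size, uint16_t priv_size,
            uint16_t data_room_size, int socket_id)
    {
        struct rte_pktmbuf_pool_private mbp_priv;
        struct rte_mempool *mp;
        unsigned elt_size;
        int ret;

        elt_size = sizeof(struct rte_mbuf) + (unsigned)priv_size +
                   (unsigned)data_room_size;
        mbp_priv.mbuf_data_room_size = data_room_size;
        mbp_priv.mbuf_priv_size = priv_size;

        /* The only change vs. rte_pktmbuf_pool_create(): flags != 0 */
        mp = rte_mempool_create_empty(name, n, elt_size, cache_size,
                sizeof(struct rte_pktmbuf_pool_private), socket_id,
                MEMPOOL_F_NO_PHYS_CONTIG);
        if (mp == NULL)
            return NULL;

        ret = rte_mempool_set_ops_byname(mp, RTE_MBUF_DEFAULT_MEMPOOL_OPS,
                NULL);
        if (ret != 0) {
            rte_mempool_free(mp);
            rte_errno = -ret;
            return NULL;
        }
        rte_pktmbuf_pool_init(mp, &mbp_priv);

        ret = rte_mempool_populate_default(mp);
        if (ret < 0) {
            rte_mempool_free(mp);
            rte_errno = -ret;
            return NULL;
        }
        rte_mempool_obj_iter(mp, rte_pktmbuf_init, NULL);

        return mp;
    }

You would call it exactly like rte_pktmbuf_pool_create(), e.g.
pktmbuf_pool_create_noncontig("mbuf_pool", 8192, 256, 0,
RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id()).

As for the 1GB hugepage route: the usual boot parameters are
default_hugepagesz=1G hugepagesz=1G hugepages=N, and in a libvirt guest that
typically also requires a CPU model exposing the pdpe1gb flag, which might
explain the allocation failures Renata saw.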