From: Thomas Monjalon
To: "Gonzalez Monroy, Sergio"
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] libhugetlbfs
Date: Thu, 23 Jul 2015 13:47:33 +0200

2015-07-23 10:29, Gonzalez Monroy, Sergio:
> On 23/07/2015 09:12, Thomas Monjalon wrote:
> > 2015-07-23 08:34, Gonzalez Monroy, Sergio:
> >> On 22/07/2015 11:40, Thomas Monjalon wrote:
> >>> Sergio,
> >>>
> >>> As the maintainer of memory allocation, would you consider using
> >>> libhugetlbfs in DPDK for Linux?
> >>> It may simplify a part of our memory allocator and avoid some potential
> >>> bugs which would already be fixed in the dedicated lib.
> >> I did have a look at it a couple of months ago and I thought there were
> >> a few issues:
> >> - get_hugepage_region/get_huge_pages only allocate default-size huge pages
> >>   (you can set a different default huge page size with environment
> >>   variables, but there is no support for multiple sizes), plus we have no
> >>   guarantee of physically contiguous pages.
> > Speaking about that, we don't always need contiguous pages.
> > Maybe we should take it into account when reserving memory.
> > Some flags, such as DMA (locked physical pages that are not swappable) and
> > CONTIGUOUS, may be considered.
> Sure. I think I also mentioned this as possible future work in the
> Dynamic Memzones RFC.
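To make that first point concrete, here is a minimal sketch of the allocation
calls in question (the stock hugetlbfs.h interface, linked with -lhugetlbfs);
it only illustrates the default-page-size limitation, it is not proposed DPDK
code:

/* Minimal sketch of the libhugetlbfs allocation calls discussed above.
 * Link with -lhugetlbfs; assumes hugepages are reserved and a hugetlbfs
 * mount exists. Illustration only, not DPDK code. */
#include <stdio.h>
#include <hugetlbfs.h>

int main(void)
{
	long sizes[8];
	int n = gethugepagesizes(sizes, 8);	/* sizes the kernel supports */
	for (int i = 0; i < n; i++)
		printf("supported hugepage size: %ld\n", sizes[i]);

	/* get_huge_pages() only hands out pages of the *default* size
	 * (tunable via the HUGETLB_DEFAULT_PAGE_SIZE environment variable)
	 * and gives no guarantee of physical contiguity. */
	long def = gethugepagesize();
	void *buf = get_huge_pages(4 * def, GHP_DEFAULT);
	if (buf == NULL) {
		perror("get_huge_pages");
		return 1;
	}
	free_huge_pages(buf);
	return 0;
}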
> >> - That leaves us with hugetlbfs_unlinked_fd/hugetlbfs_unlinked_fd_for_size.
> >>   These APIs wouldn't simplify the current code much, just the allocation
> >>   of the pages themselves (i.e. creating a file in the hugetlbfs mount).
> >>   Then there is the issue with multi-process: because they return a file
> >>   descriptor while unlinking the file, we would need some sort of
> >>   inter-process communication to pass the descriptors to secondary
> >>   processes.
> >> - Not a big deal, but AFAIK it is not possible to have multiple mount
> >>   points for the same hugepage size, and even if you do,
> >>   hugetlbfs_find_path_for_size always returns the same path (i.e. the
> >>   first one found in the list).
> >> - We still need to parse /proc/self/pagemap to get the physical address
> >>   of mapped hugepages.
> >>
> >> I guess that if we were to push for a new API such as
> >> hugetlbfs_fd_for_size, we could use it for the hugepage allocation, but
> >> we would still have to parse /proc/self/pagemap to get the physical
> >> addresses and then order those hugepages.
> >>
> >> Thoughts?
> > Why not extend the API and push our code to this lib?
> > It would allow us to share the maintenance.
> >
> > The same move could be done with libpciaccess.
> I don't disagree with the idea of using libhugetlbfs, I just tried to
> point out that it's not just a drop-in replacement.
Yes, thank you for the fine analysis, Sergio.
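For reference, the /proc/self/pagemap lookup mentioned above boils down to
something like the sketch below (the same idea as DPDK's rte_mem_virt2phy,
but simplified; it needs root, and since Linux 4.0 the PFN reads back as 0
without CAP_SYS_ADMIN):

/* Minimal sketch of a virtual-to-physical lookup via /proc/self/pagemap.
 * Each virtual page has one 64-bit entry: bit 63 = page present,
 * bits 0-54 = page frame number. Illustration only, not DPDK code. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>

static uint64_t virt2phys(const void *virt)
{
	long pgsz = sysconf(_SC_PAGESIZE);
	int fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return 0;

	/* one 64-bit entry per virtual page */
	off_t off = ((uintptr_t)virt / pgsz) * sizeof(uint64_t);
	uint64_t entry = 0;
	if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) {
		close(fd);
		return 0;
	}
	close(fd);

	if (!(entry & (1ULL << 63)))		/* page not present */
		return 0;
	uint64_t pfn = entry & ((1ULL << 55) - 1);	/* bits 0-54 */
	return pfn * pgsz + ((uintptr_t)virt % pgsz);
}

int main(void)
{
	int x = 42;
	printf("virt %p -> phys 0x%llx\n", (void *)&x,
	       (unsigned long long)virt2phys(&x));
	return 0;
}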