DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>,
	"Thomas Monjalon" <thomas.monjalon@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] no hugepage with UIO poll-mode driver
Date: Wed, 25 Nov 2015 14:12:20 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836ACD0DB@irsmsx105.ger.corp.intel.com>
In-Reply-To: <5655BB2C.4090806@intel.com>



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Sergio Gonzalez Monroy
> Sent: Wednesday, November 25, 2015 1:44 PM
> To: Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] no hugepage with UIO poll-mode driver
> 
> On 25/11/2015 13:22, Thomas Monjalon wrote:
> > 2015-11-25 12:02, Bruce Richardson:
> >> On Wed, Nov 25, 2015 at 12:03:05PM +0100, Thomas Monjalon wrote:
> >>> 2015-11-25 11:00, Bruce Richardson:
> >>>> On Wed, Nov 25, 2015 at 11:23:57AM +0100, Thomas Monjalon wrote:
> >>>>> 2015-11-25 10:08, Bruce Richardson:
> >>>>>> On Wed, Nov 25, 2015 at 03:39:17PM +0900, Younghwan Go wrote:
> >>>>>>> Hi Jianfeng,
> >>>>>>>
> >>>>>>> Thanks for the email. rte mempool was successfully created without any
> >>>>>>> error. Now the next problem is that rte_eth_rx_burst() is always returning 0
> >>>>>>> as if there was no packet to receive... Do you have any suggestion on what
> >>>>>>> might be causing this issue? In the meantime, I will be digging through
> >>>>>>> ixgbe driver code to see what's going on.
> >>>>>>>
> >>>>>>> Thank you,
> >>>>>>> Younghwan
> >>>>>>>
> >>>>>> The problem is that with --no-huge we don't have the physical address of the memory
> >>>>>> to write to the network card. That's why it's marked as for testing only.
> >>>>> Even with rte_mem_virt2phy() + rte_mem_lock_page() ?
> >>>>>
> >>>> With no-huge, we just set up a single memory segment at startup and set its
> >>>> "physaddr" to be the virtual address.
> >>>>
> >>>>          /* hugetlbfs can be disabled */
> >>>>          if (internal_config.no_hugetlbfs) {
> >>>>                  addr = mmap(NULL, internal_config.memory, PROT_READ | PROT_WRITE,
> >>>>                                  MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
> >>>>                  if (addr == MAP_FAILED) {
> >>>>                          RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__,
> >>>>                                          strerror(errno));
> >>>>                          return -1;
> >>>>                  }
> >>>>                  mcfg->memseg[0].phys_addr = (phys_addr_t)(uintptr_t)addr;
> >>> rte_mem_virt2phy() does not use memseg.phys_addr but /proc/self/pagemap:
> >>>
> >>>      /*
> >>>       * the pfn (page frame number) are bits 0-54 (see
> >>>       * pagemap.txt in linux Documentation)
> >>>       */
> >>>      physaddr = ((page & 0x7fffffffffffffULL) * page_size)
> >>>          + ((unsigned long)virtaddr % page_size);
> >>>
> >> Yes, you are right. I was not aware that that function was used as part of the
> >> mempool init, but now I see that "rte_mempool_virt2phy()" does indeed call that
> >> function if hugepages are disabled, so my bad.
> > Do you think we could move --no-huge in the main section (not only for testing)?
> Hi,
> 
> I think the main issue is going to be the HW descriptor queues.
> AFAIK drivers now call rte_eth_dma_zone_reserve, which is basically a
> wrapper around rte_memzone_reserve, to get a chunk of physical memory,
> and in the case of --no-huge that memory is not really going to be
> physically contiguous.
> 
> Ideally we would move the dma_zone_reserve API to the EAL and expand its
> functionality, so we could detect what page size we have and set the
> boundary to that page size.
> dma_zone_reserve does something similar for the Xen target by reserving
> memzones on a 2MB boundary.

With Xen we have a special kernel driver that allocates physically contiguous
chunks of memory for us.
So we can guarantee that each such chunk is at least 2MB long.
That's enough to allocate HW rings (the max HW ring size for, let's say, ixgbe is ~64KB).
Here there is absolutely no guarantee that memory allocated by the kernel will be physically contiguous.
Of course you can search through all the pages that you allocated, and most likely you'll find a
contiguous chunk big enough for that.
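
A rough sketch of such a search, to make the idea concrete. It assumes the
caller has already collected one physical address per page (for example via
rte_mem_virt2phy() on each page of the area). The helper name
find_contig_run() and its parameters are hypothetical, not part of DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): given the physical address of each
 * page of a virtually contiguous area, find the first run of pages that is
 * physically contiguous and at least `need` bytes long.
 * Returns the index of the first page of the run, or -1 if none exists.
 */
static long
find_contig_run(const uint64_t *phys, size_t n_pages, size_t page_size,
		size_t need)
{
	size_t start = 0;
	size_t i;

	assert(page_size != 0);
	if (n_pages == 0 || need == 0)
		return -1;

	for (i = 0; i < n_pages; i++) {
		/* A gap in the physical addresses starts a new candidate run. */
		if (i > 0 && phys[i] != phys[i - 1] + page_size)
			start = i;
		/* Stop as soon as the current run is long enough. */
		if ((i - start + 1) * page_size >= need)
			return (long)start;
	}
	return -1;
}
```

For a ~64KB ring on 4KB pages this would need a run of 16 pages; whether
one exists depends entirely on what the kernel happened to hand out.
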
Another problem is mbufs.
You need to be sure that no mbuf crosses a page boundary
(in case the next page is not physically adjacent to the current one).
So you'll probably need to use rte_mempool_xmem_create() to allocate mbufs from non-hugepage memory.
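
The page-boundary condition can be illustrated with a small sketch. This is
not how the mempool code lays out objects; obj_crosses_page() and
count_objs_no_crossing() are hypothetical helpers that only show the check
and what it costs in usable objects per area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a DPDK API): returns 1 if an object of `len`
 * bytes starting at virtual address `addr` spans more than one page.
 * `page_size` must be a power of two.
 */
static int
obj_crosses_page(uintptr_t addr, size_t len, size_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (len == 0)
		return 0;
	/* Compare the page index of the first and the last byte. */
	return (addr / page_size) != ((addr + len - 1) / page_size);
}

/*
 * Count how many objects of `elt_size` bytes fit in [base, base + area_len)
 * when every object that would cross a page boundary is moved to the start
 * of the next page instead.
 */
static size_t
count_objs_no_crossing(uintptr_t base, size_t area_len, size_t elt_size,
		       size_t page_size)
{
	uintptr_t cur = base;
	uintptr_t end = base + area_len;
	size_t n = 0;

	/* An object larger than a page can never avoid a boundary. */
	if (elt_size == 0 || elt_size > page_size)
		return 0;

	while (cur + elt_size <= end) {
		if (obj_crosses_page(cur, elt_size, page_size)) {
			/* Skip the unusable tail of the current page. */
			cur = (cur / page_size + 1) * page_size;
			continue;
		}
		n++;
		cur += elt_size;
	}
	return n;
}
```

With 4KB pages and an element of 2304 bytes (an assumed size, picked only
for illustration), only one element fits per page, so a large share of the
memory is lost to padding.
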
BTW, as I remember, with vfio in place you should be able to do IO with the no-hugepages option, no?
It relies on vfio's ability to set up the IOMMU tables for you.
Konstantin

> 
> Sergio

Thread overview: 13+ messages
2015-11-25  5:45 Younghwan Go
2015-11-25  6:19 ` Tan, Jianfeng
2015-11-25  6:39   ` Younghwan Go
2015-11-25 10:08     ` Bruce Richardson
2015-11-25 10:23       ` Thomas Monjalon
2015-11-25 11:00         ` Bruce Richardson
2015-11-25 11:03           ` Thomas Monjalon
2015-11-25 12:02             ` Bruce Richardson
2015-11-25 13:22               ` Thomas Monjalon
2015-11-25 13:44                 ` Sergio Gonzalez Monroy
2015-11-25 14:12                   ` Ananyev, Konstantin [this message]
2015-11-26  4:47                     ` Younghwan Go
2015-11-26 14:37                       ` Kyle Larose
