DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: David Marchand <david.marchand@6wind.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	 "Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"Traynor, Kevin" <kevin.traynor@intel.com>,
	"pmatilai@redhat.com" <pmatilai@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] dropping librte_ivshmem - was log: deprecate history dump
Date: Fri, 10 Jun 2016 09:47:07 +0000
Message-ID: <C6ECDF3AB251BE4894318F4E451236978212BFFF@IRSMSX109.ger.corp.intel.com>
In-Reply-To: <1594485.zPMdI6dQJ2@xps13>

> > Hi Thomas,
> >
> > Just a few notes:
> >
> > > 3/ The automatic mapped allocation of DPDK objects in the guest.
> > > It should not be done in EAL.
> > > An ivshmem driver would be called by rte_eal_dev_init.
> > > It would check where are the shared DPDK structures, as currently
> > > done with the IVSHMEM_MAGIC (0x0BADC0DE), and do the appropriate
> > > allocations.
> > > Thus only the driver would depend on ring and mempool.
> >
> > The problem here is IVSHMEM doesn't allocate the memory from DPDK, it
> > allocates new memory segments by mapping a PCI device. I.e. it doesn't do
> > mallocs, it modifies mem_config and adds memory to DPDK. Can that be
> > done from within a PMD?
> 
> Everything is possible :)
> Maybe you just need to add an API to add some memory segments.
> Other question: why is it so important to register these memory segments in
> EAL? I think they just need to be known by the ivshmem driver which map
> some objects on top.

That's because we need the memzone_lookup functionality. We can get by without it for rings, because those are tailq-based and we can simply add the rings to the tailq. Memzones, however, are looked up through the memconfig, so IVSHMEM memzones have to be present there for the code to work without any additional APIs.

Although, I guess we don't really need to have _memsegs_ in order to look up memzones - we just have to create some memzones directly inside mcfg, bypassing the normal memzone_reserve path. That would still be a hack, but probably much less of a hack than what there is right now :)
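
To make the idea concrete, here is a rough sketch of "create the memzone entry directly, then look it up by name". It is a toy model only: toy_memzone, toy_mem_config, toy_memzone_inject and toy_memzone_lookup are hypothetical names made up for illustration, and the layouts are not the real rte_memzone / rte_mem_config structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEMZONE 8   /* hypothetical table size */
#define TOY_NAMESIZE    32  /* hypothetical name limit */

/* Toy stand-in for a memzone descriptor (NOT the real rte_memzone). */
struct toy_memzone {
	char name[TOY_NAMESIZE];
	void *addr;
	size_t len;
};

/* Toy stand-in for the shared memory config (NOT the real rte_mem_config). */
struct toy_mem_config {
	struct toy_memzone memzone[TOY_MAX_MEMZONE];
	unsigned int memzone_cnt;
};

/* Find a memzone by name, or return NULL if it is not registered. */
static const struct toy_memzone *
toy_memzone_lookup(const struct toy_mem_config *mcfg, const char *name)
{
	assert(mcfg != NULL && name != NULL);
	for (unsigned int i = 0; i < mcfg->memzone_cnt; i++)
		if (strcmp(mcfg->memzone[i].name, name) == 0)
			return &mcfg->memzone[i];
	return NULL;
}

/*
 * Register an already-mapped region as a memzone. Nothing is allocated:
 * the descriptor is written straight into the table, which is the
 * "bypass the reserve path" idea. Returns 0 on success, -1 on error.
 */
static int
toy_memzone_inject(struct toy_mem_config *mcfg, const char *name,
		   void *addr, size_t len)
{
	assert(mcfg != NULL && name != NULL);
	if (mcfg->memzone_cnt >= TOY_MAX_MEMZONE)
		return -1;              /* table full */
	if (strlen(name) >= TOY_NAMESIZE)
		return -1;              /* name too long */
	if (toy_memzone_lookup(mcfg, name) != NULL)
		return -1;              /* duplicate name */

	struct toy_memzone *mz = &mcfg->memzone[mcfg->memzone_cnt++];
	strcpy(mz->name, name);
	mz->addr = addr;
	mz->len = len;
	return 0;
}
```

In this model the driver would call toy_memzone_inject() once per shared object after mapping the device memory, and any later lookup by name would then find it without the memory ever appearing as a memseg.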

Another possible issue here is the order in which the memory is allocated. We put IVSHMEM init in EAL because we have to map things at specific addresses. The later IVSHMEM initializes, the greater the chance that something else will already occupy the address space that IVSHMEM needs. This could probably be solved with --base-virtaddr, so the documentation would have to be updated to advise using that flag.
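
The ordering concern can be illustrated with plain mmap(), independent of DPDK. The sketch below assumes Linux and uses a hypothetical helper, map_at_exact_addr(), which asks for a specific address without MAP_FIXED and treats any other placement as failure. If something else has already taken the range, the request cannot be satisfied.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Try to map len bytes of anonymous memory at exactly `wanted`.
 * MAP_FIXED is deliberately not used, so an existing mapping is never
 * overwritten; the address is only a hint to the kernel.
 * Returns the mapping on success, or NULL if the mapping failed or the
 * kernel placed it somewhere else.
 */
static void *
map_at_exact_addr(void *wanted, size_t len)
{
	assert(len > 0);
	void *got = mmap(wanted, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (got == MAP_FAILED)
		return NULL;
	if (got != wanted) {
		/* Address already in use (or hint ignored): give up. */
		munmap(got, len);
		return NULL;
	}
	return got;
}
```

The earlier this kind of request is made during startup, the fewer mappings exist that could collide with it, which is the argument for doing it early or for steering other allocations away with --base-virtaddr.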

> 
> > > The last step of the ivshmem cleanup will be to remove the memory
> > > hack RTE_EAL_SINGLE_FILE_SEGMENTS. Then CONFIG_RTE_LIBRTE_IVSHMEM
> > > could be removed.
> >
> > The reason for that hack is that we often need to map several hugepages,
> > and some of those pages could be 2M in size. If you're sharing 1G worth of
> > contiguous memory backed by 2M pages, that's 512 files in the command line
> > in vanilla DPDK, but can be made into one with
> > RTE_EAL_SINGLE_FILE_SEGMENTS, so that QEMU command-line doesn't get
> > overly long.
> >
> > So removing this hack, while definitely desired, will adversely affect
> > some use cases, such as using IVSHMEM on platforms where 1G pages
> > aren't supported. Whether we want to go with the effort of supporting
> > those is of course an open question - I personally don't have any data
> > on IVSHMEM userbase. Maybe Kevin/other OVS devs could help me out
> > here :)
> 
> We can keep supporting 2M pages by having a command line option, instead
> of the #ifdef RTE_EAL_SINGLE_FILE_SEGMENTS.
> But as I said, it is not the top priority to remove this hack.

Ah, so you're not suggesting removing the _functionality_, just the #ifdef? That could be made to work, I guess...

Also, please correct me if I'm wrong, but I seem to remember some patches about putting all memory in a single file - I think that should work for IVSHMEM as well, because I believe IVSHMEM handles holes in files just fine, and can map even if everything resides inside a single file. So if that patch does what I think it does, we might just integrate it and remove the single file segments code entirely.
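
As a rough illustration of the "everything in one file, with holes" idea, the sketch below maps two separate segments out of a single sparse file at different offsets. It assumes an ordinary temporary file under /tmp rather than hugetlbfs, and the helper names create_sparse_file() and map_segment() are hypothetical; it says nothing about what the patches mentioned above actually do.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create an unlinked temporary file of the given size. ftruncate() only
 * sets the length, so untouched ranges remain holes.
 * Returns an open file descriptor, or -1 on error.
 */
static int
create_sparse_file(off_t size)
{
	assert(size > 0);
	char path[] = "/tmp/single_file_XXXXXX";
	int fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);           /* file lives on until fd is closed */
	if (ftruncate(fd, size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Map one segment of the file as shared memory. `offset` must be a
 * multiple of the page size. Returns NULL on failure.
 */
static void *
map_segment(int fd, off_t offset, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		       fd, offset);
	return p == MAP_FAILED ? NULL : p;
}
```

A consumer handed the single file descriptor plus a list of (offset, length) pairs could map each segment this way, instead of being handed one file per page.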

Thanks,
Anatoly
