DPDK patches and discussions
From: Alejandro Lucero <alejandro.lucero@netronome.com>
To: dariusz.stojaczyk@intel.com
Cc: "Burakov, Anatoly" <anatoly.burakov@intel.com>,
	dev <dev@dpdk.org>,
	 Santosh Shukla <santosh.shukla@caviumnetworks.com>,
	Hemant Agrawal <hemant.agrawal@nxp.com>,
	 Jerin Jacob <jerin.jacob@caviumnetworks.com>,
	 Maxime Coquelin <maxime.coquelin@redhat.com>,
	chas3@att.com
Subject: Re: [dpdk-dev] [PATCH v2] eal/bus: use RTE_IOVA_PA only if phys addresses are available
Date: Tue, 30 Oct 2018 12:58:35 +0000
Message-ID: <CAD+H9930AiP+7yENkwc_V-kHXtRRJndDc9qdiULBpO8txugqrw@mail.gmail.com>
In-Reply-To: <FBE7E039FA50BF47A673AD0BD3CD56A8461E8117@HASMSX105.ger.corp.intel.com>

On Mon, Sep 17, 2018 at 2:06 PM Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com> wrote:

>
>
> > -----Original Message-----
> > From: Burakov, Anatoly
> > Sent: Monday, September 17, 2018 12:34 PM
> > To: Stojaczyk, Dariusz <dariusz.stojaczyk@intel.com>; dev@dpdk.org;
> > Santosh Shukla <santosh.shukla@caviumnetworks.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Jerin Jacob
> > <jerin.jacob@caviumnetworks.com>
> > Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; Chas Williams
> > <chas3@att.com>
> > Subject: Re: [PATCH v2] eal/bus: use RTE_IOVA_PA only if phys addresses
> > are available
> >
> > On 07-Sep-18 4:58 PM, Darek Stojaczyk wrote:
> > > When neither RTE_IOVA_VA nor RTE_IOVA_PA is explicitly requested,
> > > DPDK currently falls back to the default RTE_IOVA_PA mode and may
> > > encounter a failure later on if running as a non-privileged user.
> > > Attempting to use RTE_IOVA_VA if no physical addresses are available
> > > may help in this case.
> > >
> > > Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
> > > ---
> > > Changes since v1:
> > >   * added a missing rte_memory.h include
> > >
> > >   lib/librte_eal/common/eal_common_bus.c | 19 +++++++++++++++----
> > >   1 file changed, 15 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/common/eal_common_bus.c
> > > b/lib/librte_eal/common/eal_common_bus.c
> > > index 0943851cc..68c581b8a 100644
> > > --- a/lib/librte_eal/common/eal_common_bus.c
> > > +++ b/lib/librte_eal/common/eal_common_bus.c
> > > @@ -37,6 +37,7 @@
> > >   #include <rte_bus.h>
> > >   #include <rte_debug.h>
> > >   #include <rte_string_fns.h>
> > > +#include <rte_memory.h>
> > >
> > >   #include "eal_private.h"
> > >
> > > @@ -236,9 +237,19 @@ rte_bus_get_iommu_class(void)
> > >                     mode |= bus->get_iommu_class();
> > >     }
> > >
> > > -   if (mode != RTE_IOVA_VA) {
> > > -           /* Use default IOVA mode */
> > > -           mode = RTE_IOVA_PA;
> > > +   if (mode == RTE_IOVA_VA)
> > > +           return RTE_IOVA_VA;
> > > +
> > > +   if (mode & RTE_IOVA_PA) {
> > > +           /* Not all buses support RTE_IOVA_VA, fallback to RTE_IOVA_PA */
> > > +           return RTE_IOVA_PA;
> > > +   }
> > > +
> > > +   if (rte_eal_using_phys_addrs()) {
> > > +           /* Default to RTE_IOVA_PA only if it's supported */
> > > +           return RTE_IOVA_PA;
> > >     }
> > > -   return mode;
> > > +
> > > +   /* Since RTE_IOVA_PA is unsupported, fallback to RTE_IOVA_VA */
> > > +   return RTE_IOVA_VA;
> > >   }
> > >
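
For readers following the diff, the decision order in the patched
rte_bus_get_iommu_class() can be sketched as a standalone function. This is
only an illustration: the enum and function names below are hypothetical
stand-ins (not DPDK symbols), the flag values are assumed to be bit flags as
the `mode |= ...` and `mode & RTE_IOVA_PA` lines in the patch imply, and the
boolean parameter stands in for the result of rte_eal_using_phys_addrs().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the IOVA mode flags used in the patch. */
enum iova_mode {
	IOVA_DC = 0,      /* no bus expressed a preference */
	IOVA_PA = 1 << 0, /* at least one bus needs physical addresses */
	IOVA_VA = 1 << 1  /* at least one bus can use virtual addresses */
};

/*
 * bus_modes is the OR of what every bus reported.
 * phys_addrs_available models the rte_eal_using_phys_addrs() result.
 */
static enum iova_mode
pick_iova_mode(int bus_modes, bool phys_addrs_available)
{
	/* Only the two known flags may be set. */
	assert((bus_modes & ~(IOVA_PA | IOVA_VA)) == 0);

	/* Every bus that answered asked for VA and nobody asked for PA. */
	if (bus_modes == IOVA_VA)
		return IOVA_VA;

	/* Some bus needs PA, so VA is not an option. */
	if (bus_modes & IOVA_PA)
		return IOVA_PA;

	/* No preference: keep PA as the default only if it can work. */
	if (phys_addrs_available)
		return IOVA_PA;

	/* PA is unusable (e.g. unprivileged user), so try VA instead. */
	return IOVA_VA;
}
```

With these assumptions, the only new outcome compared to the old code is the
last branch: no bus preference and no usable physical addresses now yields VA
rather than PA.
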
> >
> > This is a good change, however I think that this is too pessimistic. If
> > I don't have any devices that explicitly require IOVA_PA, I should be
> > running in IOVA_VA mode.
>
> Another problem may occur when trying to hotplug devices that support only
> 39-bit DMA. You may not be able to map any memory with vfio when in
> RTE_IOVA_VA mode, as virtual addresses likely occupy more than 39 bits.
>
>
There is now a hint that tries to map memory as low as possible instead of
using the default Linux mmap base address. This makes devices with addressing
limitations usable, as long as the physical memory to map does not exceed
what those devices can address.
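
To make the constraint concrete, here is a minimal sketch of the idea, not
the actual EAL code. It assumes a 64-bit Linux process; the 4 GiB hint
address and both helper names are hypothetical and chosen only for
illustration. The first helper checks whether a mapping fits within a
device's DMA address width, the second asks mmap() for a low address. Note
that without MAP_FIXED the kernel treats the address purely as a hint, so the
result still has to be checked.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical low hint (4 GiB); assumes a 64-bit address space. */
#define LOW_HINT_ADDR ((uintptr_t)1 << 32)

/*
 * Hypothetical helper: true if every byte of [iova, iova + len) can be
 * addressed by a device limited to addr_bits bits of DMA address.
 */
static bool
fits_dma_width(uint64_t iova, uint64_t len, unsigned int addr_bits)
{
	uint64_t limit;

	assert(addr_bits >= 1 && addr_bits <= 64);
	limit = (addr_bits == 64) ? UINT64_MAX
				  : ((UINT64_C(1) << addr_bits) - 1);
	if (iova > limit)
		return false;
	/* Compare against the remaining room to avoid overflow. */
	return len == 0 || len - 1 <= limit - iova;
}

/*
 * Hypothetical helper: request an anonymous mapping near LOW_HINT_ADDR.
 * Returns NULL on failure. The kernel may place it elsewhere.
 */
static void *
map_low(size_t len)
{
	void *va = mmap((void *)LOW_HINT_ADDR, len, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return va == MAP_FAILED ? NULL : va;
}
```

In IOVA_VA mode the virtual address is what the device sees, so a caller
would pass the address returned by map_low() to fits_dma_width() with the
device's width (39 in the case discussed above) before attempting the VFIO
mapping. A mapping placed at the usual high mmap base would fail that check,
while one that the kernel actually placed near the hint would pass as long as
its end stays below 2^39.
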



> The rte_pci bus enforces RTE_IOVA_PA whenever it finds such devices on
> init.
>
> I have no doubt the logic can be improved here, but for now RTE_IOVA_PA is
> the only safe default.
>
> D.
>
> >
> > This of course doesn't take hotplug into account, so a command-line
> > switch to force one or the other should also be available.
> >
> > For example, at startup I might have devices bound to VFIO, so IOVA_VA
> > mode is picked. However, even though none of the devices require
> > physical addresses at startup, I also know that I might later hotplug a
> > device that requires IOVA_PA (leaving the question of hotplug brokenness
> > aside for now...). Currently, this scenario will not work, as I will be
> > forced to use IOVA_VA mode unless I happen to have an IOVA_PA device
> > available at startup.
> >
> > Similarly, if I'm running DPDK as root but am only using virtual devices
> > like pcap, I should be able to force DPDK into using VA addresses [*], yet
> > currently I will be forced to use IOVA_PA if I don't *also* have a few
> > devices bound exclusively to VFIO.
> >
> > [*] Do we have vdev devices that require IOVA_PA? I can't think of any...
> >
> > --
> > Thanks,
> > Anatoly
>

