DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: David Marchand <david.marchand@redhat.com>, dev@dpdk.org
Cc: jerinj@marvell.com, thomas@monjalon.net,
	John McNamara <john.mcnamara@intel.com>,
	Marko Kovacevic <marko.kovacevic@intel.com>,
	Igor Russkikh <igor.russkikh@aquantia.com>,
	Pavel Belous <pavel.belous@aquantia.com>,
	Ajit Khaparde <ajit.khaparde@broadcom.com>,
	Somnath Kotur <somnath.kotur@broadcom.com>,
	Wenzhuo Lu <wenzhuo.lu@intel.com>,
	John Daley <johndale@cisco.com>,
	Hyong Youb Kim <hyonkim@cisco.com>,
	Qi Zhang <qi.z.zhang@intel.com>,
	Xiao Wang <xiao.w.wang@intel.com>,
	Beilei Xing <beilei.xing@intel.com>,
	Jingjing Wu <jingjing.wu@intel.com>,
	Qiming Yang <qiming.yang@intel.com>,
	Konstantin Ananyev <konstantin.ananyev@intel.com>,
	Matan Azrad <matan@mellanox.com>,
	Shahaf Shuler <shahafs@mellanox.com>,
	Yongseok Koh <yskoh@mellanox.com>,
	Viacheslav Ovsiienko <viacheslavo@mellanox.com>,
	Alejandro Lucero <alejandro.lucero@netronome.com>,
	Nithin Dabilpuram <ndabilpuram@marvell.com>,
	Kiran Kumar K <kirankumark@marvell.com>,
	Rasesh Mody <rmody@marvell.com>,
	Shahed Shaikh <shshaikh@marvell.com>,
	Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [dpdk-dev] [PATCH 2/2] eal: fix IOVA mode selection as VA for pci drivers
Date: Fri, 12 Jul 2019 12:03:20 +0100
Message-ID: <615dc23b-639a-01a0-4871-4c168e64bbb9@intel.com>
In-Reply-To: <1562795329-16652-3-git-send-email-david.marchand@redhat.com>

On 10-Jul-19 10:48 PM, David Marchand wrote:
> The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which
> was intended to mean "driver only supports VA" but had been understood
> as "driver supports both PA and VA" by most net drivers and used to let
> DPDK processes run as non-root (which do not have access to physical
> addresses on recent kernels).
> 
> The check on physical addresses actually closed the gap for those
> drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
> flag can retain its intended meaning.
> Document explicitly its meaning.
> 

So, we always assume that all devices support both IOVA as PA and IOVA 
as VA by default. Well, as long as it's understood and documented :)

Unless...


<snip>

> +
> +IOVA Mode is selected by considering what the currently usable Devices on the
> +system require and/or support.
> +
> +Below is the 2-step heuristic for this choice.
> +
> +For the first step, EAL asks each bus its requirement in terms of IOVA mode
> +and decides on a preferred IOVA mode.
> +
> +- if all buses report RTE_IOVA_PA, then the preferred IOVA mode is RTE_IOVA_PA,
> +- if all buses report RTE_IOVA_VA, then the preferred IOVA mode is RTE_IOVA_VA,
> +- if all buses report RTE_IOVA_DC (no bus expressed a preference), then the
> +  preferred mode is RTE_IOVA_DC,
> +- if the buses disagree (at least one wants RTE_IOVA_PA and at least one wants
> +  RTE_IOVA_VA), then the preferred IOVA mode is RTE_IOVA_DC (see below with the
> +  check on Physical Addresses availability),
> +
> +The second step is checking if the preferred mode complies with the Physical
> +Addresses availability since those are only available to root user in recent
> +kernels.
> +
> +- if the preferred mode is RTE_IOVA_PA but there is no access to Physical
> +  Addresses, then EAL init will fail early, since later probing of the devices
> +  would fail anyway,
> +- if the preferred mode is RTE_IOVA_DC then based on the Physical Addresses
> +  availability, the preferred mode is adjusted to RTE_IOVA_PA or RTE_IOVA_VA.
> +  In the case when the buses had disagreed on the IOVA Mode at the first step,
> +  part of the buses won't work because of this decision.
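
For reference, the two steps quoted above can be sketched in standalone C
roughly as follows. This is only an illustration: the enum and function
names are made up for the example and are not the EAL's own, and treating
a bus that reports "don't care" as having no opinion when other buses
report PA or VA is an assumption the quoted text does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the modes named in the patch text;
 * these are not the EAL's own definitions. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

/* Step 1: derive a preferred mode from what each bus reports. */
static enum iova_mode
preferred_iova_mode(const enum iova_mode *bus_modes, size_t nb_buses)
{
	bool want_pa = false, want_va = false;
	size_t i;

	for (i = 0; i < nb_buses; i++) {
		if (bus_modes[i] == IOVA_PA)
			want_pa = true;
		else if (bus_modes[i] == IOVA_VA)
			want_va = true;
		/* IOVA_DC: assumed to express no opinion. */
	}
	if (want_pa && want_va)
		return IOVA_DC; /* buses disagree, defer to step 2 */
	if (want_pa)
		return IOVA_PA;
	if (want_va)
		return IOVA_VA;
	return IOVA_DC; /* no bus expressed a preference */
}

/* Step 2: check the preferred mode against physical address availability.
 * Returns 0 and stores the final mode in *out, or -1 if init must fail. */
static int
resolve_iova_mode(enum iova_mode preferred, bool phys_addrs_available,
		enum iova_mode *out)
{
	assert(out != NULL);

	if (preferred == IOVA_PA && !phys_addrs_available)
		return -1; /* PA required but unavailable: fail early */
	if (preferred == IOVA_DC)
		preferred = phys_addrs_available ? IOVA_PA : IOVA_VA;
	*out = preferred;
	return 0;
}
```

With one bus reporting PA and another reporting VA, step 1 yields "don't
care", and step 2 then picks PA when physical addresses are available and
VA otherwise.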

Is there any specific reason why we always prefer PA if physical 
addresses are available? Since we're already assuming that all devices 
support PA and VA anyway, what's the harm in enabling VA by default?

I seem to recall there were some concerns around SPDK and PA address 
availability - doesn't that mean that the assumption regarding PA and VA 
mode always being supported doesn't actually hold in practice?

By the way, the reason I'm harping on about IOVA as VA being the default
is that having IOVA as PA is not a free (as in beer) choice: we
sacrifice some usability by doing that. Right now, by default, mempool
will ask for IOVA-contiguous memory first, and this is slow in IOVA as
PA mode. As a result, testpmd startup time, for example, is greatly
increased for smaller page sizes because IOVA as PA mode is the default
in DPDK.

I would also like to steer people away from using real physical
addresses, because doing so while requiring lots of IOVA-contiguous
memory also requires legacy mem mode. I would rather people not use and
grow dependent on that mode, and I would like to remove it at some
point, as it adds a lot of complexity for a corner case.

So, picking an address mode is not *just* about whether the device
supports it - it has usability implications as well.

-- 
Thanks,
Anatoly
