From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: David Marchand <david.marchand@redhat.com>,
dev@dpdk.org, jerinj@marvell.com,
John McNamara <john.mcnamara@intel.com>,
Marko Kovacevic <marko.kovacevic@intel.com>,
Igor Russkikh <igor.russkikh@aquantia.com>,
Pavel Belous <pavel.belous@aquantia.com>,
Ajit Khaparde <ajit.khaparde@broadcom.com>,
Somnath Kotur <somnath.kotur@broadcom.com>,
Wenzhuo Lu <wenzhuo.lu@intel.com>,
John Daley <johndale@cisco.com>,
Hyong Youb Kim <hyonkim@cisco.com>,
Qi Zhang <qi.z.zhang@intel.com>,
Xiao Wang <xiao.w.wang@intel.com>,
Beilei Xing <beilei.xing@intel.com>,
Jingjing Wu <jingjing.wu@intel.com>,
Qiming Yang <qiming.yang@intel.com>,
Konstantin Ananyev <konstantin.ananyev@intel.com>,
Matan Azrad <matan@mellanox.com>,
Shahaf Shuler <shahafs@mellanox.com>,
Yongseok Koh <yskoh@mellanox.com>,
Viacheslav Ovsiienko <viacheslavo@mellanox.com>,
Alejandro Lucero <alejandro.lucero@netronome.com>,
Nithin Dabilpuram <ndabilpuram@marvell.com>,
Kiran Kumar K <kirankumark@marvell.com>,
Rasesh Mody <rmody@marvell.com>,
Shahed Shaikh <shshaikh@marvell.com>,
Bruce Richardson <bruce.richardson@intel.com>,
alialnu@mellanox.com, aconole@redhat.com
Subject: Re: [dpdk-dev] [PATCH 2/2] eal: fix IOVA mode selection as VA for pci drivers
Date: Fri, 12 Jul 2019 13:58:46 +0100
Message-ID: <4998e025-ed05-3a56-1e4c-c053cf67a7c4@intel.com>
In-Reply-To: <2927698.45TxNz31xh@xps>

On 12-Jul-19 1:43 PM, Thomas Monjalon wrote:
> 12/07/2019 13:03, Burakov, Anatoly:
>> On 10-Jul-19 10:48 PM, David Marchand wrote:
>>> The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which
>>> was intended to mean "driver only supports VA" but had been understood
>>> as "driver supports both PA and VA" by most net drivers and used to let
>>> DPDK processes run as non-root (non-root users do not have access to
>>> physical addresses on recent kernels).
>>>
>>> The check on physical addresses actually closed the gap for those
>>> drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this
>>> flag can retain its intended meaning.
>>> Document explicitly its meaning.
>>>
>>
>> So, we always assume that all devices support both IOVA as PA and IOVA
>> as VA by default. Well, as long as it's understood and documented :)
>
> Yes
> Please make sure it is well documented.
>
>> Unless...
>>
>>
>> <snip>
>>
>>> +
>>> +IOVA Mode is selected by considering what the current usable Devices on the
>>> +system require and/or support.
>>> +
>>> +Below is the 2-step heuristic for this choice.
>>> +
>>> +For the first step, EAL asks each bus its requirement in terms of IOVA mode
>>> +and decides on a preferred IOVA mode.
>>> +
>>> +- if all buses report RTE_IOVA_PA, then the preferred IOVA mode is RTE_IOVA_PA,
>>> +- if all buses report RTE_IOVA_VA, then the preferred IOVA mode is RTE_IOVA_VA,
>>> +- if all buses report RTE_IOVA_DC (no bus expressed a preference), then the
>>> + preferred mode is RTE_IOVA_DC,
>>> +- if the buses disagree (at least one wants RTE_IOVA_PA and at least one wants
>>> + RTE_IOVA_VA), then the preferred IOVA mode is RTE_IOVA_DC (see below with the
>>> + check on Physical Addresses availability),
>>> +
>>> +The second step is checking if the preferred mode complies with the Physical
>>> +Addresses availability since those are only available to root user in recent
>>> +kernels.
>>> +
>>> +- if the preferred mode is RTE_IOVA_PA but there is no access to Physical
>>> + Addresses, then EAL init will fail early, since later probing of the devices
>>> + would fail anyway,
>>> +- if the preferred mode is RTE_IOVA_DC then based on the Physical Addresses
>>> + availability, the preferred mode is adjusted to RTE_IOVA_PA or RTE_IOVA_VA.
>>> + In the case when the buses had disagreed on the IOVA Mode at the first step,
>>> + part of the buses won't work because of this decision.
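
The two-step heuristic quoted above can be sketched as follows. This is a minimal illustration, not the EAL implementation: the enum values and both function names are hypothetical stand-ins for RTE_IOVA_DC / RTE_IOVA_PA / RTE_IOVA_VA and the real EAL code, and the sketch assumes that a bus reporting "don't care" simply expresses no opinion when other buses do (the quoted text does not spell out that mixed case).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTE_IOVA_DC / RTE_IOVA_PA / RTE_IOVA_VA. */
enum iova_mode { IOVA_DC, IOVA_PA, IOVA_VA };

/* Step 1: combine what each bus reports into a preferred mode.
 * Assumption: a bus reporting IOVA_DC has no opinion and is ignored. */
static enum iova_mode
preferred_iova_mode(const enum iova_mode *bus_modes, size_t n)
{
	bool want_pa = false, want_va = false;

	for (size_t i = 0; i < n; i++) {
		if (bus_modes[i] == IOVA_PA)
			want_pa = true;
		else if (bus_modes[i] == IOVA_VA)
			want_va = true;
	}
	if (want_pa && want_va)
		return IOVA_DC;	/* buses disagree: defer to step 2 */
	if (want_pa)
		return IOVA_PA;
	if (want_va)
		return IOVA_VA;
	return IOVA_DC;		/* no bus expressed a preference */
}

/* Step 2: reconcile the preferred mode with physical address availability.
 * Returns 0 and stores the final mode in *out, or -1 when init would fail
 * (PA is required but physical addresses are not accessible). */
static int
select_iova_mode(const enum iova_mode *bus_modes, size_t n,
		 bool phys_addrs_available, enum iova_mode *out)
{
	enum iova_mode mode = preferred_iova_mode(bus_modes, n);

	if (mode == IOVA_PA && !phys_addrs_available)
		return -1;
	if (mode == IOVA_DC)
		mode = phys_addrs_available ? IOVA_PA : IOVA_VA;
	*out = mode;
	return 0;
}
```

With this sketch, buses that disagree end up in PA mode when running as root and in VA mode otherwise, which is the case where part of the buses will not work.
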
>>
>> Is there any specific reason why we always prefer PA if physical
>> addresses are available? Since we're already assuming that all devices
>> support PA and VA anyway, what's the harm in enabling VA by default?
>
> If PA is available, it means we are running as root.
> We can assume that using root is a choice, probably related
> to a preference for PA.
>
>> I seem to recall there were some concerns around SPDK and PA address
>> availability - doesn't that mean that the assumption regarding PA and VA
>> mode always being supported doesn't actually hold in practice?
>>
>> By the way, the reason I'm harping away on IOVA as VA being the default
>> is because having IOVA as PA is not a free (as in beer) choice - we
>> sacrifice some usability by doing that. Right now, by default, mempool
>> will ask for IOVA-contiguous memory first, and this is slow in IOVA as
>> PA mode - meaning, e.g. testpmd startup time is greatly increased for
>> smaller page sizes because IOVA as PA mode is the default in DPDK.
>>
>> I would also like to steer people away from using real physical
>> addresses because doing so while requiring lots of IOVA contiguous
>> memory also requires legacy mem mode, which I would rather people not
>> use and grow dependent on, and would like to remove it at some point as
>> it adds a lot of complexity for a corner case.
>
> That's why we should rather encourage people not to run as root.
> We need more documentation about how to run as normal user.
>
>> So, picking address mode is not *just* about whether the device supports
>> them - it has usability implications as well.
>
> If we consider running as root an exception, then it makes
> sense to pick address mode which fits this exception (PA).
>
When you put it that way, that does indeed make sense. Typically though,
developers tend to run as root. I shall hereby stop doing so :)
--
Thanks,
Anatoly