From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Shahaf Shuler <shahafs@mellanox.com>, dev@dpdk.org
Cc: olgas@mellanox.com, yskoh@mellanox.com,
pawelx.wodkowski@intel.com, gowrishankar.m@linux.vnet.ibm.com,
ferruh.yigit@intel.com, thomas@monjalon.net,
arybchenko@solarflare.com, shreyansh.jain@nxp.com
Subject: Re: [dpdk-dev] [RFC] ethdev: introduce DMA memory mapping for external memory
Date: Wed, 14 Nov 2018 11:19:06 +0000 [thread overview]
Message-ID: <ba00bccc-f20d-47a9-052f-3e5b6bc1c2c7@intel.com> (raw)
In-Reply-To: <aae4d2d1d6ceabe661e22ae8a7591193cea62104.1541335203.git.shahafs@mellanox.com>
Hi Shahaf,
Great to see such effort! A few comments below.
Note: halfway through writing my comments I realized that I am starting
with the assumption that this API is a replacement for the current VFIO
DMA mapping APIs. So, if my comments seem out of left field, this is
probably why :)
On 04-Nov-18 12:41 PM, Shahaf Shuler wrote:
> Request for comment on the high level changes present on this patch.
>
> The need to use external memory (memory belonging to the application and
> not part of the DPDK hugepages) is already present.
> It starts with storage apps, which prefer to manage their own memory
> blocks for efficient use of the storage device; continues with GPU-based
> applications, which strive to achieve zero copy while processing the
> packet payload on the GPU core; and ends with vSwitch/vRouter
> applications that simply prefer full control over the memory in use
> (e.g. VPP).
>
> Recent work[1] in DPDK enabled the use of external memory, however
> it mostly focused on VFIO as the only way to map memory.
> While VFIO is common, there are other vendors which use different ways
> to map memory (e.g. Mellanox and NXP[2]).
>
> The work in this patch moves the DMA mapping to vendor-agnostic APIs
> located under ethdev. Ethdev was chosen because a memory map should
> be associated with specific port(s). Otherwise the memory is mapped
> multiple times to different frameworks, and memory ends up being
> wasted on redundant translation tables in the host or in the device.
So, anything other than ethdev (e.g. cryptodev) will not be able to map
memory for DMA?
I have thought about this for some length of time, and I think DMA
mapping belongs in EAL (more specifically, somewhere at the bus layer),
rather than at the device level. Placing this functionality at the device
level means more work to support different device types and puts a burden
on device driver developers to implement their own mapping functions.
However, I have no familiarity with how MLX/NXP devices do their DMA
mapping, so maybe the device-centric approach would be better. We could
provide "standard" mapping functions at the bus level (such as VFIO
mapping functions for the PCI bus), so that this code would not have to be
reimplemented in the drivers.
Moreover, I'm not sure how this is going to work for VFIO. If this is to
be called for each NIC that needs access to the memory, then we'll end
up with double mappings for any NIC that uses VFIO, unless you want each
NIC to be in a separate container.
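For reference, here is roughly what the bus/EAL-level mapping looks like today with the container API from [1]. This is a sketch, not a definitive implementation; error handling and address selection are application-specific:

```c
#include <stdint.h>
#include <stddef.h>
#include <rte_vfio.h>

/* Map an externally allocated block for DMA through the default VFIO
 * container. Because the default container is shared by every VFIO-bound
 * device, calling this once per NIC would attempt duplicate mappings,
 * unless each NIC is placed in its own container. */
static int
map_extmem_vfio(void *vaddr, uint64_t iova, size_t len)
{
	return rte_vfio_container_dma_map(RTE_VFIO_DEFAULT_CONTAINER_FD,
			(uint64_t)(uintptr_t)vaddr, iova, len);
}
```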
>
> For example, consider a host with Mellanox and Intel devices. Mapping
> memory without specifying which port it belongs to will end up with both
> IOMMU registration and Verbs (Mellanox DMA map) registration.
> Another example is two Mellanox devices on the same host. The memory
> will be mapped for both, even though the application will use a mempool
> per device.
>
> To use the suggested APIs the application will allocate a memory block
> and will call rte_eth_dma_map. It will map it to every port that needs
> DMA access to this memory.
This bit is unclear to me. What do you mean "map it to every port that
needs DMA access to this memory"? I don't see how this API solves the
above problem of mapping the same memory to all devices. How does a
device know which memory it will need? Does the user specifically have
to call this API for each and every NIC they're using?
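For what it's worth, my reading of the proposed flow is something like the following. The rte_eth_dma_map signature here is my guess from the RFC text, so treat this as a sketch rather than the actual API:

```c
/* Hypothetical usage of the proposed API: the application allocates an
 * external block, then must map it to *each* port that will DMA into it.
 * 'iova' and 'len' are assumed to be chosen by the application. */
void *addr = malloc(len);	/* external, non-DPDK memory */
uint16_t port_id;

RTE_ETH_FOREACH_DEV(port_id)
	rte_eth_dma_map(port_id, addr, iova, len); /* per-port registration */
```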
For DPDK-managed memory, everything will still get mapped to every
device automatically, correct? If so, then such a manual approach for
external memory will be bad for both usability and drop-in replacement
of internal-to-external memory, because it introduces inconsistency
between using internal and external memory. From my point of view,
either we do *everything* manually (i.e. register all memory for DMA
explicitly), which avoids this problem and keeps things consistent, or
we do *everything* automatically and deal with the duplication of
mappings somehow (say, by MLX/NXP drivers sharing their mappings through
the bus interface).
> Later on, the application could use this memory to populate a mempool or
> to attach mbufs with external buffers.
> When the memory should no longer be used by the device, the application
> will call rte_eth_dma_unmap on every port it registered it with.
>
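The full lifecycle described above would then look roughly like this. Again a sketch: the map/unmap signatures are my assumption from the RFC, while the mempool calls are the existing external-memory mempool API:

```c
/* Populate a mempool from externally mapped memory, then unmap on
 * teardown. 'addr', 'iova', 'len', 'num_elts', 'elt_size' and 'port_id'
 * are assumed to come from the application. */
struct rte_mempool *mp;

rte_eth_dma_map(port_id, addr, iova, len);		/* proposed API */
mp = rte_mempool_create_empty("ext_pool", num_elts, elt_size,
		0, 0, SOCKET_ID_ANY, 0);
rte_mempool_populate_iova(mp, addr, iova, len, NULL, NULL);
/* ... use the mempool on the datapath ... */
rte_eth_dma_unmap(port_id, addr, iova, len);		/* proposed API */
```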
> The drivers will implement the DMA map/unmap, and it is very likely they
> will make use of the existing VFIO mapping.
>
> Support for hotplug/unplug of devices is out of scope for this patch;
> however, it can be implemented in the same way it is done for VFIO.
>
> Cc: pawelx.wodkowski@intel.com
> Cc: anatoly.burakov@intel.com
> Cc: gowrishankar.m@linux.vnet.ibm.com
> Cc: ferruh.yigit@intel.com
> Cc: thomas@monjalon.net
> Cc: arybchenko@solarflare.com
> Cc: shreyansh.jain@nxp.com
>
> Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
>
> [1]
> commit 73a639085938 ("vfio: allow to map other memory regions")
> [2]
> http://mails.dpdk.org/archives/dev/2018-September/111978.html
> ---
--
Thanks,
Anatoly