From: "Medvedkin, Vladimir" <vladimir.medvedkin@intel.com>
To: Jerin Jacob <jerinjacobk@gmail.com>,
fengchengwen <fengchengwen@huawei.com>, <sburla@marvell.com>,
<anoobj@marvell.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>, <dev@dpdk.org>,
Kevin Laatz <kevin.laatz@intel.com>,
Bruce Richardson <bruce.richardson@intel.com>
Subject: Re: [PATCH v1 1/3] dmadev: add inter-domain operations
Date: Fri, 27 Oct 2023 14:46:28 +0100
Message-ID: <3504aaf1-b4e0-44f5-883a-5e4bc0197283@intel.com>
In-Reply-To: <CALBAE1PiCdA3nLUBo-XwRVGNY7AykxyjJwznmoNdpuDXh37=yg@mail.gmail.com>
Hi Satananda, Anoob, Chengwen, Jerin, all,

After a number of internal discussions, we have decided to postpone this
feature/patchset until the next release.

> [Satananda] Have you considered extending rte_dma_port_param and
> rte_dma_vchan_conf to represent interdomain memory transfer setup as a
> separate port type like RTE_DMA_PORT_INTER_DOMAIN?
> [Anoob] Can we move this 'src_handle' and 'dst_handle' registration to
> rte_dma_vchan_setup so that the 'src_handle' and 'dst_handle' can be
> configured in control path and the existing datapath APIs can work as is.
> [Jerin] Or move src_handle/dst_handle to vchan config

We've listened to the feedback on the implementation and have prototyped
a vchan-based interface. This has a number of advantages and
disadvantages, both in terms of API usage and in terms of our specific
driver.

Setting up inter-domain operations as separate vchans allows us to store
data inside the PMD and not duplicate any API paths, so having multiple
vchans addresses that problem. However, this also means that any new
vchan added while the PMD is active (such as attaching to a new process)
will have to be gated by start/stop. This is probably fine from an API
point of view, but a hassle for the user (previously, we could have just
started using the new inter-domain handle right away).
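
For illustration, attaching to a new peer process under the vchan model
would look roughly like the sketch below. The .src_handle and
.dst_handle fields are hypothetical placeholders for however the handles
end up being carried in rte_dma_vchan_conf; everything else is the
existing dmadev API.

#include <rte_dmadev.h>

/* Sketch: attach to a new peer process at runtime. Since vchans can only
 * be set up before the device is started, the attach has to be gated by
 * stop/start (and possibly another rte_dma_configure() if nb_vchans
 * grows, omitted here). */
static int
attach_new_peer(int16_t dev_id, uint16_t new_vchan, uint16_t peer_handle)
{
    struct rte_dma_vchan_conf conf = {
        .direction = RTE_DMA_DIR_MEM_TO_MEM,
        .nb_desc = 1024,
        /* hypothetical fields, not part of the current structure */
        .src_handle = 0,           /* 0 == local memory (assumption) */
        .dst_handle = peer_handle, /* obtained via driver-private API */
    };
    int ret;

    ret = rte_dma_stop(dev_id);
    if (ret < 0)
        return ret;
    ret = rte_dma_vchan_setup(dev_id, new_vchan, &conf);
    if (ret < 0)
        return ret;
    return rte_dma_start(dev_id);
}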

Another usability issue with the multiple-vchan approach is that each
vchan now has its own enqueue/submit/completion cycle, so any use case
relying on one thread communicating with many processes will have to
process each vchan separately, instead of everything going into one
vchan. Again, this looks fine API-wise, but it is a hassle for the user,
since it requires calling submit and completion for each vchan, and in
some cases it requires maintaining some kind of reordering queue. (On
the other hand, it would be much easier to separate operations intended
for different processes with this approach, so perhaps this is not such
a big issue.)
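
As a rough sketch of what that per-vchan cycle looks like from the
application's point of view (one vchan per peer process is an assumption
of this model; BURST is just an illustrative batch size):

#include <stdbool.h>
#include <rte_common.h>
#include <rte_dmadev.h>

#define BURST 32

/* Sketch: one thread servicing N peer processes, one vchan per peer.
 * Submit and completion have to be driven separately for each vchan. */
static void
service_peers(int16_t dev_id, uint16_t nb_peers)
{
    for (uint16_t vc = 0; vc < nb_peers; vc++) {
        /* ... rte_dma_copy() whatever is pending for this peer ... */
        rte_dma_submit(dev_id, vc);
    }
    for (uint16_t vc = 0; vc < nb_peers; vc++) {
        uint16_t last_idx;
        bool has_error = false;
        uint16_t done = rte_dma_completed(dev_id, vc, BURST,
                &last_idx, &has_error);
        /* completions come back per vchan, so results for different
         * peers may need to be merged/reordered by the application */
        RTE_SET_USED(done);
    }
}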
Finally, there is also an IDXD-specific issue. Currently, IDXD HW
acceleration is implemented in such a way that each work queue will have
a unique DMA device ID (rather than a unique vchan), and each device can
technically process requests for both local and remote memory (local to
remote, remote to local, remote to remote), all in one queue - as it was
in our original implementation.

By changing the implementation to use vchans, we're essentially
bifurcating this single queue: all vchans would have their own rings
etc., but the enqueue-to-hardware operation is still common to all
vchans, because there's a single underlying queue as far as the hardware
is concerned. The queue is atomic in hardware, and the ENQCMD
instruction returns a status on enqueue failure (such as when too many
requests are in flight), so in principle we could ignore the number of
in-flight operations and simply rely on ENQCMD returning failures to
handle error/retry. The problem with this is that the failure only
surfaces on submit, not on enqueue.
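
In other words, if we relied purely on the ENQCMD status, applications
would have to start treating rte_dma_submit() as something that can fail
and needs a retry path, roughly as sketched below (the exact error code
returned in that case is an assumption):

#include <rte_dmadev.h>

/* Sketch: what applications would need to do if submit could report an
 * ENQCMD rejection. Today most DPDK code ignores this return value. */
static int
submit_checked(int16_t dev_id, uint16_t vchan)
{
    int ret = rte_dma_submit(dev_id, vchan);
    if (ret < 0) {
        /* e.g. hardware queue full: the batch was not accepted, so the
         * caller must drain completions and submit again later */
        return ret;
    }
    return 0;
}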

So, in essence, with the IDXD driver we have two choices: either we
implement some kind of in-flight counter to prevent our driver from
submitting too many requests (that is, vchans will have to cooperate,
using atomics or similar), or every user will have to handle errors not
just on enqueue but also on submit (which I don't believe many people do
currently, even though technically submit can return failure: all
non-test usage in DPDK seems to assume submit realistically won't fail,
and I'd like to keep it that way).
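
The first option amounts to something like the sketch below inside the
driver (assumption: all vchans of one dmadev feed the same hardware work
queue; names and layout are illustrative only):

#include <errno.h>
#include <stdint.h>

/* Sketch of an in-flight counter shared by all vchans that feed a single
 * hardware work queue (IDXD-internal; illustrative names). */
struct idxd_shared_wq {
    uint32_t max_inflight;  /* hardware queue depth */
    uint32_t inflight;      /* shared between vchans, hence the atomics */
};

static inline int
wq_reserve(struct idxd_shared_wq *wq, uint32_t batch)
{
    uint32_t cur = __atomic_load_n(&wq->inflight, __ATOMIC_RELAXED);
    do {
        if (cur + batch > wq->max_inflight)
            return -ENOSPC; /* caller retries once completions free space */
    } while (!__atomic_compare_exchange_n(&wq->inflight, &cur, cur + batch,
            false, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED));
    return 0;
}

static inline void
wq_release(struct idxd_shared_wq *wq, uint32_t completed)
{
    __atomic_fetch_sub(&wq->inflight, completed, __ATOMIC_RELEASE);
}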

We're in the process of measuring the performance impact of the
different implementations. I should note that while atomic operations on
the data path are unfortunate, these atomics are realistically accessed
only at the beginning/end of every 'enqueue-submit-complete' cycle, not
on every operation. At first glance there is no observable performance
penalty in the regular use case (assuming we are not calling submit for
every enqueued job).

> [Satananda] Do you have usecases where a process from 3rd domain sets
> up transfer between memories from 2 domains? i.e process 1 is src,
> process 2 is dest and process 3 executes transfer.

This use case works with the proposed API on our hardware.
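
With the API from this patchset that is simply the following (a sketch:
how the two handles are obtained is driver-specific and out of scope
here, and the variable names are illustrative):

/* Sketch: process 3 drives the copy; src lives in process 1, dst in
 * process 2, and both handles came from the driver-private API. */
int ret = rte_dma_copy_inter_dom(dev_id, vchan,
        src_iova_in_proc1, dst_iova_in_proc2, length,
        proc1_handle, proc2_handle, RTE_DMA_OP_FLAG_SUBMIT);
if (ret < 0) {
    /* enqueue failed, e.g. the ring is full */
}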

> [Chengwen] And last, could you introduce the application scenarios of
> this feature?

We have used this feature to improve the performance of the memif driver.
On 09/10/2023 06:05, Jerin Jacob wrote:
> On Sun, Oct 8, 2023 at 8:03 AM fengchengwen <fengchengwen@huawei.com> wrote:
>> Hi Anatoly,
>>
>> On 2023/8/12 0:14, Anatoly Burakov wrote:
>>> Add a flag to indicate that a specific device supports inter-domain
>>> operations, and add an API for inter-domain copy and fill.
>>>
>>> Inter-domain operation is an operation that is very similar to regular
>>> DMA operation, except either source or destination addresses can be in a
>>> different process's address space, indicated by source and destination
>>> handle values. These values are currently meant to be provided by
>>> private drivers' API's.
>>>
>>> This commit also adds a controller ID field into the DMA device API.
>>> This is an arbitrary value that may not be implemented by hardware, but
>>> it is meant to represent some kind of device hierarchy.
>>>
>>> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
>>> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> ---
>> ...
>>
>>> +__rte_experimental
>>> +static inline int
>>> +rte_dma_copy_inter_dom(int16_t dev_id, uint16_t vchan, rte_iova_t src,
>>> + rte_iova_t dst, uint32_t length, uint16_t src_handle,
>>> + uint16_t dst_handle, uint64_t flags)
>> I would suggest add more general extension:
>> rte_dma_copy*(int16_t dev_id, uint16_t vchan, rte_iova_t src, rte_iova_t dst,
>> uint32_t length, uint64_t flags, void *param)
>> The param only valid under some flags bits.
>> As for this inter-domain extension: we could define inter-domain param struct.
>>
>>
>> Whether add in current rte_dma_copy() API or add one new API, I think it mainly
>> depend on performance impact of parameter transfer. Suggest more discuss for
>> different platform and call specification.
> Or move src_handle/dst_handle to vchan config to enable better performance.
> Application create N number of vchan based on the requirements.
>
>>
>> And last, Could you introduce the application scenarios of this feature?
> Looks like VM to VM or container to container copy.
>
>>
>> Thanks.
>>
--
Regards,
Vladimir