From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Hu, Jiayu" <jiayu.hu@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Ye, Xiaolong" <xiaolong.ye@intel.com>,
"Wang, Zhihong" <zhihong.wang@intel.com>
Subject: Re: [dpdk-dev] [PATCH 0/4] Support DMA-accelerated Tx operations for vhost-user PMD
Date: Thu, 26 Mar 2020 09:47:53 +0100
Message-ID: <198ba410-446e-d53c-cc60-5887c29fa0e3@redhat.com>
In-Reply-To: <1007246585054daca2afce895ac9f875@intel.com>
On 3/26/20 9:25 AM, Hu, Jiayu wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, March 26, 2020 3:53 PM
>> To: Hu, Jiayu <jiayu.hu@intel.com>; dev@dpdk.org
>> Cc: Ye, Xiaolong <xiaolong.ye@intel.com>; Wang, Zhihong
>> <zhihong.wang@intel.com>
>> Subject: Re: [PATCH 0/4] Support DMA-accelerated Tx operations for vhost-user PMD
>>
>> Hi Jiayu,
>>
>> On 3/19/20 12:47 PM, Hu, Jiayu wrote:
>>
>>>>
>>>> Ok, so what about:
>>>>
>>>> Introducing a pair of callbacks in struct virtio_net for DMA enqueue and
>>>> dequeue.
>>>>
>>>> lib/librte_vhost/ioat.c which would implement dma_enqueue and
>>>> dma_dequeue callback for IOAT. As it will live in the vhost lib
>>>> directory, it will be easy to refactor the code to share as much as
>>>> possible and so avoid code duplication.
>>>>
>>>> In rte_vhost_enqueue/dequeue_burst, if the dma callback is set, then
>>>> call it instead of the SW datapath. It adds a few cycles, but this is
>>>> much more sane IMHO.
>>>
>>> The problem is that the current semantics of the rte_vhost_enqueue/dequeue
>>> API conflict with the I/OAT accelerated data path. To improve performance,
>>> the I/OAT works in an asynchronous manner, where the CPU just submits
>>> copy jobs to the I/OAT without waiting for copy completion. For
>>> rte_vhost_enqueue_burst, users cannot reuse enqueued pktmbufs when it
>>> returns, as the I/OAT may still use them. For rte_vhost_dequeue_burst,
>>> users will not get incoming packets as the I/OAT is still performing packet
>>> copies. As you can see, when enabling I/OAT acceleration, the semantics of
>>> the two APIs change. If we keep the same API names but change their
>>> semantics, this may confuse users, IMHO.
>>
>> Ok, so it is basically the same as zero-copy for dequeue path, right?
>> If a new API is necessary, then it would be better to add it in Vhost
>> library for async enqueue/dequeue.
>> It could be used also for Tx zero-copy, and so the sync version would
>> save some cycles as we could remove the zero-copy support there.
>>
>> What do you think?
>
> Yes, you are right. The better way is to provide a new API with asynchronous
> semantics in the vhost library. In addition, the vhost library should provide DMA
> operation callbacks to avoid using vendor-specific APIs. The asynchronous API may
> look like rte_vhost_try_enqueue_burst() and rte_vhost_get_completed_packets().
> The first one performs the enqueue logic, and the second one returns to users
> the pktmbufs whose copies have all completed. What do you think?

That looks good to me, great!

The only thing is the naming of the API. I need to think more about it,
but that does not prevent us from starting work on the implementation.

Regarding the initialization, I was thinking we could introduce new
flags to rte_vhost_driver_register():

 - RTE_VHOST_USER_TX_DMA
 - RTE_VHOST_USER_RX_DMA

Well, only Tx can be implemented for now, but the Rx flag can be
reserved.
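
To make the flag idea concrete, here is a rough sketch. The bit
positions and the helper names (vhost_check_dma_flags,
vhost_tx_dma_requested) are made up for illustration and are not part
of rte_vhost.h; the only point is that Tx DMA is opt-in at registration
time and the reserved Rx flag is rejected until it is implemented:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Proposed flag names from this thread. The bit positions below are
 * arbitrary placeholders (assumption), not values from rte_vhost.h.
 */
#define RTE_VHOST_USER_TX_DMA (1ULL << 62)
#define RTE_VHOST_USER_RX_DMA (1ULL << 63)

static_assert((RTE_VHOST_USER_TX_DMA & RTE_VHOST_USER_RX_DMA) == 0,
	      "Tx and Rx DMA flags must not share a bit");

/*
 * Hypothetical helper: validate the DMA-related bits of the flags
 * passed at registration time. Rx DMA is only reserved for now, so
 * requesting it is reported as unsupported.
 */
static int
vhost_check_dma_flags(uint64_t flags)
{
	if (flags & RTE_VHOST_USER_RX_DMA)
		return -ENOTSUP;
	return 0;
}

/* Hypothetical helper: returns 1 if the Tx DMA path was requested. */
static int
vhost_tx_dma_requested(uint64_t flags)
{
	return (flags & RTE_VHOST_USER_TX_DMA) != 0;
}
```

An application would OR RTE_VHOST_USER_TX_DMA into the flags it already
passes to rte_vhost_driver_register(), and the library would run a check
like the one above before setting up the socket.
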

The thing I'm not clear on is how we fall back to the sync API when no
DMA is available.

Should the user still call rte_vhost_try_enqueue_burst(), which, if there
is no DMA, calls rte_vhost_enqueue_burst() directly, so that
rte_vhost_get_completed_packets() then returns all the mbufs?

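
One possible shape for that fallback, as a self-contained mock rather
than real vhost code. Everything here is an assumption for the sake of
discussion: struct pkt stands in for struct rte_mbuf, struct async_vq
and COMPLETED_LIST_SIZE are invented, the sketch_* functions only mimic
the two proposed calls, and the DMA branch is left unmodelled:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COMPLETED_LIST_SIZE 64 /* hypothetical capacity */

struct pkt { int id; }; /* stand-in for struct rte_mbuf */

/* Hypothetical per-queue state for the async API. */
struct async_vq {
	int dma_available;
	struct pkt *completed[COMPLETED_LIST_SIZE];
	uint16_t nb_completed;
};

/* Stand-in for the sync rte_vhost_enqueue_burst(): copies everything. */
static uint16_t
sync_enqueue_burst(struct pkt **pkts, uint16_t count)
{
	(void)pkts;
	return count;
}

/* Mimics the proposed rte_vhost_try_enqueue_burst(). */
static uint16_t
sketch_try_enqueue_burst(struct async_vq *vq, struct pkt **pkts,
			 uint16_t count)
{
	uint16_t room, n, i;

	assert(vq != NULL && pkts != NULL);
	if (vq->dma_available)
		return 0; /* DMA submission is not modelled in this sketch */

	/* No DMA: copy synchronously, so packets complete immediately. */
	room = COMPLETED_LIST_SIZE - vq->nb_completed;
	if (count > room)
		count = room;
	n = sync_enqueue_burst(pkts, count);
	for (i = 0; i < n; i++)
		vq->completed[vq->nb_completed++] = pkts[i];
	return n;
}

/* Mimics the proposed rte_vhost_get_completed_packets(). */
static uint16_t
sketch_get_completed_packets(struct async_vq *vq, struct pkt **pkts,
			     uint16_t count)
{
	uint16_t n = vq->nb_completed < count ? vq->nb_completed : count;
	uint16_t i;

	for (i = 0; i < n; i++)
		pkts[i] = vq->completed[i];
	/* Shift the packets that were not returned to the front. */
	for (i = n; i < vq->nb_completed; i++)
		vq->completed[i - n] = vq->completed[i];
	vq->nb_completed -= n;
	return n;
}
```

With this shape the application code is identical with or without DMA:
it always calls the try-enqueue function and later reclaims mbufs from
the completion call, at the cost of some bookkeeping in the sync case.
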
Thanks,
Maxime
> Thanks,
> Jiayu
>
>>
>> I really object to implementing vring handling in the Vhost PMD; this is
>> the role of the Vhost library.
>>
>> Thanks,
>> Maxime
>