From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Xiao W" <xiao.w.wang@intel.com>,
"Bie, Tiwei" <tiwei.bie@intel.com>
Cc: "alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"Wang, Zhihong" <zhihong.wang@intel.com>,
"Ye, Xiaolong" <xiaolong.ye@intel.com>
Subject: Re: [dpdk-dev] [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
Date: Mon, 17 Dec 2018 12:08:44 +0100
Message-ID: <fb9d77cf-0001-3b9c-bd17-c0dccbfe9733@redhat.com>
In-Reply-To: <B7F2E978279D1D49A3034B7786DACF407A4BAD6C@SHSMSX101.ccr.corp.intel.com>
On 12/17/18 10:12 AM, Wang, Xiao W wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Sunday, December 16, 2018 1:35 AM
>> To: Wang, Xiao W <xiao.w.wang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>
>> Cc: alejandro.lucero@netronome.com; dev@dpdk.org; Wang, Zhihong
>> <zhihong.wang@intel.com>; Ye, Xiaolong <xiaolong.ye@intel.com>
>> Subject: Re: [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
>>
>>
>>
>> On 12/14/18 10:16 PM, Xiao Wang wrote:
>>> In SW assisted live migration mode, driver will stop the device and
>>> setup a mediate virtio ring to relay the communication between the
>>> virtio driver and the VDPA device.
>>>
>>> This data path intervention will allow SW to help on guest dirty page
>>> logging for live migration.
>>>
>>> This SW fallback is event driven relay thread, so when the network
>>> throughput is low, this SW fallback will take little CPU resource, but
>>> when the throughput goes up, the relay thread's CPU usage will goes up
>>> accordinly.
>>
>> s/accordinly/accordingly/
>>
>
> Will fix it in next version.
>
>>>
>>> User needs to take all the factors including CPU usage, guest perf
>>> degradation, etc. into consideration when selecting the live migration
>>> support mode.
>>>
>>> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
>>> ---
>>> drivers/net/ifc/base/ifcvf.h | 1 +
>>> drivers/net/ifc/ifcvf_vdpa.c | 346 ++++++++++++++++++++++++++++++++++++++++++-
>>> 2 files changed, 344 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
>>> index c15c69107..e8a30d2c6 100644
>>> --- a/drivers/net/ifc/base/ifcvf.h
>>> +++ b/drivers/net/ifc/base/ifcvf.h
>>> @@ -50,6 +50,7 @@
>>> #define IFCVF_LM_ENABLE_VF 0x1
>>> #define IFCVF_LM_ENABLE_PF 0x3
>>> #define IFCVF_LOG_BASE 0x100000000000
>>> +#define IFCVF_MEDIATE_VRING 0x200000000000
>>
>> MEDIATED?
>
> "mediate" is used as adjective here.
>
>>
>>>
>>> #define IFCVF_32_BIT_MASK 0xffffffff
>>>
>>> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
>>> index f181c5a6e..61757d0b4 100644
>>> --- a/drivers/net/ifc/ifcvf_vdpa.c
>>> +++ b/drivers/net/ifc/ifcvf_vdpa.c
>>> @@ -63,6 +63,9 @@ struct ifcvf_internal {
>>> rte_atomic32_t running;
>>> rte_spinlock_t lock;
>>> bool sw_lm;
>
> [...]
>
>>> +static void *
>>> +vring_relay(void *arg)
>>> +{
>>> + int i, vid, epfd, fd, nfds;
>>> + struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
>>> + struct rte_vhost_vring vring;
>>> + struct rte_intr_handle *intr_handle;
>>> + uint16_t qid, q_num;
>>> + struct epoll_event events[IFCVF_MAX_QUEUES * 4];
>>> + struct epoll_event ev;
>>> + int nbytes;
>>> + uint64_t buf;
>>> +
>>> + vid = internal->vid;
>>> + q_num = rte_vhost_get_vring_num(vid);
>>> + /* prepare the mediate vring */
>>> + for (qid = 0; qid < q_num; qid++) {
>>> + rte_vhost_get_vring_base(vid, qid,
>>> + &internal->m_vring[qid].avail->idx,
>>> + &internal->m_vring[qid].used->idx);
>>> + rte_vdpa_relay_vring_avail(vid, qid, &internal->m_vring[qid]);
>>> + }
>>> +
>>> + /* add notify fd and interrupt fd to epoll */
>>> + epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
>>> + if (epfd < 0) {
>>> + DRV_LOG(ERR, "failed to create epoll instance.");
>>> + return NULL;
>>> + }
>>> + internal->epfd = epfd;
>>> +
>>> + for (qid = 0; qid < q_num; qid++) {
>>> + ev.events = EPOLLIN | EPOLLPRI;
>>> + rte_vhost_get_vhost_vring(vid, qid, &vring);
>>> + ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
>>> + if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
>>> + DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
>>> + return NULL;
>>> + }
>>> + }
>>> +
>>> + intr_handle = &internal->pdev->intr_handle;
>>> + for (qid = 0; qid < q_num; qid++) {
>>> + ev.events = EPOLLIN | EPOLLPRI;
>>> + ev.data.u64 = 1 | qid << 1 |
>>> + (uint64_t)intr_handle->efds[qid] << 32;
>>> + if (epoll_ctl(epfd, EPOLL_CTL_ADD, intr_handle->efds[qid], &ev)
>>> + < 0) {
>>> + DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
>>> + return NULL;
>>> + }
>>> + }
>>> +
>>> + /* start relay with a first kick */
>>> + for (qid = 0; qid < q_num; qid++)
>>> + ifcvf_notify_queue(&internal->hw, qid);
>>> +
>>> + /* listen to the events and react accordingly */
>>> + for (;;) {
>>> + nfds = epoll_wait(epfd, events, q_num * 2, -1);
>>> + if (nfds < 0) {
>>> + if (errno == EINTR)
>>> + continue;
>>> + DRV_LOG(ERR, "epoll_wait return fail\n");
>>> + return NULL;
>>> + }
>>> +
>>> + for (i = 0; i < nfds; i++) {
>>> + fd = (uint32_t)(events[i].data.u64 >> 32);
>>> + do {
>>> + nbytes = read(fd, &buf, 8);
>>> + if (nbytes < 0) {
>>> + if (errno == EINTR ||
>>> + errno == EWOULDBLOCK ||
>>> + errno == EAGAIN)
>>> + continue;
>>> + DRV_LOG(INFO, "Error reading "
>>> + "kickfd: %s",
>>> + strerror(errno));
>>> + }
>>> + break;
>>> + } while (1);
>>> +
>>> + qid = events[i].data.u32 >> 1;
>>> +
>>> + if (events[i].data.u32 & 1)
>>> + update_used_ring(internal, qid);
>>> + else
>>> + update_avail_ring(internal, qid);
>>> + }
>>> + }
>>> +
>>> + return NULL;
>>> +}
>>> +
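
For readers following the relay loop above: each epoll registration carries
a 64-bit tag with the fd in the upper 32 bits, the queue id shifted left by
one, and bit 0 set for a device interrupt fd (clear for a guest kick fd).
Below is a minimal sketch of that packing and unpacking. The relay_tag_*
helper names are hypothetical and not part of the patch. The sketch decodes
from the full u64 with explicit masks, whereas the quoted code reads
data.u32, which aliases the low 32 bits only on little-endian hosts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not in the patch) mirroring the tag layout used
 * for epoll_event.data.u64 in vring_relay():
 *   bits 63..32  file descriptor
 *   bits 31..1   queue id
 *   bit  0       1 = device interrupt fd, 0 = guest kick fd
 */
static inline uint64_t
relay_tag_pack(int fd, uint16_t qid, int is_intr)
{
	return ((uint64_t)(uint32_t)fd << 32) |
	       ((uint64_t)qid << 1) |
	       (is_intr ? 1u : 0u);
}

/* Recover the fd from the upper 32 bits of the tag. */
static inline int
relay_tag_fd(uint64_t tag)
{
	return (int)(uint32_t)(tag >> 32);
}

/* Recover the queue id from bits 31..1 of the tag. */
static inline uint16_t
relay_tag_qid(uint64_t tag)
{
	return (uint16_t)((tag & 0xffffffffu) >> 1);
}

/* Nonzero when the event came from a device interrupt fd. */
static inline int
relay_tag_is_intr(uint64_t tag)
{
	return (int)(tag & 1u);
}
```

With helpers like these, the event loop could call relay_tag_fd() on
events[i].data.u64 to find the fd to drain, then pick update_used_ring()
or update_avail_ring() based on relay_tag_is_intr().
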
>>> +static int
>>> +setup_vring_relay(struct ifcvf_internal *internal)
>>> +{
>>> + int ret;
>>> +
>>> + ret = pthread_create(&internal->tid, NULL, vring_relay,
>>> + (void *)internal);
>>
>> So it will be scheduled without any affinity?
>> Shouldn't it use a pmd thread instead?
>
> The new thread will inherit the thread affinity from its parent thread. As you
> know, vdpa is trying to minimize CPU usage for virtio HW acceleration, and we
> assign just one core to vdpa daemon (doc/guides/sample_app_ug/vdpa.rst), so
> there's no dedicated pmd worker core.

OK, thanks for the clarification!

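
For reference, the inheritance described above can be checked with a small
standalone program. This is only a sketch, assuming Linux with glibc, where
a thread created with default (NULL) attributes gets a copy of the creating
thread's CPU affinity mask. The function and struct names are hypothetical
and unrelated to the ifcvf driver.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Result slot filled in by the child thread. */
struct affinity_probe {
	cpu_set_t set;	/* CPU set the child thread runs with */
	int rc;		/* return code of pthread_getaffinity_np() */
};

/* Thread body: record the affinity mask of the newly created thread. */
static void *
record_affinity(void *arg)
{
	struct affinity_probe *probe = arg;

	CPU_ZERO(&probe->set);
	probe->rc = pthread_getaffinity_np(pthread_self(),
					   sizeof(probe->set), &probe->set);
	return NULL;
}

/*
 * Returns 1 if a thread created with NULL attributes has the same CPU
 * affinity mask as its creator, 0 if the masks differ, -1 on error.
 */
int
child_inherits_affinity(void)
{
	struct affinity_probe probe;
	cpu_set_t parent;
	pthread_t tid;

	CPU_ZERO(&parent);
	if (pthread_getaffinity_np(pthread_self(), sizeof(parent),
				   &parent) != 0)
		return -1;
	if (pthread_create(&tid, NULL, record_affinity, &probe) != 0)
		return -1;
	if (pthread_join(tid, NULL) != 0)
		return -1;
	if (probe.rc != 0)
		return -1;
	return CPU_EQUAL(&parent, &probe.set) ? 1 : 0;
}
```

So if the vdpa daemon is pinned to a single core, a relay thread started
with plain pthread_create() would be expected to stay on that same core.
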
> Thanks,
> Xiao
>