DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Xiao W" <xiao.w.wang@intel.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>
Cc: "alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Ye, Xiaolong" <xiaolong.ye@intel.com>
Subject: Re: [dpdk-dev] [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
Date: Mon, 17 Dec 2018 12:08:44 +0100
Message-ID: <fb9d77cf-0001-3b9c-bd17-c0dccbfe9733@redhat.com>
In-Reply-To: <B7F2E978279D1D49A3034B7786DACF407A4BAD6C@SHSMSX101.ccr.corp.intel.com>



On 12/17/18 10:12 AM, Wang, Xiao W wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
>> Sent: Sunday, December 16, 2018 1:35 AM
>> To: Wang, Xiao W <xiao.w.wang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>
>> Cc: alejandro.lucero@netronome.com; dev@dpdk.org; Wang, Zhihong
>> <zhihong.wang@intel.com>; Ye, Xiaolong <xiaolong.ye@intel.com>
>> Subject: Re: [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
>>
>>
>>
>> On 12/14/18 10:16 PM, Xiao Wang wrote:
>>> In SW assisted live migration mode, driver will stop the device and
>>> setup a mediate virtio ring to relay the communication between the
>>> virtio driver and the VDPA device.
>>>
>>> This data path intervention will allow SW to help on guest dirty page
>>> logging for live migration.
>>>
>>> This SW fallback is event driven relay thread, so when the network
>>> throughput is low, this SW fallback will take little CPU resource, but
>>> when the throughput goes up, the relay thread's CPU usage will goes up
>>> accordinly.
>>
>> s/accordinly/accordingly/
>>
> 
> Will fix it in next version.
> 
>>>
>>> User needs to take all the factors including CPU usage, guest perf
>>> degradation, etc. into consideration when selecting the live migration
>>> support mode.
>>>
>>> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
>>> ---
>>>    drivers/net/ifc/base/ifcvf.h |   1 +
>>>    drivers/net/ifc/ifcvf_vdpa.c | 346 ++++++++++++++++++++++++++++++++++++++++++-
>>>    2 files changed, 344 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
>>> index c15c69107..e8a30d2c6 100644
>>> --- a/drivers/net/ifc/base/ifcvf.h
>>> +++ b/drivers/net/ifc/base/ifcvf.h
>>> @@ -50,6 +50,7 @@
>>>    #define IFCVF_LM_ENABLE_VF		0x1
>>>    #define IFCVF_LM_ENABLE_PF		0x3
>>>    #define IFCVF_LOG_BASE			0x100000000000
>>> +#define IFCVF_MEDIATE_VRING		0x200000000000
>>
>> MEDIATED?
> 
> "mediate" is used as adjective here.
> 
>>
>>>
>>>    #define IFCVF_32_BIT_MASK		0xffffffff
>>>
>>> diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
>>> index f181c5a6e..61757d0b4 100644
>>> --- a/drivers/net/ifc/ifcvf_vdpa.c
>>> +++ b/drivers/net/ifc/ifcvf_vdpa.c
>>> @@ -63,6 +63,9 @@ struct ifcvf_internal {
>>>    	rte_atomic32_t running;
>>>    	rte_spinlock_t lock;
>>>    	bool sw_lm;
> 
> [...]
> 
>>> +static void *
>>> +vring_relay(void *arg)
>>> +{
>>> +	int i, vid, epfd, fd, nfds;
>>> +	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
>>> +	struct rte_vhost_vring vring;
>>> +	struct rte_intr_handle *intr_handle;
>>> +	uint16_t qid, q_num;
>>> +	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
>>> +	struct epoll_event ev;
>>> +	int nbytes;
>>> +	uint64_t buf;
>>> +
>>> +	vid = internal->vid;
>>> +	q_num = rte_vhost_get_vring_num(vid);
>>> +	/* prepare the mediate vring */
>>> +	for (qid = 0; qid < q_num; qid++) {
>>> +		rte_vhost_get_vring_base(vid, qid,
>>> +				&internal->m_vring[qid].avail->idx,
>>> +				&internal->m_vring[qid].used->idx);
>>> +		rte_vdpa_relay_vring_avail(vid, qid, &internal->m_vring[qid]);
>>> +	}
>>> +
>>> +	/* add notify fd and interrupt fd to epoll */
>>> +	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
>>> +	if (epfd < 0) {
>>> +		DRV_LOG(ERR, "failed to create epoll instance.");
>>> +		return NULL;
>>> +	}
>>> +	internal->epfd = epfd;
>>> +
>>> +	for (qid = 0; qid < q_num; qid++) {
>>> +		ev.events = EPOLLIN | EPOLLPRI;
>>> +		rte_vhost_get_vhost_vring(vid, qid, &vring);
>>> +		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
>>> +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
>>> +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
>>> +			return NULL;
>>> +		}
>>> +	}
>>> +
>>> +	intr_handle = &internal->pdev->intr_handle;
>>> +	for (qid = 0; qid < q_num; qid++) {
>>> +		ev.events = EPOLLIN | EPOLLPRI;
>>> +		ev.data.u64 = 1 | qid << 1 |
>>> +			(uint64_t)intr_handle->efds[qid] << 32;
>>> +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, intr_handle->efds[qid], &ev)
>>> +				< 0) {
>>> +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
>>> +			return NULL;
>>> +		}
>>> +	}
>>> +
>>> +	/* start relay with a first kick */
>>> +	for (qid = 0; qid < q_num; qid++)
>>> +		ifcvf_notify_queue(&internal->hw, qid);
>>> +
>>> +	/* listen to the events and react accordingly */
>>> +	for (;;) {
>>> +		nfds = epoll_wait(epfd, events, q_num * 2, -1);
>>> +		if (nfds < 0) {
>>> +			if (errno == EINTR)
>>> +				continue;
>>> +			DRV_LOG(ERR, "epoll_wait return fail\n");
>>> +			return NULL;
>>> +		}
>>> +
>>> +		for (i = 0; i < nfds; i++) {
>>> +			fd = (uint32_t)(events[i].data.u64 >> 32);
>>> +			do {
>>> +				nbytes = read(fd, &buf, 8);
>>> +				if (nbytes < 0) {
>>> +					if (errno == EINTR ||
>>> +					    errno == EWOULDBLOCK ||
>>> +					    errno == EAGAIN)
>>> +						continue;
>>> +					DRV_LOG(INFO, "Error reading "
>>> +						"kickfd: %s",
>>> +						strerror(errno));
>>> +				}
>>> +				break;
>>> +			} while (1);
>>> +
>>> +			qid = events[i].data.u32 >> 1;
>>> +
>>> +			if (events[i].data.u32 & 1)
>>> +				update_used_ring(internal, qid);
>>> +			else
>>> +				update_avail_ring(internal, qid);
>>> +		}
>>> +	}
>>> +
>>> +	return NULL;
>>> +}
>>> +
>>> +static int
>>> +setup_vring_relay(struct ifcvf_internal *internal)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = pthread_create(&internal->tid, NULL, vring_relay,
>>> +			(void *)internal);
>>
>> So it will be scheduled without any affinity?
>> Shouldn't it use a pmd thread instead?
> 
> The new thread will inherit the thread affinity from its parent thread.
> As you know, vdpa is trying to minimize CPU usage for virtio HW
> acceleration, and we assign just one core to the vdpa daemon
> (doc/guides/sample_app_ug/vdpa.rst), so there's no dedicated pmd worker
> core.

OK, thanks for the clarification!
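
For readers who want to check the affinity point themselves: on Linux with
glibc, a thread started with pthread_create() begins with a copy of its
creator's CPU affinity mask. The sketch below is not part of the patch. The
names record_affinity and child_inherits_affinity are made up for
illustration, and the code assumes Linux/glibc, since the _np calls are
non-portable.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Thread body: store the CPU mask the new thread starts out with. */
static void *
record_affinity(void *arg)
{
	cpu_set_t *out = arg;

	assert(out != NULL);
	CPU_ZERO(out);
	/* On failure the mask stays empty and the comparison below fails. */
	pthread_getaffinity_np(pthread_self(), sizeof(*out), out);
	return NULL;
}

/*
 * Hypothetical helper: returns 1 if a freshly created thread has the same
 * affinity mask as its creator, 0 if the masks differ, -1 on error.
 */
int
child_inherits_affinity(void)
{
	cpu_set_t parent, child;
	pthread_t tid;

	if (pthread_getaffinity_np(pthread_self(), sizeof(parent), &parent) != 0)
		return -1;
	if (pthread_create(&tid, NULL, record_affinity, &child) != 0)
		return -1;
	if (pthread_join(tid, NULL) != 0)
		return -1;

	return CPU_EQUAL(&parent, &child) ? 1 : 0;
}
```

Calling child_inherits_affinity() from a process pinned to a single core
(for example one started under taskset) should return 1, which matches the
explanation above: the relay thread stays on the core given to the vdpa
daemon unless the code explicitly changes its affinity.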

> Thanks,
> Xiao
> 
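
As a side note on the relay loop quoted above: each epoll entry carries a
64-bit tag that holds the fd in the upper 32 bits, the queue id shifted left
by one, and a low bit that separates device interrupts (1) from guest kicks
(0). The following is a minimal standalone sketch of that pattern using a
plain eventfd, not the driver code. The names pack_tag, relay_event and
relay_roundtrip are hypothetical, and the decode step masks the u64 value
explicitly instead of reading data.u32 as the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>

/*
 * Pack fd, queue id and direction into one epoll tag:
 * bits 32..63 = fd, bits 1..31 = qid, bit 0 = 1 for interrupt, 0 for kick.
 */
static uint64_t
pack_tag(int fd, uint16_t qid, int is_intr)
{
	assert(fd >= 0);
	return ((uint64_t)(uint32_t)fd << 32) | ((uint64_t)qid << 1) |
		(is_intr ? 1u : 0u);
}

/* Decoded view of one event (illustrative type, not from the patch). */
struct relay_event {
	int fd;
	uint16_t qid;
	int is_intr;
	uint64_t count;	/* eventfd counter value that was drained */
};

/*
 * Register an eventfd with a packed tag, signal it once, wait for it with
 * epoll and decode the tag again. Returns 0 on success, -1 on failure.
 */
int
relay_roundtrip(uint16_t qid, int is_intr, struct relay_event *out)
{
	struct epoll_event ev, got;
	uint64_t one = 1;
	int epfd, efd, ret = -1;

	epfd = epoll_create1(0);
	efd = eventfd(0, EFD_NONBLOCK);
	if (epfd < 0 || efd < 0)
		goto done;

	ev.events = EPOLLIN;
	ev.data.u64 = pack_tag(efd, qid, is_intr);
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto done;

	/* Stand-in for a guest kick or a device interrupt. */
	if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		goto done;
	if (epoll_wait(epfd, &got, 1, 1000) != 1)
		goto done;

	out->fd = (int)(got.data.u64 >> 32);
	out->qid = (uint16_t)((uint32_t)got.data.u64 >> 1);
	out->is_intr = (int)(got.data.u64 & 1);

	/* Drain the counter so the fd stops reporting as readable. */
	if (read(out->fd, &out->count, sizeof(out->count)) !=
			(ssize_t)sizeof(out->count))
		goto done;
	ret = 0;
done:
	if (efd >= 0)
		close(efd);
	if (epfd >= 0)
		close(epfd);
	return ret;
}
```

Because the thread blocks in epoll_wait() until an eventfd is signalled, it
uses CPU time only when there are kicks or interrupts to relay, which is
the event-driven behaviour described in the commit message. Decoding from
the u64 value avoids depending on how the u32 and u64 members of the epoll
data union overlap on a given byte order.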
