DPDK patches and discussions
From: "Wang, Xiao W" <xiao.w.wang@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"Bie, Tiwei" <tiwei.bie@intel.com>
Cc: "alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Wang, Zhihong" <zhihong.wang@intel.com>,
	"Ye, Xiaolong" <xiaolong.ye@intel.com>
Subject: Re: [dpdk-dev] [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
Date: Mon, 17 Dec 2018 09:12:57 +0000
Message-ID: <B7F2E978279D1D49A3034B7786DACF407A4BAD6C@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <fee047d7-8314-e072-0f98-69faa4a5bd48@redhat.com>

Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> Sent: Sunday, December 16, 2018 1:35 AM
> To: Wang, Xiao W <xiao.w.wang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>
> Cc: alejandro.lucero@netronome.com; dev@dpdk.org; Wang, Zhihong
> <zhihong.wang@intel.com>; Ye, Xiaolong <xiaolong.ye@intel.com>
> Subject: Re: [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration
> 
> 
> 
> On 12/14/18 10:16 PM, Xiao Wang wrote:
> > In SW assisted live migration mode, driver will stop the device and
> > setup a mediate virtio ring to relay the communication between the
> > virtio driver and the VDPA device.
> >
> > This data path intervention will allow SW to help on guest dirty page
> > logging for live migration.
> >
> > This SW fallback is event driven relay thread, so when the network
> > throughput is low, this SW fallback will take little CPU resource, but
> > when the throughput goes up, the relay thread's CPU usage will goes up
> > accordinly.
> 
> s/accordinly/accordingly/
> 

Will fix it in the next version.
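
To illustrate the "event driven" point in the commit message: a thread
blocked in epoll_wait() on an eventfd uses no CPU until something is
written to it, and several kicks that arrive before the wakeup collapse
into a single read of the counter. Below is a minimal, Linux-only sketch
of that behaviour. It is not code from the patch; wait_and_drain() and
relay_demo() are hypothetical names made up for this illustration.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/*
 * Hypothetical helper, not part of the patch: block on an epoll
 * instance until one registered eventfd fires, then drain its counter.
 * Returns the counter value, or 0 on timeout or error.
 */
static uint64_t
wait_and_drain(int epfd, int timeout_ms)
{
	struct epoll_event ev;
	uint64_t cnt = 0;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) <= 0)
		return 0;
	if (read(ev.data.fd, &cnt, sizeof(cnt)) != (ssize_t)sizeof(cnt))
		return 0;
	return cnt;
}

/*
 * Signal an eventfd `kicks` times, then wake up once and drain it.
 * Returns the drained counter, or 0 on failure or when nothing fired.
 */
static uint64_t
relay_demo(unsigned int kicks)
{
	struct epoll_event ev = { .events = EPOLLIN };
	uint64_t one = 1, cnt = 0;
	unsigned int i;
	int efd, epfd;

	efd = eventfd(0, EFD_NONBLOCK);
	if (efd < 0)
		return 0;
	epfd = epoll_create1(0);
	if (epfd < 0) {
		close(efd);
		return 0;
	}
	ev.data.fd = efd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0) {
		for (i = 0; i < kicks; i++)
			if (write(efd, &one, sizeof(one)) !=
					(ssize_t)sizeof(one))
				break;
		cnt = wait_and_drain(epfd, 100);
	}
	close(epfd);
	close(efd);
	return cnt;
}
```

With no kicks pending, relay_demo() just sleeps in epoll_wait() until
the 100 ms timeout, which is the low-throughput case; the cost only
grows with the number of wakeups.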

> >
> > User needs to take all the factors including CPU usage, guest perf
> > degradation, etc. into consideration when selecting the live migration
> > support mode.
> >
> > Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
> > ---
> >   drivers/net/ifc/base/ifcvf.h |   1 +
> >   drivers/net/ifc/ifcvf_vdpa.c | 346 ++++++++++++++++++++++++++++++++++++++++++-
> >   2 files changed, 344 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
> > index c15c69107..e8a30d2c6 100644
> > --- a/drivers/net/ifc/base/ifcvf.h
> > +++ b/drivers/net/ifc/base/ifcvf.h
> > @@ -50,6 +50,7 @@
> >   #define IFCVF_LM_ENABLE_VF		0x1
> >   #define IFCVF_LM_ENABLE_PF		0x3
> >   #define IFCVF_LOG_BASE			0x100000000000
> > +#define IFCVF_MEDIATE_VRING		0x200000000000
> 
> MEDIATED?

"mediate" is used as an adjective here.

> 
> >
> >   #define IFCVF_32_BIT_MASK		0xffffffff
> >
> > diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
> > index f181c5a6e..61757d0b4 100644
> > --- a/drivers/net/ifc/ifcvf_vdpa.c
> > +++ b/drivers/net/ifc/ifcvf_vdpa.c
> > @@ -63,6 +63,9 @@ struct ifcvf_internal {
> >   	rte_atomic32_t running;
> >   	rte_spinlock_t lock;
> >   	bool sw_lm;

[...]

> > +static void *
> > +vring_relay(void *arg)
> > +{
> > +	int i, vid, epfd, fd, nfds;
> > +	struct ifcvf_internal *internal = (struct ifcvf_internal *)arg;
> > +	struct rte_vhost_vring vring;
> > +	struct rte_intr_handle *intr_handle;
> > +	uint16_t qid, q_num;
> > +	struct epoll_event events[IFCVF_MAX_QUEUES * 4];
> > +	struct epoll_event ev;
> > +	int nbytes;
> > +	uint64_t buf;
> > +
> > +	vid = internal->vid;
> > +	q_num = rte_vhost_get_vring_num(vid);
> > +	/* prepare the mediate vring */
> > +	for (qid = 0; qid < q_num; qid++) {
> > +		rte_vhost_get_vring_base(vid, qid,
> > +				&internal->m_vring[qid].avail->idx,
> > +				&internal->m_vring[qid].used->idx);
> > +		rte_vdpa_relay_vring_avail(vid, qid, &internal->m_vring[qid]);
> > +	}
> > +
> > +	/* add notify fd and interrupt fd to epoll */
> > +	epfd = epoll_create(IFCVF_MAX_QUEUES * 2);
> > +	if (epfd < 0) {
> > +		DRV_LOG(ERR, "failed to create epoll instance.");
> > +		return NULL;
> > +	}
> > +	internal->epfd = epfd;
> > +
> > +	for (qid = 0; qid < q_num; qid++) {
> > +		ev.events = EPOLLIN | EPOLLPRI;
> > +		rte_vhost_get_vhost_vring(vid, qid, &vring);
> > +		ev.data.u64 = qid << 1 | (uint64_t)vring.kickfd << 32;
> > +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
> > +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> > +			return NULL;
> > +		}
> > +	}
> > +
> > +	intr_handle = &internal->pdev->intr_handle;
> > +	for (qid = 0; qid < q_num; qid++) {
> > +		ev.events = EPOLLIN | EPOLLPRI;
> > +		ev.data.u64 = 1 | qid << 1 |
> > +			(uint64_t)intr_handle->efds[qid] << 32;
> > +		if (epoll_ctl(epfd, EPOLL_CTL_ADD, intr_handle->efds[qid], &ev)
> > +				< 0) {
> > +			DRV_LOG(ERR, "epoll add error: %s", strerror(errno));
> > +			return NULL;
> > +		}
> > +	}
> > +
> > +	/* start relay with a first kick */
> > +	for (qid = 0; qid < q_num; qid++)
> > +		ifcvf_notify_queue(&internal->hw, qid);
> > +
> > +	/* listen to the events and react accordingly */
> > +	for (;;) {
> > +		nfds = epoll_wait(epfd, events, q_num * 2, -1);
> > +		if (nfds < 0) {
> > +			if (errno == EINTR)
> > +				continue;
> > +			DRV_LOG(ERR, "epoll_wait return fail\n");
> > +			return NULL;
> > +		}
> > +
> > +		for (i = 0; i < nfds; i++) {
> > +			fd = (uint32_t)(events[i].data.u64 >> 32);
> > +			do {
> > +				nbytes = read(fd, &buf, 8);
> > +				if (nbytes < 0) {
> > +					if (errno == EINTR ||
> > +					    errno == EWOULDBLOCK ||
> > +					    errno == EAGAIN)
> > +						continue;
> > +					DRV_LOG(INFO, "Error reading "
> > +						"kickfd: %s",
> > +						strerror(errno));
> > +				}
> > +				break;
> > +			} while (1);
> > +
> > +			qid = events[i].data.u32 >> 1;
> > +
> > +			if (events[i].data.u32 & 1)
> > +				update_used_ring(internal, qid);
> > +			else
> > +				update_avail_ring(internal, qid);
> > +		}
> > +	}
> > +
> > +	return NULL;
> > +}
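
For readers following the epoll bookkeeping in vring_relay(): each
registered fd carries a 64-bit tag with the fd in the upper 32 bits, the
queue id shifted left by one, and bit 0 set for the interrupt fd (clear
for the kick fd). The sketch below restates that layout as a pair of
standalone helpers. The names relay_tag, relay_tag_pack() and
relay_tag_unpack() are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical decoded form of the 64-bit epoll tag. */
struct relay_tag {
	int fd;       /* eventfd to drain */
	uint16_t qid; /* virtqueue index */
	int is_intr;  /* 1: device interrupt fd, 0: guest kick fd */
};

/* Layout: [63:32] fd, [31:1] qid, [0] interrupt flag. */
static uint64_t
relay_tag_pack(int fd, uint16_t qid, int is_intr)
{
	return (uint64_t)(uint32_t)fd << 32 |
	       (uint64_t)qid << 1 |
	       (uint64_t)(is_intr ? 1 : 0);
}

static struct relay_tag
relay_tag_unpack(uint64_t tag)
{
	struct relay_tag t;

	t.fd = (int)(uint32_t)(tag >> 32);
	t.qid = (uint16_t)((uint32_t)tag >> 1);
	t.is_intr = (int)(tag & 1);
	return t;
}
```

One difference from the patch, chosen only for this sketch: decoding
uses shifts on the 64-bit value rather than reading data.u32. As far as
I can tell, the u32 union member overlays the low half of u64 only on
little-endian hosts, so the shift form avoids that assumption.
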
> > +
> > +static int
> > +setup_vring_relay(struct ifcvf_internal *internal)
> > +{
> > +	int ret;
> > +
> > +	ret = pthread_create(&internal->tid, NULL, vring_relay,
> > +			(void *)internal);
> 
> So it will be scheduled without any affinity?
> Shouldn't it use a pmd thread instead?

The new thread will inherit the thread affinity from its parent thread.
As you know, vdpa is trying to minimize CPU usage for virtio HW
acceleration, and we assign just one core to the vdpa daemon
(doc/guides/sample_app_ug/vdpa.rst), so there's no dedicated pmd worker
core.
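
The affinity inheritance can be checked with a small standalone test.
This is only a sketch, assuming Linux with glibc (it relies on the
non-portable pthread_getaffinity_np()); check_affinity() and
child_inherits_affinity() are hypothetical names, not driver code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for cpu_set_t macros and pthread_getaffinity_np() */
#endif
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/* Thread body: compare this thread's CPU mask with the parent's. */
static void *
check_affinity(void *arg)
{
	cpu_set_t *parent = arg;
	cpu_set_t mine;

	CPU_ZERO(&mine);
	if (pthread_getaffinity_np(pthread_self(), sizeof(mine), &mine) != 0)
		return NULL;
	return (void *)(intptr_t)(CPU_EQUAL(parent, &mine) ? 1 : 0);
}

/*
 * Create a thread with default attributes (attr == NULL, as in
 * setup_vring_relay()) and report whether it got the caller's mask.
 * Returns 1 if inherited, 0 if not, -1 on error.
 */
static int
child_inherits_affinity(void)
{
	cpu_set_t parent;
	pthread_t tid;
	void *res = NULL;

	CPU_ZERO(&parent);
	if (pthread_getaffinity_np(pthread_self(), sizeof(parent),
			&parent) != 0)
		return -1;
	if (pthread_create(&tid, NULL, check_affinity, &parent) != 0)
		return -1;
	if (pthread_join(tid, &res) != 0)
		return -1;
	return res != NULL;
}
```

If the caller is pinned to a single core (for example via taskset or the
EAL core mask), the child reports the same single-core mask, so the
relay thread shares that core with the daemon.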

Thanks,
Xiao


Thread overview: 86+ messages
2018-11-28  9:45 [dpdk-dev] [PATCH 0/9] " Xiao Wang
2018-11-28  9:45 ` [dpdk-dev] [PATCH 1/9] vhost: provide helper for host notifier ctrl Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 2/9] vhost: provide helpers for virtio ring relay Xiao Wang
2018-12-04  6:22   ` Tiwei Bie
2018-12-12  6:51     ` Wang, Xiao W
2018-12-13  1:10   ` [dpdk-dev] [PATCH v2 0/9] support SW assisted VDPA live migration Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 1/9] vhost: provide helper for host notifier ctrl Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 2/9] vhost: provide helpers for virtio ring relay Xiao Wang
2018-12-13 10:09       ` [dpdk-dev] [PATCH v3 0/9] support SW assisted VDPA live migration Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 1/9] vhost: provide helper for host notifier ctrl Xiao Wang
2018-12-14 13:33           ` Maxime Coquelin
2018-12-14 19:05             ` Wang, Xiao W
2018-12-14 21:16           ` [dpdk-dev] [PATCH v4 00/10] support SW assisted VDPA live migration Xiao Wang
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 01/10] vhost: remove unused internal API Xiao Wang
2018-12-16  8:58               ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 02/10] vhost: provide helper for host notifier ctrl Xiao Wang
2018-12-16  9:00               ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 03/10] vhost: provide helpers for virtio ring relay Xiao Wang
2018-12-16  9:10               ` Maxime Coquelin
2018-12-17  8:51                 ` Wang, Xiao W
2018-12-17 11:02                   ` Maxime Coquelin
2018-12-17 14:41                     ` Wang, Xiao W
2018-12-17 19:00                       ` Maxime Coquelin
2018-12-18  8:27                         ` Wang, Xiao W
2018-12-18  8:44                           ` Thomas Monjalon
2018-12-18  8:01               ` [dpdk-dev] [PATCH v5 00/10] support SW assisted VDPA live migration Xiao Wang
2018-12-18  8:01                 ` [dpdk-dev] [PATCH v5 01/10] vhost: remove unused internal API Xiao Wang
2018-12-18  8:01                 ` [dpdk-dev] [PATCH v5 02/10] vhost: provide helper for host notifier ctrl Xiao Wang
2018-12-18 15:37                   ` Ferruh Yigit
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 03/10] vhost: provide helpers for virtio ring relay Xiao Wang
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 04/10] net/ifc: dump debug message for error Xiao Wang
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 05/10] net/ifc: store only registered device instance Xiao Wang
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 06/10] net/ifc: detect if VDPA mode is specified Xiao Wang
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 07/10] net/ifc: add devarg for LM mode Xiao Wang
2018-12-18 11:23                   ` Maxime Coquelin
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 08/10] net/ifc: use lib API for used ring logging Xiao Wang
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 09/10] net/ifc: support SW assisted VDPA live migration Xiao Wang
2018-12-18 11:33                   ` Maxime Coquelin
2018-12-18  8:02                 ` [dpdk-dev] [PATCH v5 10/10] doc: update ifc NIC document Xiao Wang
2018-12-18 11:35                   ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 04/10] net/ifc: dump debug message for error Xiao Wang
2018-12-16  9:11               ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 05/10] net/ifc: store only registered device instance Xiao Wang
2018-12-16  9:12               ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 06/10] net/ifc: detect if VDPA mode is specified Xiao Wang
2018-12-16  9:17               ` Maxime Coquelin
2018-12-17  8:54                 ` Wang, Xiao W
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 07/10] net/ifc: add devarg for LM mode Xiao Wang
2018-12-16  9:21               ` Maxime Coquelin
2018-12-17  9:00                 ` Wang, Xiao W
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 08/10] net/ifc: use lib API for used ring logging Xiao Wang
2018-12-16  9:24               ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 09/10] net/ifc: support SW assisted VDPA live migration Xiao Wang
2018-12-16  9:35               ` Maxime Coquelin
2018-12-17  9:12                 ` Wang, Xiao W [this message]
2018-12-17 11:08                   ` Maxime Coquelin
2018-12-14 21:16             ` [dpdk-dev] [PATCH v4 10/10] doc: update ifc NIC document Xiao Wang
2018-12-16  9:36               ` Maxime Coquelin
2018-12-17  9:15                 ` Wang, Xiao W
2018-12-18 14:01             ` [dpdk-dev] [PATCH v4 00/10] support SW assisted VDPA live migration Maxime Coquelin
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 2/9] vhost: provide helpers for virtio ring relay Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 3/9] net/ifc: dump debug message for error Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 4/9] net/ifc: store only registered device instance Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 5/9] net/ifc: detect if VDPA mode is specified Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 6/9] net/ifc: add devarg for LM mode Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 7/9] net/ifc: use lib API for used ring logging Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 8/9] net/ifc: support SW assisted VDPA live migration Xiao Wang
2018-12-13 10:09         ` [dpdk-dev] [PATCH v3 9/9] doc: update ifc NIC document Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 3/9] net/ifc: dump debug message for error Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 4/9] net/ifc: store only registered device instance Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 5/9] net/ifc: detect if VDPA mode is specified Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 6/9] net/ifc: add devarg for LM mode Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 7/9] net/ifc: use lib API for used ring logging Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 8/9] net/ifc: support SW assisted VDPA live migration Xiao Wang
2018-12-13  1:10     ` [dpdk-dev] [PATCH v2 9/9] doc: update ifc NIC document Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 3/9] net/ifc: dump debug message for error Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 4/9] net/ifc: store only registered device instance Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 5/9] net/ifc: detect if VDPA mode is specified Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 6/9] net/ifc: add devarg for LM mode Xiao Wang
2018-12-04  6:31   ` Tiwei Bie
2018-12-12  6:53     ` Wang, Xiao W
2018-12-12 10:15   ` Alejandro Lucero
2018-12-12 10:23     ` Wang, Xiao W
2018-11-28  9:46 ` [dpdk-dev] [PATCH 7/9] net/ifc: use lib API for used ring logging Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 8/9] net/ifc: support SW assisted VDPA live migration Xiao Wang
2018-11-28  9:46 ` [dpdk-dev] [PATCH 9/9] doc: update ifc NIC document Xiao Wang
